Fix test hangs on Linux/macOS CI by mocking system dependencies #2353

m3nu · 2026-01-23T10:04:07Z

Summary

Fix test hangs on Linux and macOS CI by mocking system dependencies that can hang or time out in headless environments.

Problem

Tests were hanging or timing out on CI due to:

D-Bus connections in scheduler.py and network_manager.py hanging on headless systems
DNS lookups in _getfqdn() timing out on CI
CoreWLAN WiFi enumeration hanging on headless macOS
pytest-qt's processEvents() triggering D-Bus operations on macOS

Solution

All problematic system calls are now mocked in tests/conftest.py using pytest fixtures, keeping production code clean.

Changes

tests/conftest.py:

Mock QtDBus.QDBusConnection.systemBus to return a disconnected bus (prevents D-Bus hangs)
Mock socket.getaddrinfo to return an empty list (prevents DNS timeouts)
Mock vorta.utils.get_network_status_monitor to return an empty WiFi list (prevents CoreWLAN hangs)
Monkey-patch pytest-qt's _process_events on macOS to prevent Qt event loop hangs
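
As an illustration of the DNS item above, here is a minimal sketch using only the standard library's unittest.mock. It is not the code from tests/conftest.py: lookup_canonical_name is a hypothetical stand-in for a helper such as _getfqdn(), and the sketch assumes the code under test looks up socket.getaddrinfo at call time.

```python
import socket
from unittest import mock


def lookup_canonical_name(name):
    """Hypothetical stand-in for a helper like _getfqdn()."""
    addrs = socket.getaddrinfo(
        name, None, 0, socket.SOCK_DGRAM, 0, socket.AI_CANONNAME
    )
    for addr in addrs:
        if addr[3]:  # index 3 is the canonical name field
            return addr[3]
    return name  # fall back to the input when nothing resolves


def lookup_without_dns(name):
    # Replace getaddrinfo for the duration of the call so that no
    # resolver is contacted; the original is restored on exit.
    with mock.patch("socket.getaddrinfo", return_value=[]) as fake:
        result = lookup_canonical_name(name)
    return result, fake.call_count
```

In a conftest.py the same patch would typically be wrapped in an autouse fixture so that it applies to every test.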

tests/integration/conftest.py:

Add all_workers_finished() helper for thorough thread cleanup
Add disconnect_all() helper for proper signal cleanup
Improve init_db fixture teardown to wait for all threads and clear state

tests/unit/conftest.py:

Same improvements as integration conftest
Fix borg_json_output fixture to properly close file handles

src/vorta/views/repo_add_dialog.py:

Fix bug: Change from thread.run() (synchronous) to jobs_manager.add_job(job) (async)
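
The difference between calling run() directly and handing work to something that starts a thread can be sketched with the standard library. This is only an illustration of the general run-versus-start distinction; the `thread` object in repo_add_dialog.py may be a different class, and jobs_manager.add_job() is not reproduced here.

```python
import threading


def where_does_it_run(use_start):
    """Return the id of the thread that executed the target."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(threading.get_ident()))
    if use_start:
        worker.start()  # schedules the target on a new thread
        worker.join()
    else:
        worker.run()  # plain method call: executes in the caller's thread
    return seen[0]
```

With use_start=False the target runs in the calling thread, which in a GUI application would block the event loop until the work finishes.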

Test infrastructure:

Add tests/unit/test_constants.py with cross-platform path constants
Replace hardcoded /tmp paths with tempfile.gettempdir()
Standardize test timeouts using pytest._wait_defaults
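
A minimal sketch of what cross-platform path constants built on tempfile.gettempdir() might look like. The constant names and directory names are hypothetical and are not taken from tests/unit/test_constants.py.

```python
import os
import tempfile

# Hypothetical names, for illustration only.
TEST_TEMP_DIR = tempfile.gettempdir()  # e.g. /tmp on Linux, a per-user dir elsewhere
TEST_SOURCE_DIR = os.path.join(TEST_TEMP_DIR, "vorta-test-source")
TEST_REPO_PATH = os.path.join(TEST_TEMP_DIR, "vorta-test-repo")
```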

Test plan

pytest tests/unit/ -v: 183 passed, 6 skipped
pytest tests/integration/ -v: 18 passed, 1 skipped
All pre-commit hooks pass

The tests test_create_repo and test_add_existing_repo were failing intermittently with RepoModelDoesNotExist and waitUntil timeout errors.

Root cause: The tests used is_worker_running(), which returns False as soon as current_job is None, but worker threads may still be alive and Qt signals may not be fully processed yet.

Fix: Adopt the same reliable waiting patterns used in unit tests:
- Add all_workers_finished() helper that checks worker.is_alive()
- Add disconnect_all() helper for proper signal cleanup
- Call QCoreApplication.processEvents() after waiting for workers
- Update init_db fixture teardown to properly clean up state
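
The gap between "no job assigned" and "thread has exited" can be sketched as follows. FakeWorker and its attributes are hypothetical stand-ins, not Vorta's worker class, and both helper bodies are assumptions about what such checks might look like.

```python
import threading


class FakeWorker:
    """Hypothetical stand-in for a per-site job worker."""

    def __init__(self):
        self.current_job = None  # no job assigned
        self._release = threading.Event()
        # The thread stays alive until stop() is called.
        self.thread = threading.Thread(target=self._release.wait, daemon=True)
        self.thread.start()

    def is_alive(self):
        return self.thread.is_alive()

    def stop(self):
        self._release.set()
        self.thread.join()


def is_worker_running(workers):
    # Approximates the weaker check: only looks at whether a job is assigned.
    return any(w.current_job is not None for w in workers)


def all_workers_finished(workers):
    # Stronger check: every worker thread must have exited.
    return all(not w.is_alive() for w in workers)
```

A worker with no current job but a live thread passes the first check and fails the second, which is the window in which the flaky tests could proceed too early.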

Adds timing instrumentation to identify the cause of ~20s delays per test on GitHub Actions macOS runners:
- VortaApp.__init__: timing for each major initialization step
- qapp fixture: timing for session setup
- init_db fixture: timing for setup, teardown, and worker wait loop
- load_window: timing for MainWindow recreation

The debug output includes:
- Elapsed time for each operation
- Worker thread state (alive, current_job, process info)
- Iteration count for the worker wait loop
- Warning when wait loop times out

This will help identify if the delay is:
- In VortaApp/MainWindow initialization
- In the worker wait loop (BorgVersionJob not finishing)
- Somewhere else in the test infrastructure

The root cause of ~20s delays per test on macOS CI runners was CoreWLAN system calls hanging on headless runners without WiFi hardware.

Call chain causing the delay:
MainWindow.__init__
→ ScheduleTab
→ NetworksPage
→ populate_wifi()
→ get_sorted_wifis()
→ get_network_status_monitor().get_known_wifis()
→ DarwinNetworkStatus._get_wifi_interface()
→ CWWiFiClient.sharedWiFiClient() ← HANGS ON HEADLESS CI

Fix: Skip system WiFi enumeration during tests by checking the sys._called_from_test flag (already set in conftest.py). This is consistent with how D-Bus is already skipped in scheduler.py.

Also removes debug timing code that was added to diagnose this issue.

The previous CoreWLAN fix didn't resolve the 20s delay on macOS CI. Adding timing around each major operation in MainWindow.__init__ to identify the exact source of the delay:
- super().__init__
- setupUi
- setWindowIcon / LoadingButton
- Each tab creation (RepoTab, SourceTab, ArchiveTab, ScheduleTab, MiscTab, AboutTab)
- populate_profile_selector
- get_network_status_monitor().is_network_status_available()
- set_icons

The root cause was socket.getaddrinfo() timing out (~10s each) when format_archive_name() called _getfqdn() for archive name templates. This happened twice per MainWindow creation, adding 20s per test.

Fix: Mock getaddrinfo in pytest_configure to return immediately for AI_CANONNAME requests, avoiding DNS lookups during tests.

Also removes debug timing code and reverts unsuccessful earlier fixes.

Move test-specific behavior out of production code into test fixtures. Instead of checking sys._called_from_test in app code, mock the problematic subsystems in tests/conftest.py:
- Mock QtDBus.QDBusConnection.systemBus to prevent D-Bus hangs
- Mock socket.getaddrinfo to prevent DNS lookup timeouts
- Mock get_network_status_monitor to prevent WiFi enumeration hangs

This keeps production code clean and follows Python testing best practices.

m3nu added 30 commits January 23, 2026 10:03

Test reliability fixes

e519e6e

Fix pythonpath

2d9c62e

Skip D-Bus calls during tests to fix CI hangs

4c295c4

D-Bus calls to systemd-logind and NetworkManager can hang indefinitely in CI environments where D-Bus is partially available. Check for sys._called_from_test (set by pytest) and skip these operations.
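
A minimal sketch of the flag check this commit describes. It assumes the test configuration sets sys._called_from_test; connect_system_bus and its `connect` argument are hypothetical and stand in for whatever code opens the D-Bus connection.

```python
import sys


def running_under_tests():
    # The flag is absent in normal runs, so default to False.
    return getattr(sys, "_called_from_test", False)


def connect_system_bus(connect):
    """Call `connect` (a hypothetical callable that opens D-Bus) unless under test."""
    if running_under_tests():
        return None  # skip the potentially hanging call
    return connect()
```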

Add debug logging to diagnose CI hang

8dabc6b

Print statements at key initialization points to identify where the test hang occurs in Linux Python 3.12 CI environment.

Add more debug logging to init_db fixture and MainWindow

42d6160

Track where hang occurs between VortaApp creation and test execution.

Add debug to test function and fixture yield

01ce6fa

Check if test function is ever reached after fixture setup.

Add debug logging to init_db teardown

1badcbc

Pinpoint where hang occurs in fixture teardown.

Replace qtbot.waitUntil with simple polling in teardown

14f5f2e

qtbot.waitUntil processes Qt events while waiting, which can trigger D-Bus operations that hang on Linux Python 3.12 CI. Use simple time-based polling instead.
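
Time-based polling of this kind can be sketched with the standard library alone. The function name and default values are assumptions, not the code from this commit.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` using time.sleep() only; no Qt events are processed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline
```

The trade-off is that signals queued on the Qt event loop are not delivered while waiting, so the condition must be observable without event processing, for example a thread's liveness.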

Remove processEvents from teardown to avoid D-Bus hang

a9c231e

QCoreApplication.processEvents() in teardown triggers pending Qt events which can include D-Bus operations that hang on Linux Python 3.12 CI.

Add more debug to teardown signal disconnect

25d1526

Add debug around mock_db.close()

8566053

Remove processEvents from load_window to avoid D-Bus hang

b6d496c

Replace QCoreApplication.processEvents() with time.sleep() to avoid triggering D-Bus operations that hang on Linux Python 3.12 CI.

Remove debug logging from CI hang investigation

e42baff

Clean up debug print statements added during investigation of the Python 3.12 Linux CI hanging issue. The actual fixes (skipping D-Bus calls during tests, avoiding processEvents in fixtures) are preserved.

Skip psutil.process_iter() during tests to fix slow macOS CI

3c81730

get_mount_points() iterates over all system processes which takes ~20 seconds on macOS CI runners. Since tests don't have actual borg mount processes, skip this enumeration during test runs.

Add debug timing inside ArchiveTab.__init__ to find slow line

82ac76e

Add more granular debug timing inside ArchiveTab methods

36d7bcc

Add timing to final section of populate_from_profile

f6d947e

Fix DNS mock to handle all getaddrinfo calls, not just AI_CANONNAME

4474a5f

Re-add CoreWLAN skip during tests to prevent hang on CI

1938a08

Fix CI hang: cache FQDN result instead of global DNS mock

a1c6ddd

The global getaddrinfo mock was breaking other networking code. Instead, cache the FQDN result in _getfqdn() so the slow DNS lookup only happens once per hostname, then returns cached value.
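
Caching a lookup per hostname can be sketched with functools.lru_cache. This is an assumption about how such caching could be written, not the code from this commit (which a later commit in this PR reverts); resolve_fqdn and cached are hypothetical names.

```python
import functools
import socket


def resolve_fqdn(name):
    """Uncached lookup; may block while the resolver times out."""
    try:
        addrs = socket.getaddrinfo(
            name, None, 0, socket.SOCK_DGRAM, 0, socket.AI_CANONNAME
        )
    except OSError:
        return name
    for addr in addrs:
        if addr[3]:  # canonical name field
            return addr[3]
    return name


def cached(resolver):
    # Wrap any resolver so each hostname is looked up at most once.
    return functools.lru_cache(maxsize=None)(resolver)


cached_fqdn = cached(resolve_fqdn)
```

Caching only removes repeated lookups; the first call for each hostname still pays the full timeout.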

Revert FQDN caching to isolate hang issue

2f1036d

Patch _getfqdn specifically during tests to avoid slow DNS lookups

ddab310

Skip DNS lookup in _getfqdn during tests to fix CI slowness

d9c87e1

Move the test check inside _getfqdn() itself instead of patching from conftest.py. This avoids import timing issues that caused test hangs.

Add debug output to trace _getfqdn timing and flag state

b9c7a73

Add debug points to find actual hang location

e9ef30a

DNS skip confirmed working. Adding debug at:
- MainWindow.__init__ end
- load_window() start/end
- init_db fixture yield point

m3nu added 9 commits January 23, 2026 14:07

Add debug to test_prune_intervals to verify test starts

938845a

Add detailed debug to test_prune_intervals

a10f184

Add debug to fixture teardown to find hang location

1844e6f

Add debug to archive_env fixture

efe0b98

Add debug at very start of init_db fixture

e61e468

Disable pytest-qt exception capture to fix macOS hang

65079e6

pytest-qt's _process_events() hook causes hangs on macOS between tests. Setting qt_no_exception_capture = true disables this behavior. See: pytest-dev/pytest-qt#223

Monkey-patch pytest-qt processEvents to prevent macOS hang

bcaac8e

Remove debug print statements after fixing macOS test hang

b5de80a

m3nu changed the title from "Fix flaky integration tests in test_init.py" to "Fix test hangs on Linux/macOS CI by mocking system dependencies" Jan 23, 2026

m3nu merged commit 95e0cd2 into borgbase:master Jan 23, 2026
5 checks passed
