@m3nu m3nu commented Jan 23, 2026

Summary

Fix test hangs on Linux and macOS CI by mocking system dependencies that can hang or time out in headless environments.

Problem

Tests were hanging or timing out on CI due to:

  • D-Bus connections in scheduler.py and network_manager.py hanging on headless systems
  • DNS lookups in _getfqdn() timing out on CI
  • CoreWLAN WiFi enumeration hanging on headless macOS
  • pytest-qt's processEvents() triggering D-Bus operations on macOS

Solution

All problematic system calls are now mocked in tests/conftest.py using pytest fixtures, keeping production code clean.

Changes

tests/conftest.py:

  • Mock QtDBus.QDBusConnection.systemBus to return disconnected bus (prevents D-Bus hangs)
  • Mock socket.getaddrinfo to return empty list (prevents DNS timeouts)
  • Mock vorta.utils.get_network_status_monitor to return empty WiFi list (prevents CoreWLAN hangs)
  • Monkey-patch pytest-qt's _process_events on macOS to prevent Qt event loop hangs
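
As a rough illustration of the DNS mock listed above, the sketch below replaces `socket.getaddrinfo` with a mock that returns an empty list, using `unittest.mock.patch` from the standard library. The function `lookup_fqdn` is a hypothetical stand-in for code that asks for a canonical name; it is not Vorta's `_getfqdn()`, and in the real conftest the patch would presumably be applied from a pytest fixture rather than a helper function.

```python
import socket
from unittest import mock


def lookup_fqdn(hostname):
    """Hypothetical resolver: return the canonical name, or the input if none is found."""
    results = socket.getaddrinfo(
        hostname, None, 0, socket.SOCK_DGRAM, 0, socket.AI_CANONNAME
    )
    for _family, _type, _proto, canonname, _sockaddr in results:
        if canonname:
            return canonname
    return hostname


def lookup_fqdn_without_dns(hostname):
    """Call lookup_fqdn while getaddrinfo is patched to return no results."""
    with mock.patch("socket.getaddrinfo", return_value=[]) as fake:
        name = lookup_fqdn(hostname)
    # The patch is undone when the with-block exits.
    return name, fake.call_count
```

With the patch active, the lookup falls back to the plain hostname and never touches the network, which is the behavior the empty-list mock relies on.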

tests/integration/conftest.py:

  • Add all_workers_finished() helper for thorough thread cleanup
  • Add disconnect_all() helper for proper signal cleanup
  • Improve init_db fixture teardown to wait for all threads and clear state

tests/unit/conftest.py:

  • Same improvements as integration conftest
  • Fix borg_json_output fixture to properly close file handles

src/vorta/views/repo_add_dialog.py:

  • Fix bug: Change from thread.run() (synchronous) to jobs_manager.add_job(job) (async)
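
The difference between calling `run()` directly and handing work to a background thread can be shown with the standard library's `threading.Thread`, used here only as an analogy; Vorta's job classes and `jobs_manager.add_job()` are not reproduced, and the helper names are hypothetical.

```python
import threading


def record_thread_name(names):
    """Append the name of the thread that actually executes this function."""
    names.append(threading.current_thread().name)


def run_synchronously():
    """Thread.run() executes the target in the calling thread and blocks until done."""
    names = []
    worker = threading.Thread(target=record_thread_name, args=(names,), name="worker")
    worker.run()
    return names[0]


def run_in_background():
    """Thread.start() executes the target in a separate thread."""
    names = []
    worker = threading.Thread(target=record_thread_name, args=(names,), name="worker")
    worker.start()
    worker.join()
    return names[0]
```

`run_synchronously()` reports the caller's thread name, while `run_in_background()` reports `"worker"`, which is why a direct `run()` call can block the caller.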

Test infrastructure:

  • Add tests/unit/test_constants.py with cross-platform path constants
  • Replace hardcoded /tmp paths with tempfile.gettempdir()
  • Standardize test timeouts using pytest._wait_defaults
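
A minimal sketch of the path change, assuming a helper named `temp_path` (hypothetical, not taken from `tests/unit/test_constants.py`): paths are built from `tempfile.gettempdir()` instead of a literal `/tmp`, so they also resolve on platforms where the temporary directory lives elsewhere.

```python
import os
import tempfile


def temp_path(*parts):
    """Build a path under the platform's temporary directory (hypothetical helper)."""
    return os.path.join(tempfile.gettempdir(), *parts)


# Hypothetical constant of the kind a shared test-constants module might define.
TEST_REPO_PATH = temp_path("vorta-test", "repo")
```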

Test plan

  • pytest tests/unit/ -v: 183 passed, 6 skipped
  • pytest tests/integration/ -v: 18 passed, 1 skipped
  • All pre-commit hooks pass

m3nu added 30 commits January 23, 2026 10:03
The tests test_create_repo and test_add_existing_repo were failing
intermittently with RepoModelDoesNotExist and waitUntil timeout errors.

Root cause: The tests used is_worker_running() which returns False as
soon as current_job is None, but worker threads may still be alive and
Qt signals may not be fully processed yet.

Fix: Adopt the same reliable waiting patterns used in unit tests:
- Add all_workers_finished() helper that checks worker.is_alive()
- Add disconnect_all() helper for proper signal cleanup
- Call QCoreApplication.processEvents() after waiting for workers
- Update init_db fixture teardown to properly clean up state
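
The race described above can be reproduced in miniature with plain `threading`. In this sketch the `Worker` class is hypothetical and only mimics the relevant behavior: it clears `current_job` before its thread exits. The two helper functions reuse the names from the commit message but are simplified guesses at their logic, not the project's code.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical worker that clears current_job before its thread exits."""

    def __init__(self):
        super().__init__(daemon=True)
        self.current_job = "job"
        self.job_cleared = threading.Event()
        self.may_exit = threading.Event()

    def run(self):
        self.current_job = None
        self.job_cleared.set()
        self.may_exit.wait()  # Thread stays alive until the demo releases it.


def is_worker_running(workers):
    """Job-based check: reports idle as soon as current_job is None."""
    return any(w.current_job is not None for w in workers)


def all_workers_finished(workers):
    """Thread-based check: true only once every worker thread has exited."""
    return all(not w.is_alive() for w in workers)


def demo():
    worker = Worker()
    worker.start()
    worker.job_cleared.wait(timeout=5)
    early = (is_worker_running([worker]), all_workers_finished([worker]))
    worker.may_exit.set()
    worker.join(timeout=5)
    late = (is_worker_running([worker]), all_workers_finished([worker]))
    return early, late
```

In the `early` snapshot the job-based check already reports that nothing is running although the thread is still alive; only the `is_alive()` check distinguishes the two states.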

D-Bus calls to systemd-logind and NetworkManager can hang indefinitely
in CI environments where D-Bus is only partially available. Check for
sys._called_from_test (set in conftest.py during test runs) and skip
these operations.

Add print statements at key initialization points to identify where
the test hang occurs in the Linux Python 3.12 CI environment.

Track where the hang occurs between VortaApp creation and test execution.

Check if the test function is ever reached after fixture setup.

Pinpoint where the hang occurs in fixture teardown.

qtbot.waitUntil processes Qt events while waiting, which can trigger
D-Bus operations that hang on Linux Python 3.12 CI. Use simple time-based
polling instead.
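
Time-based polling of the kind mentioned above might look like the following sketch. The name `wait_until` and its parameters are assumptions for illustration; the helper actually used in the test suite is not shown in this conversation.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it returns True or the timeout elapses.

    Unlike qtbot.waitUntil, this never runs the Qt event loop; it only sleeps.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One final check so a condition met during the last sleep is not missed.
    return predicate()
```

The trade-off is that a condition which only becomes true once queued Qt signals are delivered will not be reached by sleeping alone, so this pattern suits checks on thread or process state.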

QCoreApplication.processEvents() in teardown triggers pending Qt events
which can include D-Bus operations that hang on Linux Python 3.12 CI.

Replace QCoreApplication.processEvents() with time.sleep() to avoid
triggering D-Bus operations that hang on Linux Python 3.12 CI.

Clean up debug print statements added during investigation of the
Python 3.12 Linux CI hanging issue. The actual fixes (skipping D-Bus
calls during tests, avoiding processEvents in fixtures) are preserved.

Adds timing instrumentation to identify the cause of ~20s delays
per test on GitHub Actions macOS runners:

- VortaApp.__init__: timing for each major initialization step
- qapp fixture: timing for session setup
- init_db fixture: timing for setup, teardown, and worker wait loop
- load_window: timing for MainWindow recreation

The debug output includes:
- Elapsed time for each operation
- Worker thread state (alive, current_job, process info)
- Iteration count for the worker wait loop
- Warning when wait loop times out

This will help identify if the delay is:
- In VortaApp/MainWindow initialization
- In the worker wait loop (BorgVersionJob not finishing)
- Somewhere else in the test infrastructure
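
Timing instrumentation of this kind can be sketched with a small context manager around each step. The name `timed` and the list-based log are assumptions for illustration; the commit's actual debug code is not shown here.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Record how long the wrapped block took, as a (label, seconds) entry in log."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.append((label, time.perf_counter() - start))


def measure_steps(steps):
    """Run each (label, callable) pair and return the collected timings."""
    log = []
    for label, step in steps:
        with timed(label, log):
            step()
    return log
```

Calling `measure_steps([("setup", setup), ("teardown", teardown)])` with hypothetical callables yields one timing entry per step, which makes an outlier such as a ~20s delay easy to spot.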

The root cause of ~20s delays per test on macOS CI runners was CoreWLAN
system calls hanging on headless runners without WiFi hardware.

Call chain causing the delay:
  MainWindow.__init__ → ScheduleTab → NetworksPage → populate_wifi()
  → get_sorted_wifis() → get_network_status_monitor().get_known_wifis()
  → DarwinNetworkStatus._get_wifi_interface()
  → CWWiFiClient.sharedWiFiClient() ← HANGS ON HEADLESS CI

Fix: Skip system WiFi enumeration during tests by checking the
sys._called_from_test flag (already set in conftest.py). This is
consistent with how D-Bus is already skipped in scheduler.py.

Also removes debug timing code that was added to diagnose this issue.
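
A flag check of the kind this commit describes could look like the sketch below. Only the attribute name `sys._called_from_test` comes from the commit message; the function names and the injected `enumerate_system_wifis` callable are hypothetical. A later commit in this pull request moves such checks out of production code and into test fixtures.

```python
import sys


def running_under_test():
    """True when the test suite has set the sys._called_from_test flag."""
    return getattr(sys, "_called_from_test", False)


def get_known_wifis(enumerate_system_wifis):
    """Return known WiFi networks, skipping the system call during tests."""
    if running_under_test():
        return []
    return enumerate_system_wifis()
```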

The previous CoreWLAN fix didn't resolve the 20s delay on macOS CI.
Adding timing around each major operation in MainWindow.__init__ to
identify the exact source of the delay:

- super().__init__
- setupUi
- setWindowIcon / LoadingButton
- Each tab creation (RepoTab, SourceTab, ArchiveTab, ScheduleTab, MiscTab, AboutTab)
- populate_profile_selector
- get_network_status_monitor().is_network_status_available()
- set_icons

get_mount_points() iterates over all system processes which takes
~20 seconds on macOS CI runners. Since tests don't have actual borg
mount processes, skip this enumeration during test runs.

The root cause was socket.getaddrinfo() timing out (~10s each) when
format_archive_name() called _getfqdn() for archive name templates.
This happened twice per MainWindow creation, adding 20s per test.

Fix: Mock getaddrinfo in pytest_configure to return immediately for
AI_CANONNAME requests, avoiding DNS lookups during tests.

Also removes debug timing code and reverts unsuccessful earlier fixes.

The global getaddrinfo mock was breaking other networking code.
Instead, cache the FQDN result in _getfqdn() so the slow DNS lookup
only happens once per hostname, then returns cached value.
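
Caching a slow lookup per hostname can be sketched with `functools.lru_cache`. This is one possible way to get the behavior described above, not necessarily how `_getfqdn()` was changed; `make_cached_lookup` and `slow_lookup` are hypothetical names.

```python
import functools


def make_cached_lookup(slow_lookup):
    """Wrap slow_lookup so each hostname is resolved at most once."""

    @functools.lru_cache(maxsize=None)
    def cached(hostname):
        return slow_lookup(hostname)

    return cached
```

After the first call for a given hostname, later calls return the stored value without invoking `slow_lookup` again.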

Move the test check inside _getfqdn() itself instead of patching
from conftest.py. This avoids import timing issues that caused
test hangs.

DNS skip confirmed working. Adding debug at:
- MainWindow.__init__ end
- load_window() start/end
- init_db fixture yield point

m3nu added 9 commits January 23, 2026 14:07
pytest-qt's _process_events() hook causes hangs on macOS between tests.
Setting qt_no_exception_capture = true disables this behavior.

See: pytest-dev/pytest-qt#223

Move test-specific behavior out of production code into test fixtures.
Instead of checking sys._called_from_test in app code, mock the
problematic subsystems in tests/conftest.py:

- Mock QtDBus.QDBusConnection.systemBus to prevent D-Bus hangs
- Mock socket.getaddrinfo to prevent DNS lookup timeouts
- Mock get_network_status_monitor to prevent WiFi enumeration hangs

This keeps production code clean and follows Python testing best practices.

@m3nu m3nu changed the title from "Fix flaky integration tests in test_init.py" to "Fix test hangs on Linux/macOS CI by mocking system dependencies" Jan 23, 2026
@m3nu m3nu merged commit 95e0cd2 into borgbase:master Jan 23, 2026
5 checks passed