More CI fixes #1854

Open
azat wants to merge 12 commits into libevent:master from azat:more-ci-fixes

azat commented Mar 25, 2026

No description provided.

azat force-pushed the more-ci-fixes branch 2 times, most recently from 9558c04 to 2f9bc49, March 26, 2026 21:15
azat added 8 commits March 27, 2026 08:29
The stddev tolerance of 100 was too tight for CI environments where
scheduling variance naturally exceeds this threshold (observed 104.67).
Increase to 150 to reduce flaky failures.
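
For illustration, a stddev tolerance check of this kind can be sketched as below. This is only a sketch under assumptions: `variance()`, `stddev_within()`, and `demo_stddev_check()` are hypothetical names, not functions from the libevent test suite, and the sample values are made up.

```c
#include <stddef.h>

/* Population variance of n samples; n must be > 0. */
static double variance(const double *v, size_t n)
{
	double mean = 0.0, sum_sq = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		mean += v[i];
	mean /= (double)n;
	for (i = 0; i < n; i++)
		sum_sq += (v[i] - mean) * (v[i] - mean);
	return sum_sq / (double)n;
}

/* Returns 1 if stddev(v) <= tolerance (tolerance >= 0), else 0.
 * Comparing the variance against tolerance squared is equivalent and
 * avoids calling sqrt(). */
static int stddev_within(const double *v, size_t n, double tolerance)
{
	return variance(v, n) <= tolerance * tolerance;
}

/* Hypothetical demo: made-up samples with mean 5 and stddev exactly 2. */
static int demo_stddev_check(double tolerance)
{
	static const double samples[] = { 2, 4, 4, 4, 5, 5, 7, 9 };

	return stddev_within(samples, sizeof(samples) / sizeof(samples[0]),
	    tolerance);
}
```

With a check shaped like this, the observed stddev of 104.67 fails a tolerance of 100 and passes a tolerance of 150.
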
On macOS, the test setup calls thread_policy_set() to move threads to
the realtime scheduling class. If this fails (e.g. after fork(), where
Mach ports may be stale), the process called exit(1), killing the forked
child and causing "[Lost connection!]" failures in CI.

The realtime scheduling is a best-effort optimization for the test
suite. Just warn and continue instead of aborting.
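
A minimal sketch of the warn-and-continue pattern follows. It is not the code from this PR: `try_realtime()` and `setup_realtime_best_effort()` are hypothetical names, and the policy values are illustrative assumptions. On non-macOS builds the helper just reports failure so that the warning path is exercised.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef __APPLE__
#include <mach/mach.h>
#include <mach/mach_time.h>
#include <pthread.h>
#endif

/* Hypothetical helper: returns 0 if the calling thread was moved to the
 * realtime scheduling class, -1 otherwise (failure, or not macOS). */
static int try_realtime(void)
{
#ifdef __APPLE__
	mach_timebase_info_data_t tb;
	thread_time_constraint_policy_data_t policy;
	double ticks_per_ms;

	if (mach_timebase_info(&tb) != KERN_SUCCESS)
		return -1;
	/* Mach absolute-time ticks per millisecond. */
	ticks_per_ms = ((double)tb.denom / (double)tb.numer) * 1000000.0;

	/* Illustrative values only (assumption, not taken from the PR). */
	policy.period = 0;
	policy.computation = (uint32_t)(5 * ticks_per_ms);
	policy.constraint = (uint32_t)(10 * ticks_per_ms);
	policy.preemptible = 0;

	if (thread_policy_set(pthread_mach_thread_np(pthread_self()),
		THREAD_TIME_CONSTRAINT_POLICY, (thread_policy_t)&policy,
		THREAD_TIME_CONSTRAINT_POLICY_COUNT) != KERN_SUCCESS)
		return -1;
	return 0;
#else
	return -1;
#endif
}

/* Best effort: warn and keep going instead of calling exit(1).
 * Returns 1 if realtime scheduling was applied, 0 if it was skipped. */
static int setup_realtime_best_effort(void)
{
	if (try_realtime() != 0) {
		fprintf(stderr,
		    "warning: realtime scheduling unavailable, continuing\n");
		return 0;
	}
	return 1;
}
```

The point of the design is that the caller can ignore the return value: a failure costs some timing precision but no longer kills the forked test process.
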
This test is flaky on Windows where socket closure detection timing
differs from Unix, causing assertion failures on req and connection
cleanup. Making it retriable allows CI to pass on retries.
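
The idea behind a retriable test can be sketched roughly as follows. This is a generic retry loop under assumptions, not libevent's test harness: `run_with_retries()`, `fail_until_zero()`, and `demo_retry()` are hypothetical names, the log messages are made up, and any delay between attempts is omitted.

```c
#include <stdio.h>

/* A test body: returns 1 on pass, 0 on failure. */
typedef int (*test_fn)(void *ctx);

/* Run fn once, then up to `retries` more times while it keeps failing.
 * Returns 1 as soon as one attempt passes, 0 if all attempts fail. */
static int run_with_retries(const char *name, test_fn fn, void *ctx,
    int retries)
{
	int attempts_left = retries;

	for (;;) {
		if (fn(ctx))
			return 1;
		if (attempts_left == 0) {
			fprintf(stderr, "failed %s after %d retries\n",
			    name, retries);
			return 0;
		}
		attempts_left--;
		fprintf(stderr, "retrying %s (attempts left %d)\n",
		    name, attempts_left);
	}
}

/* Hypothetical flaky test: fails while *ctx > 0, decrementing each time. */
static int fail_until_zero(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	return 1;
}

/* Demo wrapper: a test that fails `failures` times, run with `retries`. */
static int demo_retry(int failures, int retries)
{
	return run_with_retries("demo", fail_until_zero, &failures, retries);
}
```

For example, `demo_retry(2, 3)` passes on the third attempt, while `demo_retry(5, 3)` exhausts its retries and fails. Retries hide timing-related flakiness; they do not fix a test that fails deterministically.
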
Otherwise it is hard to understand what is going on here:

    38/65 Test libevent#30: regress__POLL .......................***Failed  117.26 sec
    FAILED
    [Lost connection!]
      [FAILED http/https_openssl_connection (0 retries)]
    [msg] Nameserver 127.0.0.1:25962 has failed: request timed out.
    [msg] All nameservers have failed
    1/434 TESTS FAILED. (47 skipped)
On Windows, the allowed ports for TCP and UDP can differ, so we
need to make sure that the selected port works for both.

Refs:
- docker/for-win#3171
- https://blog.deanosim.net/windows-10-winnat-and-why-your-programs-cant-listen-on-certain-ports/
CI:

    FAIL D:\a\libevent\libevent\test\regress_bufferevent.c:1101: assert(labs(timeval_msec_diff(((&started_at)), ((&res1.write_timeout_at))) - (100)) <= 100): 140 vs 100
    bufferevent/bufferevent_timeout_filter_pair:
    [FAILED bufferevent/bufferevent_timeout_filter_pair (0 retries)]
On Windows:

    [RETRYING http/data_length_constraints (attempts left 2, delay 1 sec)]
    [RETRYING http/data_length_constraints (attempts left 1, delay 1 sec)]
    [RETRYING http/data_length_constraints (attempts left 0, delay 1 sec)]
azat added 4 commits March 27, 2026 09:26
Windows allocates ephemeral ports sequentially per protocol, and
Hyper-V/WinNAT can exclude 200+ contiguous ports for TCP but not
UDP (or vice versa).  regress_pick_port() binds UDP first (port 0),
then cross-checks TCP on the same port.  When the UDP counter lands
inside the TCP excluded range, all 5 sequential retries fail with
WSAEACCES.

Bump the retry count from 5 to 256 to reliably skip past any
excluded range.  Add diagnostic logging (port number + error) on
bind failure for future troubleshooting.
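
The bind-UDP-then-cross-check-TCP loop described above could look roughly like this. It is a POSIX-sockets sketch under assumptions, not the actual regress_pick_port() code: `bind_loopback()`, `pick_port_tcp_udp()`, and `PICK_PORT_RETRIES` are hypothetical names, and a Windows build would use Winsock equivalents (closesocket(), WSAGetLastError()) instead of close() and errno.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PICK_PORT_RETRIES 256 /* retry count named in the commit message */

/* Bind fd to 127.0.0.1:port (port 0 lets the OS choose) and report the
 * port actually bound. Returns 0 on success, -1 on failure. */
static int bind_loopback(int fd, unsigned short port, unsigned short *bound)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		return -1;
	if (getsockname(fd, (struct sockaddr *)&sin, &len) < 0)
		return -1;
	*bound = ntohs(sin.sin_port);
	return 0;
}

/* Pick a port that can be bound for both UDP and TCP on loopback.
 * Returns the port (host byte order), or -1 if every attempt failed. */
static int pick_port_tcp_udp(void)
{
	int i;

	for (i = 0; i < PICK_PORT_RETRIES; i++) {
		unsigned short port = 0, tcp_port = 0;
		int ok = 0;
		int udp = socket(AF_INET, SOCK_DGRAM, 0);
		int tcp = socket(AF_INET, SOCK_STREAM, 0);

		/* UDP first with port 0, then cross-check TCP on that port. */
		if (udp >= 0 && tcp >= 0 &&
		    bind_loopback(udp, 0, &port) == 0) {
			if (bind_loopback(tcp, port, &tcp_port) == 0)
				ok = 1;
			else
				fprintf(stderr,
				    "TCP bind on port %u failed: %s\n",
				    (unsigned)port, strerror(errno));
		}
		if (udp >= 0)
			close(udp);
		if (tcp >= 0)
			close(tcp);
		if (ok)
			return (int)port;
	}
	return -1;
}
```

Both sockets are closed before the port is returned, so another process could still take the port before the caller binds it. The sketch only shows the cross-check and the retry loop.
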