test: refactor test suite to (hopefully) fix flakiness#65
Merged
Conversation
|
🔥 Run benchmarks comparing 856dd89 against gh workflow run bench.yaml -f pr_number=65Note: this comment will update with each new commit. |
Owner
Author
|
No dice on the more minimal changes, let's see if dropping net.Pipe will help us out here |
d3363ec to
fe05e87
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #65 +/- ##
==========================================
+ Coverage 96.66% 96.99% +0.32%
==========================================
Files 2 2
Lines 570 432 -138
==========================================
- Hits 551 419 -132
+ Misses 14 7 -7
- Partials 5 6 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Owner
Author
|
Ended up doing a much bigger refactor to ensure all tests were using real TCP connections between a client and a real server instead of reading from and writing to a single shared connection. Unfortunately … we're still failing in confusing and weird ways: |
7a5b6fa to
6ac9e78
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The theory here is that the new, more rigorous tests for closing handshake added as part of #63 have caused significantly more flakiness because of setupRawConnWithHandler's use of a throwaway bufio.Reader around the underlying net.Conn just to read the HTTP 101 response for the opening handshake.
This causes a race with tests where the server might write websocket data to the connection immediately after the handshake (e.g. when a test immediately closes the connection), because the bufio.Reader is likely to read (partial) websocket data while reading the HTTP response, leaving subsequent reads with only partial/incomplete data and causing them to block until the test suite times out.
I'm very curious to see if this fixes the tests. If it works, credit goes to Claude Code for helping identify the race condition. If it doesn't, well, obviously LLMs are useless.
Update: while that issue was likely a source of flakiness, fixing it did not actually make the test suite any less flaky.
Ended up doing a much larger refactor to set up complete client and server connections instead of setting up a single connection to simulate a client and server reading from/writing to the same conn.
This seems to have helped a bit, maybe, but we're still seeing intermittent test failures due to timeouts.