http_server: fix libevent crash on connection drop#11843
Conversation
Removes dangerous connection->event.data assignments (which crash the libevent backend on Windows) and instead preserves connection->user_data so the event handler can safely find and clean up the session. Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
📝 WalkthroughWalkthroughThe HTTP server connection/session association mechanism was refactored to consolidate pointer tracking. Sessions are now associated with connections exclusively via ChangesConnection-Session Association Consolidation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9dc9923f15
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (session == NULL) { | ||
| session = (struct flb_http_server_session *) event->data; | ||
| if (session != NULL && | ||
| (session->connection == NULL || | ||
| session->connection->fd == FLB_INVALID_SOCKET)) { | ||
| event->data = NULL; | ||
| session->drop_pending = FLB_FALSE; | ||
| flb_http_server_session_destroy(session); | ||
| } | ||
| return -1; |
There was a problem hiding this comment.
Destroy dropped sessions when user_data is already cleared
Returning immediately when connection->user_data is NULL skips the only cleanup path for sessions marked in flb_http_server_connection_drop(). That callback sets session->drop_pending = FLB_TRUE and then clears connection->user_data, so after this change a dropped connection can leave its session stranded in server->clients indefinitely because flb_http_server_reap_stale_sessions() only reaps sessions with drop_pending == FLB_FALSE. Under repeated client disconnects, this leaks session slots and can eventually cause max_connections rejections.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/http_server/flb_http_server.c (1)
123-132:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep
connection->user_datauntil the stale-session cleanup branch consumes it.
flb_http_server_client_activity_event_handler()now usesconnection->user_dataas the only session lookup path, butflb_http_server_connection_drop()clears that field before the queued activity event can run. In that case the handler returns onsession == NULL,drop_pendingstays true, andflb_http_server_reap_stale_sessions()will never reap the leaked session.Suggested fix
static void flb_http_server_connection_drop(struct flb_connection *connection) { struct flb_http_server_session *session; if (connection == NULL) { return; } session = connection->user_data; if (session != NULL && session->connection == connection) { session->connection = NULL; session->drop_pending = FLB_TRUE; } - connection->user_data = NULL; connection->drop_notification_callback = NULL; }if (session->connection == NULL || session->connection->fd == FLB_INVALID_SOCKET) { session->drop_pending = FLB_FALSE; + connection->user_data = NULL; flb_http_server_session_destroy(session); return -1; }Also applies to: 389-392
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/http_server/flb_http_server.c` around lines 123 - 132, flb_http_server_connection_drop currently clears connection->user_data (and drop_notification_callback) immediately, which breaks flb_http_server_client_activity_event_handler because it looks up sessions via connection->user_data and leaves sessions stuck with session->drop_pending = FLB_TRUE so flb_http_server_reap_stale_sessions never reaps them; fix by not clearing connection->user_data in flb_http_server_connection_drop (leave connection->user_data set until the stale-session cleanup consumes it), only null out session->connection and set session->drop_pending = FLB_TRUE, and ensure you do not remove connection->drop_notification_callback prematurely; apply the same change to the other similar branch (lines referenced as 389-392) so both code paths preserve connection->user_data for the reap logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/http_server/flb_http_server.c`:
- Around line 123-132: flb_http_server_connection_drop currently clears
connection->user_data (and drop_notification_callback) immediately, which breaks
flb_http_server_client_activity_event_handler because it looks up sessions via
connection->user_data and leaves sessions stuck with session->drop_pending =
FLB_TRUE so flb_http_server_reap_stale_sessions never reaps them; fix by not
clearing connection->user_data in flb_http_server_connection_drop (leave
connection->user_data set until the stale-session cleanup consumes it), only
null out session->connection and set session->drop_pending = FLB_TRUE, and
ensure you do not remove connection->drop_notification_callback prematurely;
apply the same change to the other similar branch (lines referenced as 389-392)
so both code paths preserve connection->user_data for the reap logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ade7770d-0998-45f7-8d70-0d1ea4d6c2a5
📒 Files selected for processing (1)
src/http_server/flb_http_server.c
|
The fix direction looks correct: removing However, the PR as-is introduces a cleanup regression. I reproduced this on PR head with:
Result on PR head: Changing the drop path to leave the detached session reapable fixes it: diff --git a/src/http_server/flb_http_server.c b/src/http_server/flb_http_server.c
index 64045e404..7a717a7e6 100644
--- a/src/http_server/flb_http_server.c
+++ b/src/http_server/flb_http_server.c
@@ -125,7 +125,7 @@ static void flb_http_server_connection_drop(struct flb_connection *connection)
if (session != NULL &&
session->connection == connection) {
session->connection = NULL;
- session->drop_pending = FLB_TRUE;
+ session->drop_pending = FLB_FALSE;
}
connection->user_data = NULL; After that change:
|
Removes dangerous connection->event.data assignments (which crash the libevent backend on Windows)
and instead preserves connection->user_data
so the event handler can safely find and clean up the session.
Closes #11842.
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
After applying this patch, Fluent Bit now starts to process HTTP requests with internal HTTP server again on Windows:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit