-
Notifications
You must be signed in to change notification settings - Fork 4.7k
Description
Original PR with repro: #8504
After upgrading go-grpc, we started see the following error:
stream terminated by RST_STREAM with error code: REFUSED_STREAM
Upon investigation, it appears that in some scenarios, we fail to properly delete streams from the activeStreams map, resulting in this error being wrongly returned.
I added a test case that seems to trigger the problem:
PS: To test the behavior i added a "ActiveStreamTracker" hook -> im not sure this is the best way but i wanted just to show the problem
end2end_test.go:4015: leak streams: 5
The issue seems to have been introduced in the following PR: #8071
Specifically, the check added here:
grpc-go/internal/transport/http2_server.go
Lines 1357 to 1359 in 57b69b4
| if oldState == streamDone { | |
| return | |
| } |
If this conditional is removed, all streams are correctly cleaned up from the activeStreams map, and the test passes without leaks.
Root cause
- The server stream exhausts the stream quota, and starts buffering data frames.
- The server attempts to finish the RPC gracefully by sending trailers by calling finishStream.
- The write of the header frame gets blocked due to the pending data frames that are awaiting flow control quota.
- Once the deadline expires, the server's deadline monitoring timer tries to call closeStream, which is a no-op since the stream state is already marked
streamDonebyfinishStreamin point 1. - The client sends and RST stream, the server calls closeStream which is again a no-op.
The simplest fix is to remove the guard block added in closeStream in #8071. We may end up pushing redundant cleanupStream events on to the controlbuf, but it doesn't cause correctness issues since cleanupStreamHandler returns early in such cases.