Flaky panic in concurrent AppendToStream error handling

I have seen this panic a couple of times in our integration tests:

```
{"level":"error","msg":"subscription has dropped. Reason: [rpc error: code = Unavailable desc = error reading from server: EOF]"}
panic: send on closed channel

goroutine 6833 [running]:
github.com/EventStore/EventStore-Client-Go/v4/esdb.(*grpcClient).handleError(0xc0007bc120, 0xc0003a1830, 0xec7f80?, 0xc000649b90, {0x101a620, 0xc00006ecf8})
	/home/runner/go/pkg/mod/github.com/!event!store/!event!store-!client-!go/v4@v4.1.0/esdb/impl.go:78 +0x5f8
github.com/EventStore/EventStore-Client-Go/v4/esdb.(*Client).AppendToStream(0xc000891e70, {0x1024240, 0x1691160}, {0xc000136f80, 0x69}, {{0x1018180, 0x1691160}, 0x0, 0x0, 0x0}, ...)
	/home/runner/go/pkg/mod/github.com/!event!store/!event!store-!client-!go/v4@v4.1.0/esdb/client.go:87 +0xbae
project-othermodule/eventstore.ESDBCoreService.AppendToStream(...)
	/home/runner/work/someproject/someproject/backend/go-common/eventstore/core.request.go:331
project-othermodule/eventstore/service.ESDBDataService.updateDataProgress.func1()
	/home/runner/work/someproject/someproject/backend/go-common/eventstore/service/service.progress.go:170 +0x4d7
created by project-othermodule/eventstore/service.ESDBDataService.updateDataProgress in goroutine 6793
	/home/runner/work/someproject/someproject/backend/go-common/eventstore/service/service.progress.go:147 +0x234
FAIL eventstore/service.TestFlakyGraphqlReads (-1.00s)
=== RUN   TestFlakyGraphqlReads/DoesNotGetErrorStreamNotFound
```

The line that panics from `impl.go` in `handleError()`:
<img width="399" alt="Pasted Graphic" src="https://github.com/user-attachments/assets/160ab5ba-4a1e-45b0-8d5d-59d3559b0325">

Just before the panic, an `AppendToStream` call fails and returns an error, which I'm afraid I did not log. I suspect intermittent network errors on the GitHub runners, because we are seeing quite a few random flaky tests at the moment (we run integration tests against EventStore via testcontainers-go).

Anyway, I'm not completely sure about the order of calls, but the net result is that `func (client *grpcClient) close() {}` must have been called _before_ `func (client *grpcClient) handleError()`, which causes the panic above.
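To illustrate the ordering I suspect, here is a minimal sketch that is independent of the library. `fakeClient`, its `channel` field and `sendAfterClose` are hypothetical names I made up for this example; they only mimic a close path followed by an error path that sends on the same channel, and are not the actual `grpcClient` code.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeClient is a hypothetical stand-in for grpcClient, not the library's type.
type fakeClient struct {
	channel chan string
}

// close mimics a shutdown path that closes the message channel.
func (c *fakeClient) close() {
	close(c.channel)
}

// handleError mimics an error path that reports over the same channel.
func (c *fakeClient) handleError(err error) {
	c.channel <- err.Error() // panics if close() has already run
}

// sendAfterClose runs the two calls in the suspected order and returns
// the recovered panic value as text (empty if nothing panicked).
func sendAfterClose() (recovered string) {
	defer func() {
		if r := recover(); r != nil {
			recovered = fmt.Sprint(r)
		}
	}()
	c := &fakeClient{channel: make(chan string, 1)}
	c.close()
	c.handleError(errors.New("rpc error: code = Unavailable"))
	return ""
}

func main() {
	fmt.Println(sendAfterClose()) // prints "send on closed channel"
}
```

In the real client the two calls would come from different goroutines, so the order is timing dependent, which would match the flakiness.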

Some details about the test i'm running:
- I'm testing a complex service that does reads, appends and subscribes
- I'm using multiple esdb clients from 5 different goroutines (due to https://discuss.eventstore.com/t/grpc-connection-pool-for-subscriptions/5210)
- I'm _not_ explicitly closing any of those clients
- I might cancel the contexts passed to the different esdb operations

I see there has already been some mitigation of concurrency errors with the `closeFlag` in the `type grpcClient struct`. I don't think it's sufficient, though. If `AppendToStream` is to be thread-safe, I don't really see any way around a regular `sync.Mutex` that is locked on all writes to the `channel` of the `grpcClient struct` as well as on closure.
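A rough sketch of the kind of locking I have in mind follows. Everything here (`guardedClient`, `send`, `errClosed`, `errFull`) is a hypothetical illustration and not the library's API. The send is non-blocking so the lock is never held across a blocked channel write; whether that fits the real client is an assumption on my part.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errClosed = errors.New("client closed")
	errFull   = errors.New("channel full")
)

// guardedClient is a hypothetical sketch, not the library's grpcClient.
type guardedClient struct {
	mu      sync.Mutex
	closed  bool
	channel chan string
}

func newGuardedClient(buf int) *guardedClient {
	return &guardedClient{channel: make(chan string, buf)}
}

// send holds the lock across both the closed check and the channel write,
// so close() cannot run in between the two.
func (c *guardedClient) send(msg string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return errClosed
	}
	select {
	case c.channel <- msg:
		return nil
	default:
		return errFull // never block while holding the lock
	}
}

// close is idempotent and takes the same lock as send.
func (c *guardedClient) close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.closed {
		c.closed = true
		close(c.channel)
	}
}

func main() {
	c := newGuardedClient(16)
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			_ = c.send(fmt.Sprintf("error from goroutine %d", id))
		}(i)
	}
	c.close() // may race with the senders, but cannot cause a panic
	wg.Wait()
	fmt.Println(c.send("late")) // prints "client closed"
}
```

With this shape a late error report gets an error back instead of panicking, at the cost of serializing all writes to the channel.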

Let me know if you need any more details, thanks in advance

