Describe the bug
Ingester.Push (pkg/ingester/ingester.go:1438) acquires a TSDB appender via app := db.Appender(ctx).(extendedAppender) and then performs request validation that can return without ever reaching app.Commit() or app.Rollback(). The clearest leak path is the out-of-order-label-set check at pkg/ingester/ingester.go:1454-1456:
if i.isLabelSetOutOfOrder(tsLabels) {
i.metrics.oooLabelsTotal.WithLabelValues(userID).Inc()
return nil, wrapWithUser(errors.Errorf("out-of-order label set found when push: %s", tsLabels), userID)
}
When this branch fires, the appender's TSDB head series references, mmap'd chunks and pending state are never released. Each such request leaks one active appender. Observable via the cortex_ingester_tsdb_head_active_appenders gauge growing monotonically.
The two existing explicit app.Rollback() blocks inside the per-sample / per-histogram failure paths only cover specific append-error returns; they don't protect the OOO check above them, and a future return path added between them and Commit() would have the same problem.
To Reproduce
- Run the ingester with at least one tenant that has an active TSDB head.
- Send a Push request whose
Timeseries[0].Labels are not sorted (the ingester normally relies on the distributor to sort labels, but the check exists as a guard).
- Observe
cortex_ingester_tsdb_head_active_appenders for that tenant — it increments by 1 per Push and never drops.
Regression test (added in #7528):
go test -tags \"netgo slicelabels\" -run TestIngester_Push_OutOfOrderLabels_AppenderNotLeaked ./pkg/ingester/...
Without the fix, the test pushes 5 OOO-label-set requests and cortex_ingester_tsdb_head_active_appenders reads 5 instead of 0.
Expected behavior
The appender is released (via Rollback()) on every path that returns from Push before Commit() succeeds. The active-appenders gauge returns to 0 after a request is rejected.
Environment
- Cortex master at
29b6167d6 (and reproducible on later commits).
- Found via a static audit of CPU / memory / goroutine leaks across
pkg/.
Additional context
Fix is being submitted as #7528: defer Rollback() immediately after db.Appender(ctx), flip a committed boolean to true right before app.Commit() (Prometheus's headAppender.Commit closes the appender even on failure, so a second Rollback from the defer must be skipped). The two redundant in-loop Rollback() calls are consolidated under the same defer. Regression test asserts cortex_ingester_tsdb_head_active_appenders == 0 after repeated OOO pushes.
🤖 Reported with help from Claude Code
Describe the bug
Ingester.Push(pkg/ingester/ingester.go:1438) acquires a TSDB appender viaapp := db.Appender(ctx).(extendedAppender)and then performs request validation that can return without ever reachingapp.Commit()orapp.Rollback(). The clearest leak path is the out-of-order-label-set check atpkg/ingester/ingester.go:1454-1456:When this branch fires, the appender's TSDB head series references, mmap'd chunks and pending state are never released. Each such request leaks one active appender. Observable via the
cortex_ingester_tsdb_head_active_appendersgauge growing monotonically.The two existing explicit
app.Rollback()blocks inside the per-sample / per-histogram failure paths only cover specific append-error returns; they don't protect the OOO check above them, and a future return path added between them andCommit()would have the same problem.To Reproduce
Timeseries[0].Labelsare not sorted (the ingester normally relies on the distributor to sort labels, but the check exists as a guard).cortex_ingester_tsdb_head_active_appendersfor that tenant — it increments by 1 per Push and never drops.Regression test (added in #7528):
Without the fix, the test pushes 5 OOO-label-set requests and
cortex_ingester_tsdb_head_active_appendersreads5instead of0.Expected behavior
The appender is released (via
Rollback()) on every path that returns fromPushbeforeCommit()succeeds. The active-appenders gauge returns to0after a request is rejected.Environment
29b6167d6(and reproducible on later commits).pkg/.Additional context
Fix is being submitted as #7528: defer
Rollback()immediately afterdb.Appender(ctx), flip acommittedboolean to true right beforeapp.Commit()(Prometheus'sheadAppender.Commitcloses the appender even on failure, so a secondRollbackfrom the defer must be skipped). The two redundant in-loopRollback()calls are consolidated under the same defer. Regression test assertscortex_ingester_tsdb_head_active_appenders == 0after repeated OOO pushes.🤖 Reported with help from Claude Code