Skip to content

fix(sei-db): fix flaky RocksDB parallel iteration test#2928

Closed
pdrobnjak wants to merge 3 commits intomainfrom
pd/fix-rocks-db-ci-fail
Closed

fix(sei-db): fix flaky RocksDB parallel iteration test#2928
pdrobnjak wants to merge 3 commits intomainfrom
pd/fix-rocks-db-ci-fail

Conversation

@pdrobnjak
Copy link
Contributor

Summary

  • Fixes a use-after-free race condition in the RocksDB iterator that caused TestDatabaseParallelIterationVersions to flake
  • The ReadOptions object (and its timestamp slice) created by newTSReadOptions() was passed to NewIteratorCF() but never stored, making it eligible for GC while the C++ iterator still held a dangling Slice pointer to the timestamp data
  • Under parallel access (99 goroutines in the test), GC pressure caused iterators to read with corrupted timestamps, returning values from wrong versions (e.g., expected val002-011 but got val002-017)
  • The fix stores ReadOptions in the iterator struct and calls Destroy() on Close(), ensuring the timestamp data remains valid for the iterator's lifetime
  • Also fixes the same ReadOptions leak in getSlice() (used by Get/Has)

Test plan

  • TestDatabaseParallelIterationVersions — the previously flaky test should now pass consistently
  • All other RocksDB tests unaffected (no API changes beyond the internal plumbing)

🤖 Generated with Claude Code

@github-actions
Copy link

github-actions bot commented Feb 19, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedFeb 27, 2026, 2:22 PM

@pdrobnjak pdrobnjak force-pushed the pd/fix-rocks-db-ci-fail branch from 5a7372d to 65d9a47 Compare February 19, 2026 09:33
@codecov
Copy link

codecov bot commented Feb 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.65%. Comparing base (83c9561) to head (c91e04b).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2928      +/-   ##
==========================================
- Coverage   58.12%   57.65%   -0.48%     
==========================================
  Files        2111     2111              
  Lines      173544   173671     +127     
==========================================
- Hits       100879   100130     -749     
- Misses      63706    64586     +880     
+ Partials     8959     8955       -4     
Flag Coverage Δ
sei-db 69.72% <ø> (+0.21%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/db_engine/rocksdb/mvcc/db.go 58.80% <ø> (+0.39%) ⬆️

... and 78 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The ReadOptions (and its timestamp slice) created by newTSReadOptions()
was not stored after being passed to NewIteratorCF(), making it eligible
for GC while the C++ iterator still held a dangling Slice pointer to the
timestamp data. Under parallel access this caused iterators to read with
corrupted timestamps, returning values from wrong versions.

Store ReadOptions in the iterator struct and Destroy() it on Close().
Also fix the same leak in getSlice() used by Get/Has.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pdrobnjak pdrobnjak enabled auto-merge (squash) February 19, 2026 20:57
Resolve conflicts in sei-db/db_engine/rocksdb/mvcc/:
- db.go: remove redundant [:] on already-slice variables, drop trivial
  whitespace difference
- iterator.go: keep main's sync.Once-guarded Close() with nil-checks
  for thread-safe iterator cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pdrobnjak pdrobnjak closed this Feb 27, 2026
auto-merge was automatically disabled February 27, 2026 14:22

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants