bug: fix fetch range entry from deleted table #2

Merged
zhangh43 merged 1 commit into eloqdata:main from lokax:fix-post-ckpt-assert
Jul 10, 2025

@lokax lokax commented Jul 10, 2025

Collaborator

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

CLAassistant commented Jul 10, 2025

CLA assistant check
All committers have signed the CLA.

@zhangh43 zhangh43 left a comment

Contributor

+1

@zhangh43 zhangh43 merged commit d78c276 into eloqdata:main Jul 10, 2025
2 checks passed
githubzilla added a commit that referenced this pull request Mar 21, 2026
Bug #3 (root cause of assertion crash): CleanBucketData/CleanRangeData could
free CcEntries with BeingCkpt=true, causing dirty count double-decrement when
the checkpoint callback (UpdateCceCkptTsCc) later runs on the freed entry.
Fix: CanBeCleaned() now returns !GetBeingCkpt() for CleanBucketData,
CleanRangeData, and CleanRangeDataForMigration. Entries being checkpointed
are skipped and retried later.
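
The cleaning guard described for Bug #3 can be illustrated with a toy model. This is only a sketch: `CleanableEntry`, `CleanPass`, and the `dirty`/`freed` fields are hypothetical stand-ins, not the real CcEntry layout or the real cleaning routines. Only the rule that `CanBeCleaned()` returns `!GetBeingCkpt()` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CcEntry; only the state relevant here is modeled.
struct CleanableEntry
{
    bool being_ckpt{false};  // a checkpoint is in flight for this entry
    bool dirty{false};       // counted in the dirty counter
    bool freed{false};       // released by a cleaning pass

    bool GetBeingCkpt() const { return being_ckpt; }
};

// Rule from the commit message: an entry being checkpointed must not be cleaned.
inline bool CanBeCleaned(const CleanableEntry &e)
{
    return !e.GetBeingCkpt();
}

// Toy cleaning pass (assumed shape). Frees every entry that can be cleaned and
// returns how many entries were skipped so that the caller can retry later.
inline std::size_t CleanPass(std::vector<CleanableEntry> &entries,
                             std::size_t &dirty_count)
{
    std::size_t skipped = 0;
    for (CleanableEntry &e : entries)
    {
        if (e.freed)
        {
            continue;
        }
        if (!CanBeCleaned(e))
        {
            ++skipped;  // leave it for the checkpoint callback to finish
            continue;
        }
        if (e.dirty && dirty_count > 0)
        {
            --dirty_count;  // the only decrement for this entry
            e.dirty = false;
        }
        e.freed = true;
    }
    return skipped;
}
```

In this model an entry with `being_ckpt` set survives the first pass, so the checkpoint callback is the only place that decrements its dirty count; a second pass after the callback frees it.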

Bug #1: TemplateCcMap::BackFill called SetCkptTs() before
SetCommitTsPayloadStatus(), which overwrites commit_ts_and_status_ and
clears the flush bit, leaving the entry dirty without incrementing the
counter. Also missing OnCommittedUpdate in ReadOutsideCc backfill path.
Fix: Reorder to SetCommitTsPayloadStatus first, then SetCkptTs, and add
OnCommittedUpdate in both BackFill and ReadOutsideCc paths.
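
Why the call order matters for Bug #1 can be seen in a small model. The bit layout below (flush bit in the lowest bit of `commit_ts_and_status_`) and the condition under which `SetCkptTs` sets it are assumptions made for illustration; the commit message only states that `SetCommitTsPayloadStatus()` overwrites the word and clears the flush bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical entry with the commit timestamp and a flush bit packed together.
struct PackedEntry
{
    static constexpr std::uint64_t kFlushBit = 1ULL;  // assumed layout
    std::uint64_t commit_ts_and_status_{0};

    // Overwrites the whole word, so a previously set flush bit is lost.
    void SetCommitTsPayloadStatus(std::uint64_t commit_ts)
    {
        commit_ts_and_status_ = commit_ts << 1;
    }

    // Assumed behavior: mark the entry flushed when the checkpoint covers
    // the current commit timestamp.
    void SetCkptTs(std::uint64_t ckpt_ts)
    {
        if (ckpt_ts >= CommitTs())
        {
            commit_ts_and_status_ |= kFlushBit;
        }
    }

    std::uint64_t CommitTs() const { return commit_ts_and_status_ >> 1; }
    bool IsPersistent() const { return (commit_ts_and_status_ & kFlushBit) != 0; }
};

// Order before the fix: the flush bit set by SetCkptTs is wiped afterwards.
inline bool BackFillOldOrder(PackedEntry &e, std::uint64_t ts)
{
    e.SetCkptTs(ts);
    e.SetCommitTsPayloadStatus(ts);
    return e.IsPersistent();
}

// Order after the fix: the commit timestamp is written first, then the flush bit.
inline bool BackFillNewOrder(PackedEntry &e, std::uint64_t ts)
{
    e.SetCommitTsPayloadStatus(ts);
    e.SetCkptTs(ts);
    return e.IsPersistent();
}
```

Under these assumptions the old order leaves a freshly backfilled entry looking dirty even though it was just read from storage, which is the state the commit message describes as "dirty without incrementing the counter".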

Bug #2: ClusterConfigCcMap called SetCommitTsPayloadStatus() at two sites
without OnCommittedUpdate(), making entries dirty without counting them.
Fix: Add OnCommittedUpdate after both SetCommitTsPayloadStatus calls.
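
The accounting rule behind Bug #2 can be sketched as follows. `CountedEntry`, `CountedMap`, `OnFlushed`, and the `OnCommittedUpdate(entry, was_dirty)` signature are hypothetical; the real signature is not shown in this conversation. The sketch only illustrates pairing every status update with a counter update.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical entry: committing a new version makes it dirty.
struct CountedEntry
{
    bool dirty{false};
    std::uint64_t commit_ts{0};

    void SetCommitTsPayloadStatus(std::uint64_t ts)
    {
        commit_ts = ts;
        dirty = true;
    }
};

// Hypothetical map that tracks how many of its entries are dirty.
struct CountedMap
{
    std::size_t dirty_count{0};

    // Assumed signature: count the entry only on a clean -> dirty transition.
    void OnCommittedUpdate(const CountedEntry &e, bool was_dirty)
    {
        if (e.dirty && !was_dirty)
        {
            ++dirty_count;
        }
    }

    // Assumed counterpart run when the entry has been flushed.
    void OnFlushed(CountedEntry &e)
    {
        if (e.dirty && dirty_count > 0)
        {
            --dirty_count;
        }
        e.dirty = false;
    }
};

// The pattern the fix applies: never update the status without the accounting.
inline void CommitWithAccounting(CountedMap &m, CountedEntry &e, std::uint64_t ts)
{
    const bool was_dirty = e.dirty;
    e.SetCommitTsPayloadStatus(ts);
    m.OnCommittedUpdate(e, was_dirty);
}
```

Calling `SetCommitTsPayloadStatus` on its own in this model would leave `dirty_count` at zero while the entry is dirty, so the later flush would have nothing to decrement.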

Also relax UpdateCceCkptTsCc assertions to allow IsPersistent() being true,
since concurrent BackFill/ReadOutsideCc can legitimately mark an entry
persistent before the checkpoint callback runs.
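
A checkpoint callback that tolerates an already persistent entry might look like the sketch below. `CkptEntry` and `FinishCheckpoint` are hypothetical names, and skipping the decrement for a persistent entry is an assumption about how the relaxed assertion avoids miscounting, not a description of the actual UpdateCceCkptTsCc code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical entry state seen by the checkpoint callback.
struct CkptEntry
{
    bool being_ckpt{false};
    bool persistent{false};
};

// Toy callback. It no longer requires the entry to be non-persistent, because
// a concurrent backfill may have marked it persistent first. Returns true if
// this call was the one that cleared the dirty state.
inline bool FinishCheckpoint(CkptEntry &e, std::size_t &dirty_count)
{
    bool cleared_here = false;
    if (!e.persistent)
    {
        e.persistent = true;
        if (dirty_count > 0)
        {
            --dirty_count;
        }
        cleared_here = true;
    }
    e.being_ckpt = false;  // the entry may be cleaned from now on
    return cleared_here;
}
```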
githubzilla added a commit that referenced this pull request Mar 24, 2026
liunyl pushed a commit that referenced this pull request Jun 15, 2026