Range sizes definition #438

Merged
yangsw26 merged 1 commit into shard_keys_one_core_for_rangepartition_base from range_size_definition
Mar 13, 2026

@yangsw26 yangsw26 commented Mar 12, 2026

Collaborator

In order to trigger range splits in a timely manner, the range size is tracked in memory.

Init range sizes interface
Update range size interface
Reset range splitting status interface

Summary by CodeRabbit

  • New Features

    • Automatic per-range size tracking with on-demand lazy loading for partitioned data
    • Range split decisions triggered when partitions exceed configured size thresholds
    • Ability to initialize or reset range status to control retry/loading behavior
  • Documentation

    • Added usage notes for range initialization and reset behaviors

@yangsw26 yangsw26 self-assigned this Mar 12, 2026
@yangsw26 yangsw26 requested a review from liunyl March 12, 2026 04:22

coderabbitai Bot commented Mar 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e9f20503-671f-4677-b5dd-6d83cf67b874

📥 Commits

Reviewing files that changed from the base of the PR and between 7098f16 and 9a30770.

📒 Files selected for processing (4)
  • tx_service/include/cc/cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/type.h
  • tx_service/src/cc/cc_map.cpp
🚧 Files skipped from review as they are similar to previous changes (3)
  • tx_service/src/cc/cc_map.cpp
  • tx_service/include/type.h
  • tx_service/include/cc/cc_map.h

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description explains the purpose and lists the three main interface additions, but does not follow the provided template structure with required sections like tests, documentation, issue references, or test suite confirmation. | Update the description to follow the template format by adding sections for tests, documentation links, issue references (if applicable), RFC references, and test suite status confirmation. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Range sizes definition' accurately summarizes the main change: introducing new interfaces and data structures for tracking range sizes in memory. |


@yangsw26 yangsw26 linked an issue Mar 12, 2026 that may be closed by this pull request
8 tasks

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tx_service/include/cc/template_cc_map.h`:
- Around line 11940-11953: The code path that transitions
std::get<0>(it->second) from RangeSizeStatus::kNotInitialized to kLoading
returns before applying the incoming delta, losing the first delta_size; modify
the branch handling (around the check of RangeSizeStatus and the call to
shard_->FetchTableRangeSize(table_name_, static_cast<int32_t>(partition_id),
cc_ng_id_, ng_term)) to record/apply the current delta_size into the stored
tuple (.second) before returning (e.g., update the size/delta field in
it->second so the initial in-memory change is preserved while the async load is
in progress).
- Around line 11969-11981: The current logic computes need_split and then
unconditionally writes it back into std::get<2>(it->second), which lets a
pre-set true flag be overwritten to false on a later update; change the update
to latch the split flag until an explicit reset clears it by reading the
existing flag first and only setting std::get<2>(it->second) = existing_flag ||
need_split (or equivalently compute need_split = existing_flag ||
new_condition), so the stored split state for the tuple containing
std::get<2>(it->second) is preserved across updates until the reset path clears
it; reference symbols: need_split, std::get<2>(it->second),
StoreRange::range_max_size, table_name_.StringView(), partition_id.

In `@tx_service/src/cc/cc_map.cpp`:
- Around line 481-486: After applying the persisted_size + delta and clearing
the delta in the async-load success path, run the same split-evaluation logic
used elsewhere so an oversized range doesn't remain unsplit: compute final_size,
compare it to the split threshold, set the per-range "split triggered" flag if
not already set, and either call the existing UpdateRangeSize (or the same
enqueue/signal used by writes to request a split) or surface a callback so the
caller can enqueue the split immediately; update the code around the block that
modifies std::get<0>(it->second) and std::get<1>(it->second) to perform that
check and signal.
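
The three findings above can be illustrated together with a minimal C++ sketch. It is not the code from this PR: `RangeSizeTracker`, `RangeSizeEntry`, `pending_delta`, `kReady`, `OnLoaded`, and `ResetSplit` are hypothetical names chosen for illustration, and the real implementation stores a tuple per partition and starts the load through `shard_->FetchTableRangeSize(...)`. The sketch only shows the intended behavior: deltas that arrive before the persisted size is loaded are kept, the split flag stays set until an explicit reset, and the split check also runs when the async load completes.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for illustration; the real tx_service types may differ.
enum class RangeSizeStatus { kNotInitialized, kLoading, kReady };

struct RangeSizeEntry {
    RangeSizeStatus status = RangeSizeStatus::kNotInitialized;
    int64_t size = 0;           // in-memory range size, valid once status is kReady
    int64_t pending_delta = 0;  // changes recorded before the persisted size arrives
    bool need_split = false;    // latched until ResetSplit() clears it
};

class RangeSizeTracker {
public:
    explicit RangeSizeTracker(int64_t max_size) : max_size_(max_size) {
        assert(max_size_ > 0);
    }

    // Applies a size change. Returns true when the caller should start an
    // asynchronous load of the persisted size (first touch of this range).
    bool Update(uint32_t partition_id, int64_t delta) {
        RangeSizeEntry &e = ranges_[partition_id];
        if (e.status != RangeSizeStatus::kReady) {
            // Finding 1: record the delta before returning so it is not lost.
            e.pending_delta += delta;
            const bool start_load =
                (e.status == RangeSizeStatus::kNotInitialized);
            e.status = RangeSizeStatus::kLoading;
            return start_load;
        }
        e.size += delta;
        Evaluate(e);
        return false;
    }

    // Called when the asynchronous load succeeds.
    void OnLoaded(uint32_t partition_id, int64_t persisted_size) {
        RangeSizeEntry &e = ranges_[partition_id];
        e.size = persisted_size + e.pending_delta;
        e.pending_delta = 0;
        e.status = RangeSizeStatus::kReady;
        // Finding 3: an oversized range must be detected here as well.
        Evaluate(e);
    }

    // Explicit reset, the only path that clears the split flag.
    void ResetSplit(uint32_t partition_id) {
        ranges_[partition_id].need_split = false;
    }

    bool NeedSplit(uint32_t partition_id) const {
        auto it = ranges_.find(partition_id);
        return it != ranges_.end() && it->second.need_split;
    }

    int64_t Size(uint32_t partition_id) const {
        auto it = ranges_.find(partition_id);
        return it == ranges_.end() ? 0 : it->second.size;
    }

private:
    // Finding 2: OR with the existing flag so a later, smaller update
    // cannot overwrite a pending split request with false.
    void Evaluate(RangeSizeEntry &e) const {
        e.need_split = e.need_split || e.size > max_size_;
    }

    int64_t max_size_;
    std::unordered_map<uint32_t, RangeSizeEntry> ranges_;
};
```

In this sketch, two updates of 60 and 50 bytes that arrive before the load finishes are both retained, so a persisted size of 10 yields 120 once `OnLoaded` runs. With a threshold of 100 the split flag is then set, and it stays set after a later shrinking update until `ResetSplit` is called.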

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf47539a-aaaf-4f4b-8531-5724b969884b

📥 Commits

Reviewing files that changed from the base of the PR and between f94af00 and 7098f16.

📒 Files selected for processing (4)
  • tx_service/include/cc/cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/type.h
  • tx_service/src/cc/cc_map.cpp

Comment thread tx_service/include/cc/template_cc_map.h
Comment thread tx_service/include/cc/template_cc_map.h Outdated
Comment thread tx_service/src/cc/cc_map.cpp
Init range sizes interface

the interface of update range size

fix range size status when updating

add reset range status interface

fix range size interface
@yangsw26 yangsw26 force-pushed the range_size_definition branch from 7098f16 to 9a30770 Compare March 12, 2026 06:13
@yangsw26 yangsw26 changed the base branch from main to shard_keys_one_core_for_rangepartition_base March 12, 2026 06:18
@yangsw26 yangsw26 mentioned this pull request Mar 12, 2026
5 tasks
@yangsw26 yangsw26 merged commit 46b9d87 into shard_keys_one_core_for_rangepartition_base Mar 13, 2026
4 checks passed
yangsw26 added a commit that referenced this pull request Mar 16, 2026
* Range sizes definition (#438)

In order to trigger range splits in a timely manner, the range size is tracked in memory.

Init range sizes interface
Update range size interface
Reset range splitting status interface

* Update store range size after datasync (#439)

1. Update storage range size
After a successful flush, the store range size is persisted.

2. Load range size from storage
When the in-memory range size is accessed before it has been initialized, the range size is fetched from storage.

* Maintaining range size on postwritecc (#440)

1. During a post-write operation, the range size information for the corresponding partition is maintained.

2. If a range is in the process of splitting, the size change is recorded in the delta size instead.

3. For a "double-write" operation, only the range size information for the newly split partition is updated.

* Reset range size after range split (#441)

1. In the post-commit phase of a range split transaction, the range size of all related partitions is updated: base range size + delta size.

2. Reset the range splitting flag.

3. Update the kickoutcc process to accommodate the new key sharding logic.

4. Update the processing procedure for SampleSubRangeKeys to accommodate the new key sharding logic.
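
The size bookkeeping around a split (delta accumulation while splitting, then base range size + delta size and a flag reset in post-commit) can be sketched as follows. This is an illustrative assumption, not the repository's code: `SplitRangeSize`, `RecordWrite`, and `FinishSplit` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical per-range bookkeeping; field names are illustrative only.
struct SplitRangeSize {
    int64_t base_size = 0;   // size known before the split started
    int64_t delta_size = 0;  // changes accumulated while the split is running
    bool splitting = false;  // set while a range split is in progress
};

// Post-write path: while the range is splitting, changes go to the delta.
inline void RecordWrite(SplitRangeSize &r, int64_t bytes) {
    if (r.splitting) {
        r.delta_size += bytes;
    } else {
        r.base_size += bytes;
    }
}

// Post-commit of the split transaction: fold the delta into the base size
// and reset the splitting flag.
inline void FinishSplit(SplitRangeSize &r) {
    r.base_size += r.delta_size;
    r.delta_size = 0;
    r.splitting = false;
}
```

Keeping the delta separate means writes that land during the split are not lost when the base size of each related partition is replaced at post-commit time.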

* Update range size during create secondary index (#442)

1. Update range size during UploadBatchCc request for new index.

2. Update the UploadBatchCc process to accommodate the new key sharding logic.

3. Update the create secondary index process to accommodate the new key sharding logic.

* Update range split replay log execution (#443)

1. Update the range size during data log replay.

2. For a post-commit range split log, update the range size for each newly split range.

3. Update the data log replay process to accommodate the new key sharding logic.

* Update scanslice request (#444)

Update the structure definitions and related processing procedures of ScanSliceCc and RemoteScanSlice to adapt to the new key sharding logic.

* Adapt DataSync with new key sharding (#445)

1. Update the structure definitions and related processing procedures of DataSyncScanCc and ScanSliceDeltaSizeCc, as well as the DataSync processing procedure, to adapt to the new key sharding logic.

2. Update key shard code for UpdateCkptTs request.

* Adapt load slice with new key sharding (#446)

Update the structure definition and related processing procedures of FillStoreSlice to adapt to the new key sharding logic.

* Update process read/batchread operation (#447)

Adapt read operation with new key sharding for range partition.

* Create a high-priority DataSync task to trigger range split (#448)

Update datasynctask constructor to check new and old range owner shard

* Adapt cache sender with new key sharding (#449)

To reduce the cache miss rate during range splitting, keys on the new range that fall on other cores (on local or remote nodes) can be sent to the corresponding core.

1. Update the logic and related requests for sending range cache during range split

2. Update key shard for UploadBatchSlices rpc.

* Update key cache to adapt to the new key sharding strategy (#450)

This includes InitKeyCacheCc, UpdateKeyCacheCc, StoreSlice::cache_validity_, and StoreRange::key_cache_.

* Reset range splitting status after datasync (#451)

If a datasync task that is supposed to trigger a range split ends without actually triggering the split, the range splitting state needs to be reset.

* Update comment (#455)

Development

Successfully merging this pull request may close these issues.

[Feature]: Range size tracking

2 participants