
Bound node memory limit by node_memory_limit #354

Merged
liunyl merged 1 commit into main from memory_limit on Jan 9, 2026
Conversation

@liunyl liunyl commented Jan 9, 2026

Contributor

node_memory_limit now bounds the total memory usage of the process rather than only the data substrate part. EloqStore memory usage is subtracted from the node memory limit.

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Node memory limit is now configurable via command-line flag with automatic detection of available system memory to ensure optimal performance.
  • Configuration Changes

    • Improved memory allocation strategy for storage components based on total available node memory.

✏️ Tip: You can customize this high-level summary in your review settings.

@liunyl liunyl requested a review from MrGuin January 9, 2026 08:02
coderabbitai Bot commented Jan 9, 2026


Warning

Rate limit exceeded

@liunyl has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 33 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7bcfa6d and 0815c3d.

📒 Files selected for processing (5)
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/storage_init.cpp
  • core/src/tx_service_init.cpp
  • store_handler/eloq_data_store_service/eloq_store_config.cpp

Walkthrough

The PR refactors node memory limit handling by moving the node_memory_limit_mb gflag from tx_service_init.cpp to data_substrate.cpp, centralizing memory configuration in CoreConfig, and eliminating the remaining_node_memory_mb_ member from DataSubstrate. Additionally, the EloqStore default index buffer pool allocation is reduced from 50% to 30% of node memory.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Memory limit configuration centralization: core/include/data_substrate.h, core/src/data_substrate.cpp | Added node_memory_limit_mb field to CoreConfig; removed remaining_node_memory_mb_ from DataSubstrate; introduced gflag node_memory_limit_mb (default 8192 MB) and moved memory limit calculation logic from Start() to LoadCoreAndNetworkConfig() with auto-configuration support and logging. |
| Memory limit reference updates: core/src/storage_init.cpp, core/src/tx_service_init.cpp | Updated references from remaining_node_memory_mb_ to core_config_.node_memory_limit_mb; removed duplicate gflag declaration from tx_service_init.cpp; streamlined memory configuration logic and added logging. |
| EloqStore default allocation adjustment: store_handler/eloq_data_store_service/eloq_store_config.cpp | Changed default index buffer pool size calculation from node_memory_mb / 2 (50%) to node_memory_mb / 10 * 3 (30%) when not provided by config. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • xiexiaoy
  • zhangh43

Poem

🐰 A rabbit hops through memory's land,
Flags consolidated, refactored with care,
From scattered stages to one master plan—
Limits now centered, configuration fair!
Buffers resized with wisdom's design,
Where memory flows, the system shines ✨

🚥 Pre-merge checks | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description explains the intent but is incomplete; all required checklist items (tests, documentation, issue reference, RFC reference, and test suite pass confirmation) remain unchecked and unresolved. | Complete the pre-merge checklist by adding test coverage, documentation updates, issue/RFC references, and confirming test suite passage before merging. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Bound node memory limit by node_memory_limit' is vague and unclear; it uses a redundant phrasing that doesn't clearly convey the main change to a reader scanning history. | Rephrase to be more specific and clear, such as 'Make node_memory_limit bound total process memory usage' or 'Refactor memory limiting to use node_memory_limit_mb for all subsystems'. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch memory_limit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
core/src/data_substrate.cpp (1)

740-742: Potential integer overflow in memory calculation.

The expression meminfo.totalram * meminfo.mem_unit is evaluated in unsigned long, so it could overflow wherever unsigned long is only 32 bits wide. meminfo.mem_unit is typically 1 (bytes) or a small power of 2, but on a 32-bit build with PAE, or in other edge cases, multiplying before dividing could wrap around.

Consider restructuring to divide earlier or use explicit 64-bit arithmetic:

♻️ Suggested fix
-            uint32_t mem_limit_mib =
-                meminfo.totalram * meminfo.mem_unit / (1024 * 1024) * 4 / 5;
+            uint64_t total_mem_mib =
+                static_cast<uint64_t>(meminfo.totalram) * meminfo.mem_unit / (1024 * 1024);
+            uint32_t mem_limit_mib = static_cast<uint32_t>(total_mem_mib * 4 / 5);
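The effect of the suggested widening can be illustrated with a minimal sketch. The two helpers below are hypothetical and not part of the PR; the second one deliberately evaluates the original formula in 32-bit arithmetic to imitate a platform where unsigned long is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the PR): 80% of total RAM in MiB, with both
// operands widened to 64 bits before the multiplication.
inline uint64_t EightyPercentMiB(uint64_t total_units, uint64_t bytes_per_unit)
{
    assert(bytes_per_unit > 0);  // a zero unit size would be meaningless
    const uint64_t total_bytes = total_units * bytes_per_unit;
    const uint64_t total_mib = total_bytes / (1024ULL * 1024ULL);
    return total_mib * 4 / 5;
}

// Hypothetical helper: the same formula evaluated entirely in 32-bit unsigned
// arithmetic, so the product wraps modulo 2^32 before the division happens.
inline uint32_t EightyPercentMiB32(uint32_t total_units, uint32_t bytes_per_unit)
{
    return total_units * bytes_per_unit / (1024u * 1024u) * 4u / 5u;
}
```

For 8 GiB reported as 2097152 units of 4096 bytes, the 64-bit version yields 6553 MiB, while the 32-bit version wraps to 0 because the product is exactly 2^33.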
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b3feed and 7bcfa6d.

📒 Files selected for processing (5)
  • core/include/data_substrate.h
  • core/src/data_substrate.cpp
  • core/src/storage_init.cpp
  • core/src/tx_service_init.cpp
  • store_handler/eloq_data_store_service/eloq_store_config.cpp
🧰 Additional context used
🧬 Code graph analysis (1)
core/src/data_substrate.cpp (4)
core/include/data_substrate.h (1)
  • CheckCommandLineFlagIsDefault (312-323)
store_handler/eloq_data_store_service/eloq_store_config.cpp (2)
  • CheckCommandLineFlagIsDefault (133-145)
  • CheckCommandLineFlagIsDefault (133-133)
store_handler/eloq_data_store_service/main.cpp (2)
  • CheckCommandLineFlagIsDefault (89-101)
  • CheckCommandLineFlagIsDefault (89-89)
store_handler/eloq_data_store_service/rocksdb_config.cpp (2)
  • CheckCommandLineFlagIsDefault (311-323)
  • CheckCommandLineFlagIsDefault (311-311)
🔇 Additional comments (5)
core/include/data_substrate.h (1)

139-139: LGTM!

The addition of node_memory_limit_mb to CoreConfig centralizes memory configuration appropriately. This aligns with the PR objective of having the node memory limit bound the total process memory rather than just the data substrate part.

core/src/storage_init.cpp (1)

291-292: Verify the intent of modifying core_config_.node_memory_limit_mb by reference.

The EloqStoreConfig constructor takes node_memory_mb as a non-const reference and subtracts the buffer pool size from it (line 389 in eloq_store_config.cpp). This means core_config_.node_memory_limit_mb will be permanently reduced after this call.

If this is intentional (so that the remaining memory is available for tx_service), then this is correct. However, be aware that:

  1. GetCoreConfig() will return the modified value after this point
  2. The log message "Data substrate memory limit" in tx_service_init.cpp will show the reduced value

If the original limit should be preserved, consider passing a copy instead.
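The side effect described here can be sketched in isolation. StoreConfigSketch below is a hypothetical stand-in, not the real EloqStoreConfig; it only assumes that the constructor receives the budget by non-const reference and deducts its buffer pool from it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a storage config (not the real EloqStoreConfig).
struct StoreConfigSketch
{
    // Takes the node budget by non-const reference and deducts its own share.
    StoreConfigSketch(uint32_t &node_memory_mb, uint32_t buffer_pool_mb)
        : buffer_pool_mb_(buffer_pool_mb)
    {
        assert(buffer_pool_mb <= node_memory_mb);
        node_memory_mb -= buffer_pool_mb;
    }

    uint32_t buffer_pool_mb_;
};

// Passing the shared field itself: the caller sees the reduced value afterwards.
inline uint32_t RemainingAfterStore(uint32_t node_limit_mb, uint32_t pool_mb)
{
    uint32_t shared_budget = node_limit_mb;  // plays the role of the config field
    StoreConfigSketch store(shared_budget, pool_mb);
    (void)store;
    return shared_budget;
}

// Passing a copy: only the copy is reduced, the original limit is preserved.
inline uint32_t OriginalAfterCopy(uint32_t node_limit_mb, uint32_t pool_mb)
{
    uint32_t copy = node_limit_mb;
    StoreConfigSketch store(copy, pool_mb);
    (void)store;
    return node_limit_mb;
}
```

With an 8192 MB limit and a 2457 MB pool, the first helper returns 5735 and the second returns 8192, which is the difference between the two designs the comment weighs.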

store_handler/eloq_data_store_service/eloq_store_config.cpp (1)

369-376: Default buffer pool allocation reduced from 50% to 30%.

This is a significant behavioral change that reduces the default EloqStore index buffer pool from 50% to 30% of available memory. While this aligns with the PR's goal of better bounding total memory usage, existing deployments relying on the previous default may experience different performance characteristics.

Consider documenting this change in release notes or migration guides to inform users who may need to explicitly configure eloq_store_index_buffer_pool_size to maintain previous behavior.
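To make the size of the change concrete, the two default formulas quoted in the walkthrough can be compared directly. The function names below are hypothetical; only the expressions node_memory_mb / 2 and node_memory_mb / 10 * 3 come from the review summary. Note that the new expression divides first, so integer truncation makes the result slightly smaller than an exact 30%.

```cpp
#include <cassert>
#include <cstdint>

// Previous default: half of the node memory (expression from the PR summary).
inline uint32_t OldDefaultPoolMb(uint32_t node_memory_mb)
{
    return node_memory_mb / 2;
}

// New default: integer division by 10 first, then times 3, so the result
// rounds down to a multiple of 3.
inline uint32_t NewDefaultPoolMb(uint32_t node_memory_mb)
{
    return node_memory_mb / 10 * 3;
}

// Hypothetical helper: memory left for everything else once the pool is taken.
inline uint32_t RemainingMb(uint32_t node_memory_mb, uint32_t pool_mb)
{
    assert(pool_mb <= node_memory_mb);
    return node_memory_mb - pool_mb;
}
```

For the default 8192 MB limit, the old formula gives 4096 MB and the new one gives 2457 MB, leaving 5735 MB instead of 4096 MB for the rest of the process.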

core/src/tx_service_init.cpp (1)

190-196: LGTM!

The refactoring to use core_config_.node_memory_limit_mb instead of a local gflag correctly centralizes memory configuration. The log message provides useful visibility into the configured memory limit.

Note: As mentioned in the storage_init.cpp review, by this point the value may have been reduced by the EloqStore buffer pool allocation if InitializeStorageHandler was called first. The log accurately reflects the memory available for the data substrate portion.

core/src/data_substrate.cpp (1)

721-752: LGTM - Memory limit configuration logic is well-structured.

The configuration precedence (command-line flag → config file → auto-detection) is consistent with the pattern used for other configuration values like core_number. The auto-detection using 80% of system memory with a 2048 MiB floor is a sensible default.

One minor observation: the log message at lines 743-746 says "available memory" but shows totalram which is total physical RAM, not necessarily available. This is fine for the intended purpose but could be clarified if needed.
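The precedence described above can be sketched as a single resolution function. This is an illustrative sketch, not the code in data_substrate.cpp: the function name and the use of std::optional to represent "not set" are assumptions, and the floor is modeled as a simple maximum with 2048 MiB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: resolves the node memory limit in MiB.
// Order: command-line flag, then config file, then auto-detection.
inline uint32_t ResolveNodeMemoryLimitMb(std::optional<uint32_t> flag_value,
                                         std::optional<uint32_t> config_value,
                                         uint64_t total_ram_mib)
{
    if (flag_value)
    {
        return *flag_value;  // explicitly set on the command line
    }
    if (config_value)
    {
        return *config_value;  // present in the config file
    }
    // Auto-detection: 80% of total RAM, but never below 2048 MiB.
    const uint64_t auto_mib =
        std::max<uint64_t>(total_ram_mib * 4 / 5, uint64_t{2048});
    assert(auto_mib <= UINT32_MAX);
    return static_cast<uint32_t>(auto_mib);
}
```

On a host with 16384 MiB of RAM and nothing configured, this returns 13107; on a 1024 MiB host the floor applies and it returns 2048.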

node_memory_limit now bounds the total memory usage of the process
instead of only the data substrate part. Eloqstore mem usage is
subtracted from the node mem limit.
@liunyl liunyl merged commit 203c8ce into main Jan 9, 2026
4 checks passed
@liunyl liunyl deleted the memory_limit branch January 9, 2026 08:32
2 participants