feat: metrics for message pool internal maps #7133

Merged
hanabi1224 merged 3 commits into main from hm/metrics-for-mpool-internal-maps on Jun 3, 2026

@hanabi1224 hanabi1224 commented Jun 3, 2026

Summary of changes

Changes introduced in this pull request:

# HELP mpool_pending_size_bytes Allocation size of message pool pending messages in bytes
# TYPE mpool_pending_size_bytes gauge
# UNIT mpool_pending_size_bytes bytes
mpool_pending_size_bytes 2464
# HELP mpool_pending_len Length of the message pool pending messages
# TYPE mpool_pending_len gauge
mpool_pending_len 14
# HELP mpool_pending_cap Capacity of the message pool pending messages
# TYPE mpool_pending_cap gauge
mpool_pending_cap 14

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • New Features

    • Added monitoring metrics for the message-pool pending queue, exposing pending size, entry count, and capacity via Prometheus.
  • Improvements

    • Reduced memory usage by trimming internal map capacities automatically after reorganizations and during routine processing to lower retained memory.

coderabbitai Bot commented Jun 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cb78a0d6-be39-47a1-8d82-a218c2621435

📥 Commits

Reviewing files that changed from the base of the PR and between 660d08c and 9c5c69d.

📒 Files selected for processing (2)
  • src/chain_sync/chain_follower.rs
  • src/message_pool/msgpool/pending_store.rs
✅ Files skipped from review due to trivial changes (1)
  • src/chain_sync/chain_follower.rs

Walkthrough

Adds shrink-to-fit memory trimming and a Prometheus collector for the message-pool pending map, and switches the map implementation from ahash::HashMap to hashbrown::HashMap. Also trims map and set capacity in chain follower hot spots after updates.

Changes

Pending Pool Memory Management and Metrics

| Layer / File(s) | Summary |
| --- | --- |
| PendingStore API and memory management (src/message_pool/msgpool/pending_store.rs) | Replaces ahash::HashMap with hashbrown::HashMap and adds pub(in crate::message_pool) fn shrink_to_fit(&self), which acquires the pending map write lock and calls shrink_to_fit() on the underlying map. |
| Metrics collection infrastructure (src/message_pool/msgpool/pending_store.rs) | Adds InnerMetricsCollector implementing prometheus_client::collector::Collector and registers it during PendingStore::new. The collector reads the pending map and encodes three gauges: mpool_pending_size (exposed as mpool_pending_size_bytes once the bytes unit suffix is appended), mpool_pending_len, and mpool_pending_cap. |
| Reorg flow integration (src/message_pool/msgpool/reorg.rs) | Calls self.pending.shrink_to_fit() at the end of apply_head_change to trim the pending pool capacity after reorg processing. |
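
To make the shrink-to-fit row above concrete, here is a minimal sketch of the pattern under stated assumptions. It uses std::collections::HashMap and std::sync::RwLock as stand-ins, not the hashbrown map and lock type used in the PR, and PendingSketch, its key/value types, and its methods are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the pending store: a shared, lock-protected map.
#[derive(Clone, Default)]
pub struct PendingSketch {
    inner: Arc<RwLock<HashMap<u64, Vec<u8>>>>,
}

impl PendingSketch {
    /// Inserts an entry; the map may grow its table to make room.
    pub fn insert(&self, key: u64, value: Vec<u8>) {
        self.inner.write().unwrap().insert(key, value);
    }

    /// Removes an entry; the table keeps its capacity after removal.
    pub fn remove(&self, key: u64) -> Option<Vec<u8>> {
        self.inner.write().unwrap().remove(&key)
    }

    /// Takes the write lock and releases spare capacity held by the map.
    pub fn shrink_to_fit(&self) {
        self.inner.write().unwrap().shrink_to_fit();
    }

    /// Number of entries currently stored.
    pub fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }

    /// Number of entries the map can hold without reallocating.
    pub fn capacity(&self) -> usize {
        self.inner.read().unwrap().capacity()
    }
}
```

Because removals do not shrink the table on their own, capacity can stay high long after the entry count drops; calling shrink_to_fit at a quiet point (the PR does so at the end of apply_head_change) is one way to hand that memory back.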

Chain follower capacity trims

| Layer / File(s) | Summary |
| --- | --- |
| Trim tasks_set and tipsets capacity (src/chain_sync/chain_follower.rs) | Calls tasks_set.shrink_to_fit() after spawning/processing sync tasks in the update loop, and calls self.tipsets.shrink_to_fit() after inserting a merged FullTipset to reduce retained map capacity. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • ChainSafe/forest#6965: Modifies the PendingStore structure; this PR builds on that refactor by adding shrink and metrics behavior.
  • ChainSafe/forest#7131: Also touches chain_follower sizing/metrics; related to the capacity-trimming changes in this PR.
  • ChainSafe/forest#7033: Changes MessagePool::apply_head_change and reorg logic; related to the reorg integration that now calls pending.shrink_to_fit().

Suggested reviewers

  • sudo-shashank
  • LesnyRumcajs
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat: metrics for message pool internal maps' accurately describes the main change: adding Prometheus metrics for the message pool's pending map. It is specific, concise, and directly reflects the primary objective shown in the PR description and code changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@hanabi1224 hanabi1224 marked this pull request as ready for review June 3, 2026 05:06
@hanabi1224 hanabi1224 requested a review from a team as a code owner June 3, 2026 05:06
@hanabi1224 hanabi1224 requested review from akaladarshi and sudo-shashank and removed request for a team June 3, 2026 05:06

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/message_pool/msgpool/pending_store.rs (1)

146-193: 💤 Low value

Optional: collapse the three near-identical gauge blocks into a helper.

The size/len/cap blocks differ only by name, help, unit, and value. A small local closure removes the copy-paste (and the stale size_metric_encoder name reused in the len/cap blocks).

♻️ Sketch
let mut encode_gauge =
    |name: &str, help: &str, unit: Option<&Unit>, value: i64| -> Result<(), std::fmt::Error> {
        let g: Gauge = Default::default();
        g.set(value);
        let e = encoder.encode_descriptor(name, help, unit, g.metric_type())?;
        g.encode(e)
    };

let pending = self.pending.read();
encode_gauge("mpool_pending_size", "...", Some(&Unit::Bytes), pending.allocation_size() as i64)?;
encode_gauge("mpool_pending_len", "...", None, pending.len() as i64)?;
encode_gauge("mpool_pending_cap", "...", None, pending.capacity() as i64)?;

Reading the lock once also gives a consistent snapshot across the three gauges.
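
The consistent-snapshot point can be illustrated separately from the Prometheus encoding. The following is a minimal sketch that assumes a std::sync::RwLock around a std::collections::HashMap; MapStats and snapshot are hypothetical names, not part of the PR.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Values read from the map under a single read guard.
#[derive(Debug, PartialEq, Eq)]
pub struct MapStats {
    pub len: usize,
    pub capacity: usize,
}

/// Takes the read lock once, so `len` and `capacity` describe the same state.
/// Acquiring the lock separately for each value would let a writer slip in
/// between the two reads.
pub fn snapshot<K, V>(map: &RwLock<HashMap<K, V>>) -> MapStats {
    let guard = map.read().unwrap();
    MapStats {
        len: guard.len(),
        capacity: guard.capacity(),
    }
}
```

With values collected this way, the reported length can never exceed the reported capacity, which is not guaranteed when each gauge takes its own lock.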

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/message_pool/msgpool/pending_store.rs` around lines 146 - 193, The three
nearly identical Gauge blocks inside InnerMetricsCollector::encode should be
collapsed into a small helper/closure to avoid repetition and the stale variable
name; read the pending lock once into a local (e.g., let pending =
self.pending.read()) to get a consistent snapshot, then create a closure like
encode_gauge(name, help, unit, value) that creates a Gauge, sets it, calls
encoder.encode_descriptor(...) and then encodes the gauge; call this helper for
"mpool_pending_size" (Some(&Unit::Bytes), pending.allocation_size() as i64),
"mpool_pending_len" (None, pending.len() as i64) and "mpool_pending_cap" (None,
pending.capacity() as i64) within InnerMetricsCollector::encode.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/message_pool/msgpool/pending_store.rs`:
- Around line 154-160: The help text for "mpool_pending_size" is misleading.
size_in_bytes is computed from hashbrown::HashMap::allocation_size(), which
measures only the map's table/bucket allocation and excludes heap memory held
by the MsgSet values. Update the help string passed to
encoder.encode_descriptor for "mpool_pending_size" to state that it measures
the hash map's internal table/bucket allocation, not the total heap used by
pending messages, so dashboards and alerts are not misled. Keep the metric name
and unit unchanged.
- Around line 44-53: PendingStore::new unconditionally calls
crate::metrics::register_collector(Box::new(InnerMetricsCollector(inner.shallow_clone()))),
which registers duplicate metrics when MessagePool::new runs more than once.
Register the collector only once per process. Options: wrap register_collector
in a check or once-guard; move registration out of PendingStore::new into a
single initialization path (the call site for MessagePool::new); or make
InnerMetricsCollector register itself idempotently by checking the registry for
an existing collector name first. Update references to PendingStore::new,
InnerMetricsCollector, register_collector and MessagePool::new accordingly.
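
As an illustration of the once-guard option in the second comment, here is a minimal sketch built on std::sync::Once. REGISTRY is a hypothetical stand-in for the process-wide metrics registry, and register_pending_collector_once and registered_count are illustrative names; none of them are Forest or prometheus_client APIs.

```rust
use std::sync::{Mutex, Once};

/// Hypothetical stand-in for a process-wide metrics registry.
static REGISTRY: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

/// Guard ensuring the registration closure runs at most once per process.
static REGISTER_PENDING_COLLECTOR: Once = Once::new();

/// Safe to call from every constructor; only the first call registers.
pub fn register_pending_collector_once() {
    REGISTER_PENDING_COLLECTOR.call_once(|| {
        REGISTRY.lock().unwrap().push("mpool_pending");
    });
}

/// Number of collectors currently registered in the stand-in registry.
pub fn registered_count() -> usize {
    REGISTRY.lock().unwrap().len()
}
```

One trade-off with this approach: a collector registered this way would observe only the store that exists at the first call, so any later PendingStore instances would go unreported. Whether that is acceptable depends on how many message pools a process is expected to create.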

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c60327c3-583c-4ecf-b24d-1a96a604280d

📥 Commits

Reviewing files that changed from the base of the PR and between 39a6969 and 660d08c.

📒 Files selected for processing (2)
  • src/message_pool/msgpool/pending_store.rs
  • src/message_pool/msgpool/reorg.rs

Comment thread src/message_pool/msgpool/pending_store.rs
Comment thread src/message_pool/msgpool/pending_store.rs

codecov Bot commented Jun 3, 2026

Codecov Report

❌ Patch coverage is 24.48980% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.30%. Comparing base (39a6969) to head (9c5c69d).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/message_pool/msgpool/pending_store.rs | 21.73% | 36 Missing ⚠️ |
| src/chain_sync/chain_follower.rs | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/message_pool/msgpool/reorg.rs | 65.82% <100.00%> (+0.43%) ⬆️ |
| src/chain_sync/chain_follower.rs | 33.41% <50.00%> (+0.04%) ⬆️ |
| src/message_pool/msgpool/pending_store.rs | 82.85% <21.73%> (-14.24%) ⬇️ |

... and 14 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 39a6969...9c5c69d.

@hanabi1224 hanabi1224 added this pull request to the merge queue Jun 3, 2026
Merged via the queue into main with commit b9220b9 Jun 3, 2026
99 of 102 checks passed
@hanabi1224 hanabi1224 deleted the hm/metrics-for-mpool-internal-maps branch June 3, 2026 08:03