
fix(spartan): point IRM dashboard + alert at renamed checkpoint metrics#22972

Closed
AztecBot wants to merge 1 commit into next from claudebox/3eebfbfc0754c503-3

@AztecBot AztecBot commented May 5, 2026

Collaborator

Summary

The IRM block-height dashboard and the NoNewBlocks alert were left querying rollup_pending_block_number / rollup_proven_block_number after commit 99bb54d (Nov 2025, "Rename block to checkpoint on l1") renamed the exporter's gauges to rollup_pending_checkpoint_number / rollup_proven_checkpoint_number. Combined with noDataState: KeepLast, this caused the alert to silently stick in whatever state it was in at the rename and the panels to flatline — making perfectly healthy networks look halted.

This PR updates the dashboard JSON and the alert rule to use the current metric names, and renames the alert to NoNewCheckpoints to match the new vocabulary.
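The failure mode described above can be illustrated with a toy model. The sketch below is not Grafana's evaluator: the function names, the state strings, and the assumed condition (fire when the pending number has not increased over the window) are all hypothetical. It only shows why a rule that keeps its last state on missing data never changes once its query stops returning series.

```python
from typing import List, Optional


def next_state(prev_state: str, increase: Optional[float],
               no_data_state: str = "KeepLast") -> str:
    """One evaluation step of a simplified, hypothetical alert rule.

    `increase` is the growth of the pending number over the window,
    or None when the query returns no series (e.g. the metric was renamed).
    """
    if increase is None:
        # With KeepLast the previous state is carried forward unchanged.
        return prev_state if no_data_state == "KeepLast" else no_data_state
    # Assumed condition: alert when nothing new was produced in the window.
    return "Alerting" if increase <= 0 else "Normal"


def simulate(increases: List[Optional[float]], initial: str = "Normal",
             no_data_state: str = "KeepLast") -> List[str]:
    """Run the toy rule over a sequence of evaluation results."""
    state = initial
    history = []
    for inc in increases:
        state = next_state(state, inc, no_data_state)
        history.append(state)
    return history


# Healthy before the rename, then the old metric disappears: stays Normal.
print(simulate([3, 2, None, None]))
# Stalled just before the rename: stays Alerting even if the chain recovers.
print(simulate([0, None, None]))
# A rule that surfaces missing data makes the broken query visible.
print(simulate([3, None], no_data_state="NoData"))
```

In this model the outcome depends only on the last evaluation before the rename, which matches the "whatever state it was in" behavior the summary describes.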

Changes

  • spartan/metrics/irm-monitor/grafana/dashboards/block-height-monitor.json: switch all 4 PromQL exprs to rollup_*_checkpoint_number; rename panel titles + legends to "Checkpoint"; bump version 12 → 13.
  • spartan/metrics/irm-monitor/alerting/alert-rules.yml: switch both PromQL exprs to rollup_pending_checkpoint_number; rename refIds, condition, and alertname (NoNewBlocks → NoNewCheckpoints); update title and summary text. The alert UID (deq0l9s3xwetcf) and dashboardUid are preserved so Grafana treats this as an in-place update rather than a new alert.
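The dashboard part of these changes could be scripted roughly as follows. This is a minimal sketch, not the method used in this PR: the helper names and the sample dashboard dict are hypothetical, and the real JSON layout is assumed to keep PromQL in string values. It rewrites the two metric names wherever they appear in string values and bumps the integer version, leaving keys, titles, and legends alone.

```python
import json

# Old -> new gauge names, taken from the PR summary.
RENAMES = {
    "rollup_pending_block_number": "rollup_pending_checkpoint_number",
    "rollup_proven_block_number": "rollup_proven_checkpoint_number",
}


def rename_metrics(node):
    """Return a copy of a JSON-like value with metric names replaced in strings."""
    if isinstance(node, dict):
        return {key: rename_metrics(value) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_metrics(value) for value in node]
    if isinstance(node, str):
        for old, new in RENAMES.items():
            node = node.replace(old, new)
        return node
    return node  # numbers, booleans, None are returned unchanged


def update_dashboard(dashboard: dict) -> dict:
    """Rename metrics and bump the dashboard version if it is an integer."""
    updated = rename_metrics(dashboard)
    if isinstance(updated.get("version"), int):
        updated["version"] += 1
    return updated


# Hypothetical, heavily trimmed dashboard used only for illustration.
sample = {
    "version": 12,
    "panels": [
        {"title": "Pending", "targets": [{"expr": "max(rollup_pending_block_number)"}]},
        {"title": "Proven", "targets": [{"expr": "max(rollup_proven_block_number)"}]},
    ],
}

result = update_dashboard(sample)
print(json.dumps(result, indent=2))
```

Panel titles and legends would still need the manual "Checkpoint" rename listed above, since a blanket text substitution on display strings is easy to get wrong.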

Notes

  • next-net is intentionally not in scope — only testnet-irm and mainnet-irm exporters are deployed, and per Alex it stays that way.
  • After this lands, the alert state will still be whatever KeepLast was holding. Once the renamed rule is reapplied to Grafana Cloud the next evaluation pulls fresh data; if it remains stuck, the alert needs to be manually reset (toggle noDataState to OK for one cycle, or delete-and-reimport).

ClaudeBox log: https://claudebox.work/s/3eebfbfc0754c503?run=3

@AztecBot AztecBot added the ci-draft (Run CI on draft PRs.) and claudebox (Owned by claudebox. it can push to this PR.) labels May 5, 2026
@AztecBot AztecBot closed this May 6, 2026