Add --bucket time-series to summary and compare#483
Conversation
`--bucket <DURATION>` turns the `--since` window into a fixed-width
time-series: instead of one total, emit the verb's report per bucket as
`{ bucketSeconds, buckets: [{ start, end, ...report... }] }`.
- Shared primitive in query_verbs/mod.rs: parse_bucket (grammar 30s/5m/1h/
12h/1d/7d — note `m` = minutes here, vs `m` = month for --since, since the
callers want minute buckets), iso_z_to_epoch_secs, and a Buckets partition
(anchored on the normalized --since, or the earliest turn; MAX_BUCKETS guard).
- summary: LedgerHandle::summary_timeseries — turn-level partition (the grouped
model/provider/tag fold is pure per-turn), rejected for the by-tool/subagent
attribution modes.
- compare: LedgerHandle::compare_timeseries — turn-level per-bucket compare.
- CLI: --bucket on `summary` and `compare`; JSON emits the series, human output
prints a per-bucket line.
- Tests: parser grammar, ISO<->epoch round-trip, Buckets edges/indices,
summary_timeseries placement reconciling to the un-bucketed total, and
attribution-mode rejection.
hotspots/overhead deferred: their aggregation isn't a pure per-turn fold
(cross-turn cost attribution; window-amortized per-session denominators), so
they need session-granular bucketing — a focused follow-up. The flag is simply
absent there rather than shipping incorrect attribution.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 41 minutes and 44 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughAdds a ChangesPer-bucket time-series for
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ccb914e18
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| pub(crate) fn clamp_bucket_anchor(anchor: i64, end: i64, bucket_secs: u64) -> i64 { | ||
| let max_span = MAX_BUCKETS.saturating_mul(bucket_secs.max(1) as i64); | ||
| if end.saturating_sub(anchor) > max_span { | ||
| end - max_span |
There was a problem hiding this comment.
Reject bucket ranges instead of truncating data
For any requested range that would produce more than 10,000 buckets, this advances the bucket anchor to end - max_span even though the ledger query has already loaded the full --since window. In a common case like burn summary --since 24h --bucket 1s (and compare, which uses the same helper), turns from the first ~21 hours are silently dropped from the time-series and the bucket totals no longer reconcile with the unbucketed result. Please fail fast or otherwise preserve the full requested range instead of changing the lower bound.
Useful? React with 👍 / 👎.
| let handle = Ledger::open(ledger_opts)?; | ||
|
|
||
| // `--bucket` switches to a per-bucket time-series via the SDK verb. | ||
| if let Some(bucket_raw) = args.bucket.as_deref() { |
There was a problem hiding this comment.
Honor --csv when bucketed compare is requested
When --bucket is present, this branch returns before reaching the existing CSV renderer and never checks args.csv. As a result, burn compare modelA,modelB --bucket 1h --csv exits successfully but prints the human bucket summary lines, which breaks callers that requested CSV output. Please either reject --csv with --bucket or render a CSV time-series.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request adds a new --bucket <DURATION> option to the burn summary and burn compare commands, enabling users to emit a per-bucket time-series across a specified window. The implementation includes CLI argument updates, time-bucketing logic in the SDK, and corresponding tests. Feedback on the changes highlights a potential runtime panic in bucket_secs_from_str when slicing a string by byte indices if it ends with a multi-byte UTF-8 character; using chars.next_back() and chars.as_str() is recommended for a safer and more idiomatic solution.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| fn bucket_secs_from_str(s: &str) -> Option<u64> { | ||
| if s.len() < 2 { | ||
| return None; | ||
| } | ||
| let unit = s.as_bytes()[s.len() - 1] as char; | ||
| let num = &s[..s.len() - 1]; | ||
| if num.is_empty() || !num.bytes().all(|b| b.is_ascii_digit()) { | ||
| return None; | ||
| } | ||
| let n: u64 = num.parse().ok()?; | ||
| if n == 0 { | ||
| return None; | ||
| } | ||
| match unit { | ||
| 's' => Some(n), | ||
| 'm' => n.checked_mul(60), | ||
| 'h' => n.checked_mul(3_600), | ||
| 'd' => n.checked_mul(86_400), | ||
| 'w' => n.checked_mul(7 * 86_400), | ||
| _ => None, | ||
| } | ||
| } |
There was a problem hiding this comment.
Slicing the string s using byte indices (&s[..s.len() - 1]) can cause a runtime panic if the last character of the string is a multi-byte UTF-8 character (e.g., full-width Unicode characters), because s.len() - 1 will not lie on a valid UTF-8 character boundary.
Using chars.next_back() and chars.as_str() is a much safer and more idiomatic way to split the string into the numeric prefix and the unit suffix without risking any panics.
fn bucket_secs_from_str(s: &str) -> Option<u64> {
let mut chars = s.chars();
let unit = chars.next_back()?;
let num = chars.as_str();
if num.is_empty() || !num.chars().all(|c| c.is_ascii_digit()) {
return None;
}
let n: u64 = num.parse().ok()?;
if n == 0 {
return None;
}
match unit {
's' => Some(n),
'm' => n.checked_mul(60),
'h' => n.checked_mul(3_600),
'd' => n.checked_mul(86_400),
'w' => n.checked_mul(7 * 86_400),
_ => None,
}
}Replace the real-time moving-average stream with a bucketed range chart. A segmented switch picks the window; each range maps to a burn --bucket size and refresh cadence (5m→30s/3s … 7d→12h/300s). Per provider, query `burn summary --since <iso> --bucket <b> --json` and plot per-bucket rate (tokens/bucketSeconds) + a running cumulative line, keeping the overlaid per-provider lines (Claude coral, Codex blue/purple) and the show/hide toggles. Switching range clears + re-queries immediately and retimes the loop; stale in-flight results are dropped. Background `ingest --watch` still provides freshness. Depends on burn `--bucket` (PR #483): the release build.sh bundles a --bucket-capable burn once #483 is on main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (1)
CHANGELOG.md (1)
7-7: ⚡ Quick winTighten this
[Unreleased]bullet to be impact-first and concise.The entry currently mixes shipped behavior with deep implementation rationale; please trim it to user-visible effect plus key constraints.
Suggested rewrite
-- `--bucket <DURATION>` on `burn summary` and `burn compare`: emit a per-bucket time-series across the `--since` window instead of a single total. JSON is `{ "bucketSeconds": N, "buckets": [ { "start", "end", ...the verb's report... } ] }`. Bucket grammar is the chart-axis form `30s` / `5m` (minutes) / `1h` / `12h` / `1d` / `7d` — note `m` is minutes here, unlike `--since` where `m` is months. Summary buckets are a pure per-turn fold, so per-bucket totals reconcile with the un-bucketed total; only the default grouped (`byModel` / `--by-provider`) summary supports `--bucket`. (`--bucket` on `hotspots` / `overhead` is not yet wired — their cross-turn attribution and window-amortized denominators need session-granular bucketing.) +- `burn summary` and `burn compare` now accept `--bucket <DURATION>` to emit per-bucket time-series output over the `--since` window (`{ bucketSeconds, buckets: [...] }` in JSON). Supported bucket units: `30s`, `5m`, `1h`, `12h`, `1d`, `7d` (`m` = minutes for this flag). `burn summary --bucket` is available only in grouped modes; `hotspots`/`overhead` are unchanged.As per coding guidelines, changelog entries should be concise, impact-first, and avoid internal implementation backstory unless it explains shipped impact.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` at line 7, The CHANGELOG.md entry for the `--bucket` feature is too verbose and includes deep implementation rationale that obscures user-visible impact. Rewrite this bullet point to lead with what users can now do (emit per-bucket time-series data with specified bucket grammar), then briefly mention the key constraint (only works with default grouped summaries, not yet wired for hotspots/overhead). Remove the technical details about cross-turn attribution, window-amortized denominators, and session-granular bucketing, as these are implementation concerns that don't directly affect how users interact with the feature. Keep the JSON structure reference only if it clarifies the shipped interface.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/relayburn-cli/src/commands/compare.rs`:
- Around line 216-228: The code currently falls through to printing
TTY-formatted bucket summaries regardless of whether the csv flag is set in
globals, and it silently returns success when series.buckets is empty without
any output. Add a check after the json flag check to reject or handle the csv
flag appropriately for bucket mode (rejecting it unless CSV timeseries rendering
is added), and add a check for an empty series.buckets collection to print the
same no-data marker used by the summary renderer instead of silently returning
Ok(0).
- Around line 184-193: The bucket parsing error handling in the
`args.bucket.as_deref()` block is inconsistent with other argument validations
because it directly prints to stderr using eprintln and returns Ok(2) instead of
propagating the error through the normal validation path. Move the bucket
parsing logic to occur before Ledger::open is called, and change the error
handling for the relayburn_sdk::parse_bucket call to use the question mark
operator or return Err so that validation errors are routed through the standard
error handling that respects the --json flag and emits the documented error
envelope.
In `@crates/relayburn-cli/src/commands/summary.rs`:
- Around line 293-310: The bucket parsing and validation using
relayburn_sdk::parse_bucket is currently happening after the ledger has been
opened and after optional ingest operations have completed, which allows
unnecessary work to occur before returning argument validation errors. Move the
bucket argument parsing and validation logic to occur before the ledger is
opened and before any ingest operations. Additionally, add validation to reject
the --quality flag if the timeseries response does not include quality data,
ensuring invalid argument combinations are caught early and prevent redundant
processing.
In `@crates/relayburn-sdk/src/query_verbs/compare.rs`:
- Around line 199-201: The `clamp_bucket_anchor` call may advance the `anchor`
parameter, causing turns before the new anchor to be silently excluded from the
comparison without alerting the caller that the requested `--since` window was
adjusted. To fix this, detect when clamping occurs by comparing the original
`anchor` value with the value returned by `clamp_bucket_anchor`, and if they
differ, either log a warning, return an error, or modify the return value of
`compare_timeseries` to explicitly indicate that the comparison is partial and
the actual window differs from what was requested.
In `@crates/relayburn-sdk/src/query_verbs/summary.rs`:
- Around line 613-615: The issue is that clamp_bucket_anchor silently moves the
anchor forward when the requested bucket window exceeds MAX_BUCKETS, causing
subsequent Buckets::index_for calls to drop data before the new anchor and lose
accuracy. Instead of allowing this silent truncation, validate that the
requested bucket window (from anchor to now with bucket_secs) does not exceed
MAX_BUCKETS before calling clamp_bucket_anchor. If the window is too large,
return an error immediately rather than proceeding with the clamped anchor. This
ensures bucket totals accurately represent the requested summary window without
silent data loss.
- Around line 581-583: The summary_timeseries function silently ignores the
include_quality option instead of rejecting it like summary_report does, causing
callers to receive successful responses without quality data. Add a validation
check in summary_timeseries similar to the existing check for group_by_tag with
bucket (shown in the diff) that rejects the request when opts.include_quality is
set with an appropriate error message explaining that quality data is not
supported in timeseries mode.
---
Nitpick comments:
In `@CHANGELOG.md`:
- Line 7: The CHANGELOG.md entry for the `--bucket` feature is too verbose and
includes deep implementation rationale that obscures user-visible impact.
Rewrite this bullet point to lead with what users can now do (emit per-bucket
time-series data with specified bucket grammar), then briefly mention the key
constraint (only works with default grouped summaries, not yet wired for
hotspots/overhead). Remove the technical details about cross-turn attribution,
window-amortized denominators, and session-granular bucketing, as these are
implementation concerns that don't directly affect how users interact with the
feature. Keep the JSON structure reference only if it clarifies the shipped
interface.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 9db6edf9-ebc5-43d0-9141-43c0154a38e5
📒 Files selected for processing (9)
CHANGELOG.mdcrates/relayburn-cli/src/cli.rscrates/relayburn-cli/src/commands/compare.rscrates/relayburn-cli/src/commands/summary.rscrates/relayburn-sdk/src/query_verbs/compare.rscrates/relayburn-sdk/src/query_verbs/mod.rscrates/relayburn-sdk/src/query_verbs/summary.rscrates/relayburn-sdk/src/query_verbs/tests.rspackages/relayburn/CHANGELOG.md
- bucket_secs_from_str: split on the final char, not byte, so a trailing multi-byte UTF-8 unit can't panic on a mid-codepoint slice. - Replace silent clamp_bucket_anchor with ensure_bucket_span: fail fast when a window exceeds MAX_BUCKETS instead of advancing the anchor and dropping already-queried turns (which broke per-bucket reconciliation). Applied in both summary_timeseries and compare_timeseries. - summary_timeseries: reject --quality (was silently ignored). - summary CLI: parse/validate --bucket (mode + --quality) before opening the ledger or running ingest, so bad invocations fail fast. - compare CLI: parse --bucket before ledger open and route errors through the normal Err path (honors --json envelope); reject --bucket with --csv; print the no-data marker for an empty bucket series instead of silent success. - Tests: multi-byte unit rejection, --quality rejection, oversized-window rejection. - Tighten the CHANGELOG bullet to be impact-first. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the real-time moving-average stream with a bucketed range chart. A segmented switch picks the window; each range maps to a burn --bucket size and refresh cadence (5m→30s/3s … 7d→12h/300s). Per provider, query `burn summary --since <iso> --bucket <b> --json` and plot per-bucket rate (tokens/bucketSeconds) + a running cumulative line, keeping the overlaid per-provider lines (Claude coral, Codex blue/purple) and the show/hide toggles. Switching range clears + re-queries immediately and retimes the loop; stale in-flight results are dropped. Background `ingest --watch` still provides freshness. Depends on burn `--bucket` (PR #483): the release build.sh bundles a --bucket-capable burn once #483 is on main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…485) * Live monitor: stop ingesting per poll (3–4s); use the background watch Root of the slow "loading": each live poll ran a one-shot `burn ingest`, which is ~3–5s on a 591MB ledger (even with nothing new). Read-only `burn summary` is only ~10ms. Move freshness back to a continuous background `burn ingest --watch` (FS-event driven, ~1s poll, incremental — verified it catches new turns), and keep the poll path to summary only. Poll cadence back to 1.5s. (Keeps the moving-average rate fix from the prior commit.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Live tab: overlay both providers with toggles; Codex blue/purple - LiveBurnViewModel now polls every provider each cycle and keeps a per-provider sample series; exposes show/hide toggles (won't hide the last one). - LiveBurnView overlays one color-coded line per provider (Claude coral, Codex blue/purple) on both the rate and cumulative charts, with toggle chips that double as the legend. Headline is the combined rate/spend across shown providers. Independent of the Usage tab's single-select provider. - Codex brandColor → blue/purple (#5B6CFF) to match its icon's color scheme, replacing the green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Live tab: time-range switcher (5m/1h/12h/1d/7d) via summary --bucket Replace the real-time moving-average stream with a bucketed range chart. A segmented switch picks the window; each range maps to a burn --bucket size and refresh cadence (5m→30s/3s … 7d→12h/300s). Per provider, query `burn summary --since <iso> --bucket <b> --json` and plot per-bucket rate (tokens/bucketSeconds) + a running cumulative line, keeping the overlaid per-provider lines (Claude coral, Codex blue/purple) and the show/hide toggles. Switching range clears + re-queries immediately and retimes the loop; stale in-flight results are dropped. Background `ingest --watch` still provides freshness. Depends on burn `--bucket` (PR #483): the release build.sh bundles a --bucket-capable burn once #483 is on main. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Live charts: adaptive time (x) axis + softer gridlines Add a "legend for time" — an x-axis whose labels adapt to the selected range (clock time for 5m–1d, calendar date for 7d) on both the rate and cumulative charts. Soften gridlines (0.08→0.06) for a calmer, more readable chart. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Live tab: show range totals (tokens + cost) in the headline For the selected timeframe, the headline now leads with the combined total tokens and total cost across shown providers (e.g. "Last 7d · 2.1B tokens · $1,444"), with the current burn rate kept as secondary context. Totals come from the cumulative line's last point per provider. Add a billions (B) tier to the token formatter for the large multi-day sums. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Live tab: fix stuck spinner on fast range switches setRange cleared the series and kicked a refresh, but refresh's hard `guard !refreshing` dropped the new request while one was in flight, and the in-flight one discarded its result as stale — so nothing repopulated until the next timer tick (up to 5 min on 7d), leaving the warming spinner stuck. Coalesce instead: a refresh requested while one runs sets `refreshAgain`, and the running refresh loops one more pass for the latest range. Publish only when the queried range still matches. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds
--bucket <DURATION>to the read verbs that take--since, turning a single windowed total into a time-series: the verb's report computed per fixed-width time bucket. Built for a live spend chart that needs selectable ranges (last 5m / 1h / 12h / 1d / 7d) sourced from one indexed query rather than N client-side round-trips.JSON shape (both verbs):
{ "bucketSeconds": N, "buckets": [ { "start": ISO, "end": ISO, ...the verb's normal report... } ] }.What's included
query_verbs/mod.rs):parse_bucket(grammar30s/5m/1h/12h/1d/7d),iso_z_to_epoch_secs, and aBucketspartition anchored on the normalized--since(or the earliest turn), with aMAX_BUCKETSguard. Reuses the existing fetch (query_turns+idx_turns_ts),cost_for_turn,load_pricing, andprovider_for.summary_timeseries— turn-level partition (the grouped model/provider/tag path is a pure per-turn fold). Rejected with a clear error for the--by-tool/subagent/relationship attribution modes.compare_timeseries— per-bucket compare table.--bucketonsummaryandcompare; JSON emits the series, human output prints a per-bucket line.Bucketsedges/indices,summary_timeseriesplacement reconciling exactly to the un-bucketed total, attribution-mode rejection.Note:
m= minutes for--bucket--sincetreatsmas month, but month-sized buckets are meaningless and the consumer wants minute buckets ("last 5 minutes"). So--bucketusesm= minutes and addss= seconds. Documented in the flag help and code.Deferred: hotspots / overhead
Both also take
--since, but their aggregation isn't a pure per-turn fold —hotspotsattributes turn N's cost to turn N−1'stool_use(ordering/cross-turn + session-scoped side queries) andoverheadcomputes window-amortized per-session denominators (and does filesystem work). Correct bucketing for these needs session-granular partitioning (assign whole sessions to a bucket by earliestts). Rather than ship incorrect attribution,--bucketis simply absent on those two; it's a clean follow-up.Verification
cargo build --workspace✅ ·cargo test --workspace✅ (all suites pass; 5 new). Smoke vs a real ledger:summary --provider anthropic --since 1h --bucket 5m --json→bucketSeconds=300, 12 buckets, and bucket-sum tokens == the plain--since 1htotal exactly.🤖 Generated with Claude Code