perf: read verbs skip ingest on read; index bounded ts windows (all reads <100ms) #479

Merged
willwashburn merged 1 commit into main from perf/summary-hotspots-sub-100ms
Jun 17, 2026

@willwashburn willwashburn commented Jun 17, 2026

Summary

Every burn read command now returns in under 100ms on a large ledger. Two independent costs were dominating, both addressed here.

1. Read verbs ran a synchronous full-store ingest sweep

summary and hotspots called ingest_all before querying. That sweep re-stats every session file under every harness store (~44k+ OpenCode session files on the test ledger) on each invocation — seconds of wall-clock, ~99% of the command's time.

Fix: both verbs are read-only by default with an opt-in --ingest flag, matching the existing burn compare precedent. Freshness stays the job of burn ingest --watch (FS-event driven) or the Claude Stop hook. When the sweep is skipped, an empty IngestReport keeps the banner / JSON ingest block byte-identical to a no-op sweep.
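
The gating pattern described above can be sketched roughly as follows. This is an illustration in Python rather than the CLI's actual Rust code: `run_read_verb`, the `sweep` callable, and the `IngestReport` fields are hypothetical names chosen for the example, since the real report's shape is not shown in this PR.

```python
from dataclasses import asdict, dataclass


@dataclass
class IngestReport:
    # Hypothetical fields; the real IngestReport is not shown in this PR.
    files_scanned: int = 0
    turns_added: int = 0


def run_read_verb(sweep, ingest=False):
    """Run a read verb; only call the expensive sweep when --ingest is passed."""
    # Read-only by default: a skipped sweep yields an empty report, so the
    # "ingest" block has the same shape as a sweep that found nothing new.
    report = sweep() if ingest else IngestReport()
    return {"ingest": asdict(report), "rows": []}
```

Because the empty report is built from the same type as a real one, the output block stays identical whether the sweep was skipped or ran and found nothing.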

2. Windowed turn queries full-scanned the table

query_turns runs SELECT record_json FROM turns WHERE ts >= ? ORDER BY rowid. The ledger has idx_turns_ts, but ORDER BY rowid made the planner pick a full SCAN turns — reading all ~97k rows to return the ~1,123 in a 24h window (EXPLAIN QUERY PLAN confirmed).

Fix: force INDEXED BY idx_turns_ts for a bounded ts window when no session_id filter is present (new TableFilters.prefer_ts_index, set only by the full-window query_turns; the session-IN path rides idx_turns_session and opts out, as do non-turns tables). The matched set is sorted by rowid in memory, so output is byte-identical (same rows, same order).
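
A small, self-contained way to reproduce the idea with Python's built-in `sqlite3` module is sketched below. The table layout (an integer `ts`, three columns) and the sample data are assumptions made for the demo, not the ledger's real schema; only the table name, the index name, and the shape of the query come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for illustration; the real turns table has more columns.
conn.execute("CREATE TABLE turns (ts INTEGER, session_id TEXT, record_json TEXT)")
conn.execute("CREATE INDEX idx_turns_ts ON turns (ts)")
conn.executemany(
    "INSERT INTO turns (ts, session_id, record_json) VALUES (?, ?, ?)",
    # Scramble ts so that rowid order and ts order differ.
    [((i * 37) % 1000, f"s{i % 5}", f'{{"n": {i}}}') for i in range(1000)],
)

BASE = "SELECT rowid, record_json FROM turns WHERE ts >= ? ORDER BY rowid"
HINTED = "SELECT rowid, record_json FROM turns INDEXED BY idx_turns_ts WHERE ts >= ?"


def plan(sql, cutoff):
    """Return the EXPLAIN QUERY PLAN detail column(s) as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (cutoff,)).fetchall()
    return " | ".join(row[3] for row in rows)


def query_hinted(cutoff):
    """Seek via the ts index, then restore rowid order in memory."""
    rows = conn.execute(HINTED, (cutoff,)).fetchall()
    rows.sort(key=lambda row: row[0])
    return rows


cutoff = 990
baseline = conn.execute(BASE, (cutoff,)).fetchall()
hinted = query_hinted(cutoff)
print(plan(BASE, cutoff))
print(plan(HINTED, cutoff))
print(baseline == hinted)
```

On a table this small the planner's unhinted choice may differ from what was observed on the ~97k-row ledger, so the printed baseline plan is informational only. The point of the sketch is that the hinted query names `idx_turns_ts` in its plan and, after the in-memory sort, returns the same rows in the same order.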

Results

Median, release binary, live ~614MB burn.sqlite, --since 24h:

| Command | Before | After |
|---|---|---|
| hotspots | 3088 ms | 76 ms |
| summary | 3298 ms | 62 ms |
| compare | 50 ms | 16 ms |
| overhead | 49 ms | 13 ms |
| state status | 62 ms | 65 ms |
| sessions / stamps / flow | ~4–5 ms | ~4–5 ms |

Correctness

  • All BURN_GOLDEN command snapshots still pass byte-for-byte (summary, hotspots, overhead, compare, flow) — the index change only alters the access path.
  • New tests: 4 smoke tests (summary/hotspots ingest gate) + reader.rs::build_select_sql_tests (index-hint conditions).
  • cargo fmt/clippy -D warnings clean; full workspace suite green.
  • Pre-existing, unrelated drift noted: the BURN_GOLDEN-gated state/state-json snapshots pin schema version: 5 while SCHEMA_VERSION is 6 — worth a separate snapshot refresh.

Tooling

Adds scripts/bench-summary.py (pnpm run bench:summary) — times the release binary end-to-end against a live ledger and exits non-zero over a budget, so it doubles as a regression gate.
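
The "exit non-zero over a budget" behaviour amounts to comparing a median against a threshold. The sketch below is not the contents of `scripts/bench-summary.py`; the function name `gate` and the 100 ms budget (taken from the PR title) are assumptions for illustration.

```python
import statistics


def gate(samples_ms, budget_ms=100.0):
    """Return a process exit code: 0 if the median is within budget, else 1."""
    if not samples_ms:
        # No samples means nothing was measured; treat that as a failure.
        return 1
    median = statistics.median(samples_ms)
    return 0 if median <= budget_ms else 1
```

A caller would pass the result to `sys.exit(...)`, so that a CI step or a `pnpm run` script fails when the median regresses past the budget.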

Behavior change to flag

summary/hotspots no longer auto-freshen on read. Consumers that relied on that (e.g. polling summary --json) should run burn ingest --watch or pass --ingest.

🤖 Generated with Claude Code

Two changes that put every `burn` read command under 100ms on a large
ledger (measured on a ~614MB burn.sqlite, `--since 24h`):

1. `summary` and `hotspots` no longer run a synchronous full-store
   `ingest_all` sweep before querying. That sweep re-stats every session
   file under every harness store (~44k+ OpenCode session files here),
   costing seconds. Both verbs now take an opt-in `--ingest` flag and are
   read-only by default, matching the existing `burn compare` precedent.
   Freshness stays the job of `burn ingest --watch` / the Claude Stop hook.

2. `query_turns` forces `idx_turns_ts` for a bounded `ts` window with no
   `session_id` filter. The statement ends in `ORDER BY rowid`, which made
   SQLite prefer a full `SCAN` of all ~97k rows even for a 24h window; the
   index seek + an in-memory rowid sort of the small matched set is far
   cheaper. Output is byte-identical (same rows, same order) — gated behind
   the new `TableFilters.prefer_ts_index`, set only by the full-window
   `query_turns` (the session-IN path and non-`turns` tables opt out).

Results (median): hotspots 3088->76ms, summary 3298->62ms,
compare 50->16ms, overhead 49->13ms. sessions/stamps/flow already ~4-5ms.
All command golden snapshots remain byte-identical.

Adds `scripts/bench-summary.py` (`pnpm run bench:summary`) as a timing
gate, and unit/smoke tests for both changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Warning

Review limit reached

@willwashburn, we couldn't start this review because you've reached your PR review rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 2a745c0 and bae2989.

📒 Files selected for processing (7)
  • CHANGELOG.md
  • crates/relayburn-cli/src/commands/hotspots.rs
  • crates/relayburn-cli/src/commands/summary.rs
  • crates/relayburn-cli/tests/smoke.rs
  • crates/relayburn-sdk/src/ledger/reader.rs
  • package.json
  • scripts/bench-summary.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bae2989ae9

Comment thread scripts/bench-summary.py
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
env=env,
check=False,

P2: Fail the benchmark on command errors

When the measured burn subprocess exits non-zero (for example with a corrupt ledger, invalid --summary-args, or a CLI regression), check=False still records the elapsed time while stdout/stderr are discarded, so a fast failure can be included in the samples and make the budget gate report PASS. Check the return code (or use check=True) before accepting a sample.
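
One way to apply this suggestion is sketched below. The helper name `timed_sample` is hypothetical, and this is not the code that was actually committed; it only shows the return-code check placed before a sample is accepted.

```python
import subprocess
import sys
import time


def timed_sample(cmd, env=None):
    """Time one run of cmd in milliseconds, rejecting runs that exit non-zero."""
    start = time.perf_counter()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        env=env,
        check=False,
    )
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if proc.returncode != 0:
        # A fast failure must not count as a fast sample.
        raise RuntimeError(f"benchmark command exited {proc.returncode}: {cmd!r}")
    return elapsed_ms


if __name__ == "__main__":
    # Stand-in command; the real script times the burn release binary.
    print(timed_sample([sys.executable, "-c", "pass"]))
```

Passing `check=True` to `subprocess.run` would have a similar effect by raising `subprocess.CalledProcessError`; the explicit check is used here so the error message can name the command.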

@willwashburn willwashburn merged commit b72a3ec into main Jun 17, 2026
12 checks passed
@willwashburn willwashburn deleted the perf/summary-hotspots-sub-100ms branch June 17, 2026 19:49
willwashburn added a commit that referenced this pull request Jun 17, 2026
Two causes, both fixed:

1. Stale ledger. `burn summary` is read-only now (#479) and the background
   `ingest --watch` doesn't keep up (manual ingest pulled in 3290 backlogged
   turns), so the polled totals didn't move. Replace the watch with a one-shot
   incremental `burn ingest` at the top of each poll (BurnLedger.ingest()) — a
   warm sweep is ~1–3s, so the poll cadence is now 2.5s.

2. Negative/clamped rate. The rate was the delta between two trailing-5-min
   window totals, which dips negative as old turns age out of the window and was
   clamped to 0 — reading as "no usage". Replace with a moving average: each poll
   queries the trailing 60s and divides, which is non-negative and robust to the
   window sliding and to late ingests. The cumulative line is now the integral of
   that rate (a monotonic session total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
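
The difference between the two rate calculations in this commit message can be illustrated with a short sketch. The function names, the `(ts_seconds, tokens)` tuple layout, and the sample numbers are all assumptions for the example; the monitor itself is a macOS app and its real code is not shown here.

```python
def trailing_total(turns, now, window_s):
    """Sum usage for turns whose timestamp falls in (now - window_s, now]."""
    return sum(tokens for ts, tokens in turns if now - window_s < ts <= now)


def delta_rate(turns, prev_now, now, window_s=300.0):
    """Old approach: difference of two trailing-window totals, clamped at 0."""
    delta = trailing_total(turns, now, window_s) - trailing_total(turns, prev_now, window_s)
    raw = delta / (now - prev_now)
    return raw, max(0.0, raw)


def moving_average_rate(turns, now, window_s=60.0):
    """New approach: usage in the trailing window divided by its length."""
    return trailing_total(turns, now, window_s) / window_s


def cumulative(rates, poll_s=2.5):
    """Integrate per-poll rates into a running, non-decreasing total."""
    total, out = 0.0, []
    for rate in rates:
        total += rate * poll_s
        out.append(total)
    return out


# A large turn at t=0 ages out of the 5-minute window between two polls.
turns = [(0.0, 3000), (290.0, 600)]
raw, clamped = delta_rate(turns, prev_now=299.0, now=301.5)
print(raw, clamped)                       # raw is negative, clamped reads as 0
print(moving_average_rate(turns, 301.5))  # still reflects the recent turn
```

With non-negative usage values the moving average cannot go below zero, which is why integrating it gives a monotonic cumulative line.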
willwashburn added a commit that referenced this pull request Jun 18, 2026
…#482)

* macOS: Settings cog (appearance + quit), move tab toggle into the top bar

- Move the Usage/Live segmented toggle up into the header row, next to the
  Codex/Claude provider toggle.
- Add a gear (settings) button on the right of that bar. It toggles a Settings
  tab with:
  - Appearance — Light / Dark / System (follow system), persisted in the
    "appearance" UserDefault and applied to the whole popover (chrome + content)
    by AppDelegate via popover.appearance, live as it changes.
  - Quit Burn (the global ⌘Q still works too).
- Picking Usage/Live also dismisses Settings.

This re-lands the settings work that never reached main (it was committed to the
live-burn branch after #480 merged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fix live burn monitor showing no usage

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>