Context
Two related reader upgrades plus a ledger dedup change, bundled because they all land across packages/reader + packages/ledger and share test infrastructure.
Part A: Incremental file cursor
Today parseClaudeSession in packages/reader/src/claude.ts re-parses every session JSONL file in full on every run. For a user with months of sessions this is the hot path for burn summary, and it creates unnecessary duplicate-append pressure on the ledger.
Plan (pattern cribbed from TokenTracker rollout.js:74-183):
- Persist cursor state in $RELAYBURN_HOME/cursors.json:
  { "files": { "/abs/path/session.jsonl": { "inode": 12345, "offsetBytes": 98765, "mtimeMs": 1700000000000 } } }
- On next run: fstat each session file. If inode unchanged and mtime ≥ stored mtime, seek to offsetBytes and parse only the tail. Otherwise treat as a new file and parse from zero (log rotation / file replacement).
- Tail-safety: only advance offsetBytes past the last complete newline. Never record a position mid-line.
- Concurrency: file-lock cursors.json during write (session read is append-only and safe without one).
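The resume and tail-safety rules above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: FileCursor mirrors one entry of cursors.json, while resumeOffset, splitCompleteLines and readTail are hypothetical names. The extra check that the file has not shrunk below offsetBytes is an assumption that the plan does not state.

```typescript
import * as fs from "node:fs";

// Mirrors one entry under "files" in cursors.json.
interface FileCursor {
  inode: number;
  offsetBytes: number;
  mtimeMs: number;
}

// Decide where to start reading. 0 means "parse the whole file".
function resumeOffset(
  stat: { ino: number; mtimeMs: number; size: number },
  prev?: FileCursor,
): number {
  if (!prev) return 0; // never seen before
  if (stat.ino !== prev.inode) return 0; // rotated or replaced
  if (stat.mtimeMs < prev.mtimeMs) return 0; // mtime went backwards
  if (stat.size < prev.offsetBytes) return 0; // ASSUMPTION: truncation forces a full parse
  return prev.offsetBytes;
}

// Split a chunk read from `start` into complete lines. The returned offset
// sits just past the last newline, so a half-written final line is left
// for the next run.
function splitCompleteLines(
  chunk: Buffer,
  start: number,
): { lines: string[]; nextOffset: number } {
  const lastNewline = chunk.lastIndexOf(0x0a);
  if (lastNewline === -1) return { lines: [], nextOffset: start };
  const text = chunk.subarray(0, lastNewline).toString("utf8");
  const lines = text.split("\n").filter((line) => line.length > 0);
  return { lines, nextOffset: start + lastNewline + 1 };
}

// Read everything from `start` to the current end of file, then split it.
function readTail(
  path: string,
  start: number,
): { lines: string[]; nextOffset: number } {
  const fd = fs.openSync(path, "r");
  try {
    const length = Math.max(0, fs.fstatSync(fd).size - start);
    const buf = Buffer.alloc(length);
    let read = 0;
    while (read < length) {
      const n = fs.readSync(fd, buf, read, length - read, start + read);
      if (n === 0) break; // file shrank while reading
      read += n;
    }
    return splitCompleteLines(buf.subarray(0, read), start);
  } finally {
    fs.closeSync(fd);
  }
}
```

Cutting at the last 0x0a byte is safe for UTF-8 input because that byte never occurs inside a multi-byte sequence. Keeping resumeOffset and splitCompleteLines free of I/O would let the rotation and mid-line cases be unit-tested without fixture files.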
Part B: Git-canonical project key
Today TurnRecord.project is set to cwd at claude.ts:141. That means /Users/will/Projects/burn and /Users/will/burn-worktree-2 — the same repo — roll up separately, and nothing rolls up across machines.
Plan (pattern from TokenTracker rollout.js:1608-1630):
- Add a small helper: resolveProject(cwd): { project: string, projectKey?: string }.
- Walk up from cwd looking for .git/config. Parse [remote "origin"] url. Canonicalize to host/owner/repo (strip .git, strip git@host:, normalize https://host/ to host/).
- Keep project: cwd for backward compatibility; add projectKey as the rollup key.
- Queries group by projectKey when present, fall back to project.
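One possible shape for the helper, as a sketch under stated assumptions. resolveProject and its return type come from the plan. canonicalizeRemote, originUrlFromConfig and rollupKey are hypothetical names. Lowercasing the host and accepting ssh:// URLs are assumptions that go beyond the canonicalization rules listed above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Canonicalize a git remote URL to host/owner/repo, or undefined if unrecognized.
function canonicalizeRemote(url: string): string | undefined {
  const cleaned = url.trim().replace(/\/+$/, "").replace(/\.git$/, "");
  // scp-like form: git@host:owner/repo
  const scp = /^[^@\/\s]+@([^:\/\s]+):(.+)$/.exec(cleaned);
  if (scp) return `${scp[1].toLowerCase()}/${scp[2].replace(/^\/+/, "")}`;
  // URL form: scheme://[user@]host[:port]/owner/repo
  const withScheme =
    /^[a-z][a-z0-9+.-]*:\/\/(?:[^@\/\s]+@)?([^\/:\s]+)(?::\d+)?\/(.+)$/i.exec(cleaned);
  if (withScheme) return `${withScheme[1].toLowerCase()}/${withScheme[2]}`;
  return undefined;
}

// Pull the url value out of the [remote "origin"] section of a git config file.
function originUrlFromConfig(configText: string): string | undefined {
  let inOrigin = false;
  for (const raw of configText.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("[")) {
      inOrigin = /^\[remote\s+"origin"\]$/.test(line);
      continue;
    }
    if (!inOrigin) continue;
    const match = /^url\s*=\s*(.+)$/.exec(line);
    if (match) return match[1].trim();
  }
  return undefined;
}

// Walk up from cwd to the first directory containing .git/config.
function resolveProject(cwd: string): { project: string; projectKey?: string } {
  let dir = path.resolve(cwd);
  for (;;) {
    const configPath = path.join(dir, ".git", "config");
    if (fs.existsSync(configPath)) {
      const url = originUrlFromConfig(fs.readFileSync(configPath, "utf8"));
      const projectKey = url ? canonicalizeRemote(url) : undefined;
      return projectKey ? { project: cwd, projectKey } : { project: cwd };
    }
    const parent = path.dirname(dir);
    if (parent === dir) return { project: cwd }; // reached the filesystem root
    dir = parent;
  }
}

// Rollup key used by queries: projectKey when present, else project.
function rollupKey(rec: { project: string; projectKey?: string }): string {
  return rec.projectKey ?? rec.project;
}
```

A checkout created with git worktree add has a .git file rather than a .git directory, so this sketch would not find .git/config there and would keep walking upward. Whether to follow the gitdir pointer in that file is left open here.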
Part C: Ledger idempotency
While we're in the ledger: dedup by (source, sessionId, messageId) hash on append. Prevents double-counting when a session is re-parsed (which will still happen anytime a file is rewritten — e.g. Claude Code's session save cadence).
- Maintain a secondary sidecar index at $RELAYBURN_HOME/ledger.idx: a simple newline-delimited list of hashes, or a Bloom filter if memory becomes a concern.
- On appendTurns, skip any turn whose hash is already indexed.
- Expose a ledger.rebuildIndex() for recovery.
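The dedup step might look something like the sketch below. The (source, sessionId, messageId) tuple comes from the plan. The choice of SHA-256, the JSON-array encoding of the tuple, and the names TurnIdentity, turnHash, filterNewTurns and parseIndex are assumptions for illustration only.

```typescript
import { createHash } from "node:crypto";

// The three fields that identify a turn for dedup purposes.
interface TurnIdentity {
  source: string;
  sessionId: string;
  messageId: string;
}

// Hash the identity tuple. Encoding it as a JSON array keeps field
// boundaries unambiguous, so ("a", "bc") and ("ab", "c") hash differently.
function turnHash(t: TurnIdentity): string {
  return createHash("sha256")
    .update(JSON.stringify([t.source, t.sessionId, t.messageId]))
    .digest("hex");
}

// Load a newline-delimited index (the ledger.idx format) into a set.
function parseIndex(text: string): Set<string> {
  return new Set(text.split("\n").filter((line) => line.length > 0));
}

// Return only turns whose hash is not yet in `seen`, and record the new
// hashes so duplicates inside the same batch are dropped as well.
function filterNewTurns<T extends TurnIdentity>(turns: T[], seen: Set<string>): T[] {
  const fresh: T[] = [];
  for (const turn of turns) {
    const hash = turnHash(turn);
    if (seen.has(hash)) continue;
    seen.add(hash);
    fresh.push(turn);
  }
  return fresh;
}
```

In this shape appendTurns would write only the result of filterNewTurns to the ledger and append the corresponding hashes to ledger.idx. A rebuild could recompute turnHash for every ledger entry and rewrite the index from scratch.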
Acceptance
- Second burn summary run over the same data is ≥ 10× faster than the first (measured on a fixture with ≥ 100k turns).
- A session from /Users/will/Projects/burn and a copy at /tmp/burn-worktree-2 roll up under the same projectKey in burn summary.
- Repeated parse of the same session file produces zero duplicate ledger entries. Verified by a test that parses the same fixture twice and asserts ledger byte-length is unchanged on the second pass.
- Log rotation (inode change) correctly triggers full re-parse of the new file.
Depends on
Nothing. Can land in parallel with #1.