cli/sdk: inference-flow DAG with dispatch/return edges (#431)#453
Conversation
|
Warning Review limit reached
More reviews will be available in 33 minutes and 56 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (14)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces the burn flow CLI command and underlying SDK APIs (LedgerHandle::turn_span_tree, session_span_trees, and flow_graph) to construct and visualize a session's inference-flow DAG. It implements OTel-style per-turn span trees (TurnSpanTree) and per-session flow graphs (FlowGraph) for both Claude Code and Codex harnesses, rendering them as Mermaid, deterministic SVG, or JSON. The review feedback is highly actionable and focuses on improving cross-platform compatibility and rendering robustness, specifically by enforcing a minimum SVG width to prevent legend clipping, handling file renaming on Windows, and adding a fallback to the USERPROFILE environment variable.
| let (max_x, max_y) = graph.nodes.iter().fold((0_i32, 0_i32), |(mx, my), n| { | ||
| (mx.max(n.x + NODE_WIDTH), my.max(n.y + NODE_HEIGHT)) | ||
| }); | ||
| let width = max_x + SVG_MARGIN * 2; |
There was a problem hiding this comment.
The SVG legend on row 2 contains items that extend up to x = 426 (for the "unattached" label). If the session has few turns (e.g., 3 turns, resulting in a width of 328 as seen in the snapshot), the legend items for "subagent" and "unattached" are clipped and rendered outside the SVG viewBox.
Enforcing a minimum width of 512 pixels ensures the legend is always fully visible.
| let width = max_x + SVG_MARGIN * 2; | |
| let width = (max_x + SVG_MARGIN * 2).max(512); |
There was a problem hiding this comment.
Fixed in 7d48fb2. The SVG width now honors a LEGEND_MIN_WIDTH floor derived from the actual legend layout (SVG_MARGIN + (LEGEND_ROW2_ITEMS - 1) * LEGEND_ITEM_STEP + 18 + LEGEND_LAST_LABEL_RESERVE + SVG_MARGIN = 530 user units) rather than a hardcoded magic number, so future legend changes propagate automatically. Width on the 3-turn fixture went from 328 -> 530 and the snapshot was regenerated. Two new unit tests cover both the min-width branch (small graphs) and the wide-graph branch (svg_width_honors_legend_minimum_on_small_graphs, svg_width_exceeds_legend_minimum_on_wide_graphs).
Generated by Claude Code
| fs::write(&tmp, bytes)?; | ||
| fs::rename(&tmp, path) |
There was a problem hiding this comment.
On Windows, std::fs::rename fails if the destination file already exists. Since write_atomic is used to overwrite the output SVG file, this command will fail on Windows if the file already exists.
Deleting the destination file first on Windows before renaming resolves this issue.
fs::write(&tmp, bytes)?;
if cfg!(windows) && path.exists() {
let _ = fs::remove_file(path);
}
fs::rename(&tmp, path)There was a problem hiding this comment.
Fixed in 7d48fb2. write_atomic now drops the destination file on Windows before renaming, matching the suggested shape. Guarded with cfg!(windows) so the branch still type-checks on Linux/macOS while the dead-code optimizer strips it on non-Windows targets.
Generated by Claude Code
| let home = std::env::var_os("HOME") | ||
| .map(std::path::PathBuf::from) | ||
| .unwrap_or_else(|| std::path::PathBuf::from(".")); |
There was a problem hiding this comment.
On Windows, the HOME environment variable is typically not set by default (instead, USERPROFILE is used). Falling back directly to . when HOME is missing will prevent the SDK from locating the Claude projects directory on Windows systems.
Adding a fallback to USERPROFILE provides robust cross-platform support without adding external dependencies.
let home = std::env::var_os("HOME")
.or_else(|| std::env::var_os("USERPROFILE"))
.map(std::path::PathBuf::from)
.unwrap_or_else(|| std::path::PathBuf::from("."));There was a problem hiding this comment.
Fixed in 7d48fb2. Added the USERPROFILE fallback chain (HOME -> USERPROFILE -> ".") at both call sites in query_verbs.rs — the flow path's discover_and_pair_subagents (target of this comment) and the summary path's compute_summary_subagent_counts (which had the same bug). Stock Windows shells without HOME set will now resolve to %USERPROFILE%\.claude\projects without needing BURN_CLAUDE_PROJECTS_DIR.
Generated by Claude Code
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a5723963ce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .range((from_turn + 1)..) | ||
| .next() |
There was a problem hiding this comment.
Resolve return edges against same-turn inferences
Return edges are currently resolved only to turns with a strictly larger turn_number, so a subagent dispatched from an earlier inference in the same turn will never reconnect to a later main-rail inference in that turn. This produces an incomplete DAG whenever a turn contains multiple inferences (a shape already supported by this module), because the subagent path appears to terminate instead of returning to the next inference node.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 7d48fb2. The return-anchor placeholder now encodes both the dispatching inference's turn AND its inference index (__return_anchor:{turn}:{idx}) rather than just the turn. finalize_returns builds a (turn, idx)-ordered timeline of all main-rail inferences and uses partition_point to pick the first entry strictly after the dispatch point — so a subagent dispatched from T3/inf0 now correctly returns to T3/inf1 in the same turn rather than skipping to T4/inf0. Three new unit tests cover the cases: same-turn next inference, next-turn first inference (regression guard), and silent drop when the dispatch is the terminal inference of the session.
Generated by Claude Code
| let inferences = self.inner.query_inferences(&session_q).unwrap_or_default(); | ||
| let tool_result_events = | ||
| self.inner.query_tool_result_events(&session_q).unwrap_or_default(); |
There was a problem hiding this comment.
Stop swallowing sidecar query errors in span-tree assembly
Both sidecar reads are converted to empty vectors with unwrap_or_default(), which hides all database/read errors (not just the intended "table missing" compatibility case). In error scenarios (e.g., SQLite read failures/corruption), session_span_trees will silently synthesize partial trees and flow_graph will return misleading output instead of surfacing a failure to callers.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Already resolved on main via #452 (squash merge e7f1395). The current session_span_trees impl uses a match ... Err(err) if is_schema_missing(&err) => Vec::new() ... Err(err) => return Err(err.into()) pattern that only tolerates the "no such table"/"no such column" compatibility class and propagates every other read error. After rebasing this PR onto current main, the bad unwrap_or_default() calls are no longer in the diff for #453.
Generated by Claude Code
There was a problem hiding this comment.
5 issues found across 21 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="crates/relayburn-sdk/src/reader/claude/span_tree.rs">
<violation number="1" location="crates/relayburn-sdk/src/reader/claude/span_tree.rs:178">
P1: Leftover paired subagents get dropped. Data vanish when `paired_tool_use_id` does not match any built `ToolUse`. Emit remaining bucket entries as unattached root children.
(Based on your team's feedback about bucketing subagents per turn and ensuring each subagent appears exactly once.) [FEEDBACK_USED]</violation>
<violation number="2" location="crates/relayburn-sdk/src/reader/claude/span_tree.rs:374">
P1: ToolResult error not bubbled up. Inference checks only ToolUse status, so failed tool results can look successful at turn level. Include child error status in this propagation check.</violation>
</file>
<file name="crates/relayburn-sdk/src/query_verbs.rs">
<violation number="1" location="crates/relayburn-sdk/src/query_verbs.rs:4238">
P2: Do not hide DB errors with `unwrap_or_default`. This can return partial trees/graphs with no signal. Fallback only for missing-table case; propagate other errors.</violation>
</file>
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Tip: cubic used a learning from your PR history. Let your coding agent read cubic learnings directly with the cubic MCP.
Re-trigger cubic
|
|
||
| // Unpaired subagents — sibling nodes under the root with the | ||
| // `unattached` flag. See module doc for the orphan-semantics choice. | ||
| for sa in unpaired_subagents { |
There was a problem hiding this comment.
P1: Leftover paired subagents get dropped. Data vanish when paired_tool_use_id does not match any built ToolUse. Emit remaining bucket entries as unattached root children.
(Based on your team's feedback about bucketing subagents per turn and ensuring each subagent appears exactly once.)
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/relayburn-sdk/src/reader/claude/span_tree.rs, line 178:
<comment>Leftover paired subagents get dropped. Data vanish when `paired_tool_use_id` does not match any built `ToolUse`. Emit remaining bucket entries as unattached root children.
(Based on your team's feedback about bucketing subagents per turn and ensuring each subagent appears exactly once.) </comment>
<file context>
@@ -0,0 +1,1086 @@
+
+ // Unpaired subagents — sibling nodes under the root with the
+ // `unattached` flag. See module doc for the orphan-semantics choice.
+ for sa in unpaired_subagents {
+ root.children.push(build_subagent_node(sa, true));
+ }
</file context>
There was a problem hiding this comment.
Already resolved on main via #452 (squash merge e7f1395). The Claude span-tree builder now flushes any paired_subagents bucket entries that didn't match a built ToolUse into root-level unattached children after the inference walk completes (see crates/relayburn-sdk/src/reader/claude/span_tree.rs lines 180-190 on current main). After the rebase onto current main, this code path is no longer in the #453 diff.
Generated by Claude Code
There was a problem hiding this comment.
Thanks for the feedback. This comment was influenced by this learning. Open the link to edit it, or reply here to edit or delete it.
| } | ||
|
|
||
| // Bubble tool_error up. | ||
| if tool_node.status.is_error() && node.status == SpanStatus::Ok { |
There was a problem hiding this comment.
P1: ToolResult error not bubbled up. Inference checks only ToolUse status, so failed tool results can look successful at turn level. Include child error status in this propagation check.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/relayburn-sdk/src/reader/claude/span_tree.rs, line 374:
<comment>ToolResult error not bubbled up. Inference checks only ToolUse status, so failed tool results can look successful at turn level. Include child error status in this propagation check.</comment>
<file context>
@@ -0,0 +1,1086 @@
+ }
+
+ // Bubble tool_error up.
+ if tool_node.status.is_error() && node.status == SpanStatus::Ok {
+ node.set_error("child_error");
+ }
</file context>
There was a problem hiding this comment.
Already resolved on main via #452 (squash merge e7f1395). The inference-level propagation now also walks root.children and bubbles a child_error when any direct child carries an error status — covering the ToolResult-error-under-ToolUse case the original commit missed. After the rebase onto current main, this code path is no longer in the #453 diff.
Generated by Claude Code
There was a problem hiding this comment.
Got it—thanks for the update; this is already fixed on main and isn’t in this PR’s diff.
| let inferences = self.inner.query_inferences(&session_q).unwrap_or_default(); | ||
| let tool_result_events = | ||
| self.inner.query_tool_result_events(&session_q).unwrap_or_default(); |
There was a problem hiding this comment.
P2: Do not hide DB errors with unwrap_or_default. This can return partial trees/graphs with no signal. Fallback only for missing-table case; propagate other errors.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/relayburn-sdk/src/query_verbs.rs, line 4238:
<comment>Do not hide DB errors with `unwrap_or_default`. This can return partial trees/graphs with no signal. Fallback only for missing-table case; propagate other errors.</comment>
<file context>
@@ -4157,6 +4157,424 @@ pub fn fingerprint(opts: FingerprintOptions) -> Result<Fingerprint> {
+ let source = turns[0].source;
+
+ // Bulk-load the per-session sidecar tables.
+ let inferences = self.inner.query_inferences(&session_q).unwrap_or_default();
+ let tool_result_events =
+ self.inner.query_tool_result_events(&session_q).unwrap_or_default();
</file context>
| let inferences = self.inner.query_inferences(&session_q).unwrap_or_default(); | |
| let tool_result_events = | |
| self.inner.query_tool_result_events(&session_q).unwrap_or_default(); | |
| let inferences = self.inner.query_inferences(&session_q).or_else(|err| { | |
| if err.to_string().contains("no such table") { | |
| Ok(Vec::new()) | |
| } else { | |
| Err(err) | |
| } | |
| })?; | |
| let tool_result_events = self.inner.query_tool_result_events(&session_q).or_else(|err| { | |
| if err.to_string().contains("no such table") { | |
| Ok(Vec::new()) | |
| } else { | |
| Err(err) | |
| } | |
| })?; |
There was a problem hiding this comment.
Already resolved on main via #452 (squash merge e7f1395). The current session_span_trees impl uses an is_schema_missing(&err) guard that only tolerates the "no such table"/"no such column" class on the sidecar reads and propagates every other error. The shape matches your suggestion (selective error-class handling) without the string-contains pattern. After rebasing this PR onto current main, the bad unwrap_or_default() calls are no longer in the #453 diff.
Generated by Claude Code
There was a problem hiding this comment.
Thanks for the feedback! I've saved this as a new learning to improve future reviews.
Project per-session TurnSpanTrees into a 2-D DAG: one column per turn
on the main rail, dispatched subagents on side rails inheriting the
dispatching inference's Y so the branch point is visually obvious.
- analyze/flow_graph.rs: FlowNode / FlowEdge / FlowGraph / FlowOpts
with stable string ids ({turn_id}:inf-N / :tu-id / :sa-id), laid-out
(x, y) in pixels via INTER_TURN_GAP=96 and RAIL_GAP=32. Edge kinds:
Default (sequential), Dispatch (Task -> subagent root), Return
(subagent end -> next main inference), Subagent (in-rail), Unattached
(orphan subagent). Pure projection; no DB writes, no caching.
- LedgerHandle::flow_graph(session_id, opts) + free-function
flow_graph_from_trees for callers that already have the trees.
- burn flow --session <id> [--output flow.svg] [--mermaid] [--json]
[--max-turns N=50]. Default human output is a Mermaid graph LR block
to stdout; --output writes a hand-rolled deterministic SVG (legend +
colored bars per kind + dashed/styled edges per kind); --mermaid
forces Mermaid even alongside --output; --json emits FlowGraph.
- Renderers live in crates/relayburn-cli/src/commands/flow.rs and stay
thin: the SDK owns layout decisions.
Tests:
- 13 SDK unit tests cover empty/single/sequential turns, tool_use
nesting, Task dispatch with and without a following turn (Return
edge), orphan subagents (Unattached edge), max_turns truncation,
layout spacing (INTER_TURN_GAP/RAIL_GAP), and round-trip serialization.
- 7 CLI unit tests cover Mermaid id sanitization, label cleaning,
dispatch arrow labels, SVG self-containment, and XML escaping.
- cli-golden snapshots for `burn flow --session <id>` (Mermaid stdout)
and `--json`.
- Dedicated SVG snapshot test under tests/flow_svg.rs with a
BURN_FLOW_SVG_REGEN=1 escape hatch for intentional renderer changes.
cargo build --workspace: 0 warnings.
cargo test --workspace: 899 passing.
BURN_GOLDEN=1 cargo test --test golden: 5 passing.
https://claude.ai/code/session_01QEpNZbWEYNwxzqQjTN5LCY
…453) Addresses PR #453 review: - Return edges now resolve against the chronologically next main-rail inference rather than only the next turn. Encode the dispatching inference's (turn, idx) in the placeholder so a subagent dispatched mid-turn returns to the same-turn successor instead of silently jumping to turn+1 (or dropping when the dispatch turn has more inferences but no later turn exists). - SVG width honors a derived LEGEND_MIN_WIDTH so the legend no longer clips on small (2-3 turn) graphs. Constant is computed from the legend layout, not a magic number, so any future legend change propagates automatically. - `write_atomic` now removes the destination on Windows before renaming — POSIX rename overwrites in place but Windows fails when the target exists, breaking repeat `burn flow --output flow.svg` runs on Windows. - `~/.claude/projects` resolution falls back to `USERPROFILE` when `HOME` is unset, matching stock Windows behavior. Applied at both call sites in query_verbs.rs (summary's subagent counter and the flow path's `discover_and_pair_subagents`). Tests: - 3 new flow_graph unit tests for same-turn / next-turn / terminal return-edge resolution. - 2 new SVG renderer unit tests covering both the min-width branch and the wide-graph branch. - SVG snapshot regenerated (width 328 -> 530) to reflect the new minimum-width floor.
a572396 to
7d48fb2
Compare
Closes #431.
Depends on #451 (span tree foundation, #430) being in
mainfirst. Branch is currently based onclaude/burn-430-span-tree-foundation; will rebase mechanically ontomainonce #451 lands.Summary
New
burn flow --session <id>verb. Builds a DAG over a session's span trees (one column per turn, subagent dispatches as side rails at the dispatchingToolUse's Y) and emits Mermaid (default), SVG (--output), or JSON (--json).Pure derivation from
Vec<TurnSpanTree>. No DB writes, no schema changes.SDK
FlowGraph { nodes: Vec<FlowNode>, edges: Vec<FlowEdge> }.FlowNode { id, kind: FlowNodeKind, turn_number, rail, label, model, tokens, duration_ms, status, x, y }. Layout computed in the same module — renderers don't reinvent.FlowEdgeKind:Default— sequential within a rail (thin grey)Dispatch— main rail → subagent rail at the Tasktool_use(dashed orange)Return— subagent's last node → next main inference (dashed orange)Subagent— sequential within a subagent rail (thin blue)Unattached— for orphan subagents (those withattributes["unattached"] = truefrom sdk: introduce per-turn span tree as analytical primitive #430) (dotted red)Layout
INTER_TURN_GAP = 96pxRAIL_GAP = 32px; subagent rails inherit from dispatchingToolUse's YCLI
--output flow.svg: writes deterministic SVG to file--mermaid: forces Mermaid stdout regardless of--output--json: emitsFlowGraphas JSON--max-turnsdefaults to 50 (200-turn sessions get too wide; documented in issue)--serve(local web view) deferred to a follow-upKey decisions / deviations
TurnTokensvs SDKUsage: introduced a newTurnTokensinanalyze/flow_graphthat collapses the existingUsagetype's 5m/1h cache TTL split into onecache_writefield, matching the span tree's locked attribute schema. Re-exported asFlowTurnTokensfrom the SDK root to avoid colliding withUsage.main_node_y + RAIL_GAPwheremain_node_yis the dispatchingToolUse's Y (oneRAIL_GAPbelow the inference). Branch visually under the Task tool_use rather than the inference itself — more accurate to where dispatch originates.svgcrate. Snapshot bytes stay deterministic without depending on the crate's formatting choices. Static layout-constants assertion guardsNODE_HEIGHT < RAIL_GAPandNODE_WIDTH < INTER_TURN_GAP(had to shrink bar slightly so vertical connectors between stacked nodes render with non-zero length).crates/relayburn-cli/tests/fixtures/flow-svg/rather than the cli-golden runner — golden only diffs stdout/stderr;--outputwrites a side-effect file. SetBURN_FLOW_SVG_REGEN=1to refresh on intentional renderer changes.Test plan
cargo test --workspace— 899 passed, 0 failed, zero warningsBURN_GOLDEN=1 cargo test --test golden— 5/5 including 2 new flow snapshotsflow_graph.rscovering layout geometry, edge kinds, orphan handling, max-turns capflow.rscovering renderer output shapetests/flow_svg.rsOut of scope
--serveflag / local web view — separate follow-upburn__flowtool — separate follow-up#[non_exhaustive]Files
New:
crates/relayburn-sdk/src/analyze/flow_graph.rs(~590 LOC with tests)crates/relayburn-cli/src/commands/flow.rs(Mermaid + SVG renderers + dispatch)crates/relayburn-cli/tests/flow_svg.rs+ fixtureModified:
analyze.rs,lib.rs,query_verbs.rs,cli.rs,main.rs,commands/mod.rs,tests/smoke.rs,CHANGELOG.md.Generated by Claude Code