fix(mount): clamp exportTimeout under positive bootstrap hard cap + narrow 429 fallback to workspace_busy (#223 follow-up)#224
Conversation
… narrow 429 fallback to workspace_busy Follow-up to #223 — brings the pr-reviewer bot's two hardening fixes (5219a1b) onto main. They landed on the PR branch AFTER #223 was merged, so they are not yet on main; hand-merged here (NOT cherry-picked) to preserve the gemini clamp-warning log already on main, and excluding the bot commit's unrelated .trajectories/* debris. 1. Hard-cap clamp: also bound exportTimeout below 3/4 of a POSITIVE bootstrapTimeout (RELAYFILE_BOOTSTRAP_TIMEOUT), taking the min with the no-progress idle-watchdog clamp. Without this, an operator-set hard cap shorter than the export sub-deadline cancels the parent bootstrap ctx before the export's own deadline fires -> pullRemoteFullExport propagates instead of falling through to the resumable tree pull, defeating same-cycle convergence. No-op under the recommended unset (0/unbounded) config; purely defensive. 2. Narrow the 429 -> tree fallback to workspace_busy specifically. Other 429 classes (global rate limits, queue pressure) are not export-specific and should remain visible to the caller after retries are exhausted rather than flooding the per-file tree path. Matches the prod signal (429 workspace_busy). Tests: TestReconcileFallsBackToTreeWhenExportExceedsHardBootstrapCap (hard cap shorter than sub-deadline -> still falls to tree + bootstrap completes); +2 supported-429 classification cases (rate_limited, queue_full stay visible). Full internal/mountsync suite green; go vet + gofmt clean. No cloud-contract change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR refines export operation timeout management and HTTP error classification. The timeout clamping now applies a dual-window constraint (idle watchdog and bootstrap hard cap), and HTTP 429 handling is narrowed so only ChangesExport timeout and error classification
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request ensures that the export timeout is properly clamped based on the active bootstrap window, taking into account any hard bootstrap timeout cap, so that slow exports yield to resumable tree pulls before the context is canceled. It also refines HTTP 429 error handling to only fall back to tree pulls for "workspace_busy" errors, leaving other 429 errors visible. Feedback on this PR suggests improving the clamping log message to avoid confusing output when no hard bootstrap timeout cap is active.
| if maxExportTimeout > 0 && exportTimeout > maxExportTimeout { | ||
| if opts.Logger != nil { | ||
| opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout) | ||
| opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout) | ||
| } | ||
| exportTimeout = maxExportTimeout | ||
| } |
There was a problem hiding this comment.
When bootstrapTimeout is 0 (the default, meaning no hard cap is active), printing hard cap 0s and stating that the timeout must stay strictly under it is confusing and inaccurate.
Consider splitting the log message to only mention the hard cap when bootstrapTimeout > 0 is actually true, preserving the original clear message for the default configuration.
if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
if opts.Logger != nil {
if bootstrapTimeout > 0 {
opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)
} else {
opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
}
}
exportTimeout = maxExportTimeout
}
Relayfile Eval ReviewRun: Passed: 4 | Needs human: 0 | Reviewable: 0 | Missing output: 0 | Failed: 0 | Skipped: 0 Human Review CasesNo reviewable human-review cases captured Relayfile output. |
There was a problem hiding this comment.
1 issue found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="internal/mountsync/syncer.go">
<violation number="1" location="internal/mountsync/syncer.go:1129">
P3: This log message always includes a hard-cap clause, so when `BootstrapTimeout` is unset (`0`) it can print `hard cap 0s`, which is misleading. Only include hard-cap wording when `bootstrapTimeout > 0` and keep the idle-timeout-only message for the default path.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| if maxExportTimeout > 0 && exportTimeout > maxExportTimeout { | ||
| if opts.Logger != nil { | ||
| opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout) | ||
| opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout) |
There was a problem hiding this comment.
P3: This log message always includes a hard-cap clause, so when BootstrapTimeout is unset (0) it can print hard cap 0s, which is misleading. Only include hard-cap wording when bootstrapTimeout > 0 and keep the idle-timeout-only message for the default path.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/mountsync/syncer.go, line 1129:
<comment>This log message always includes a hard-cap clause, so when `BootstrapTimeout` is unset (`0`) it can print `hard cap 0s`, which is misleading. Only include hard-cap wording when `bootstrapTimeout > 0` and keep the idle-timeout-only message for the default path.</comment>
<file context>
@@ -1109,14 +1109,24 @@ func NewSyncer(client RemoteClient, opts SyncerOptions) (*Syncer, error) {
+ if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
if opts.Logger != nil {
- opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
+ opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)
}
exportTimeout = maxExportTimeout
</file context>
|
Reviewed PR #224 locally and made two small fixes in internal/mountsync/syncer.go: updated the Validation:
|
|
✅ pr-reviewer applied fixes — committed and pushed Reviewed PR #224 locally and made two small fixes in internal/mountsync/syncer.go: updated the Validation:
|
There was a problem hiding this comment.
✅ pr-reviewer applied fixes — committed and pushed af71240 to this PR. The notes below describe what changed.
Reviewed PR #224 locally and made two small fixes in internal/mountsync/syncer.go: updated the ExportTimeout option comment to include the new positive BootstrapTimeout clamp behavior, and changed the new clamp log string to plain ASCII.
Validation:
scripts/check-contract-surface.shpassed.go test ./internal/mountsynccould not run because this environment has nogoorgofmtbinary installed.
#223 follow-up — bring the pr-reviewer bot's hardening onto main
The pr-reviewer bot pushed two real hardening fixes (
5219a1b) to the #223 branch after #223 was merged, so they never reached main. This brings them in cleanly — hand-merged, not cherry-picked, because the bot branched from #223's first commit (9406189) and a cherry-pick would silently revert the gemini clamp-warning log now on main. The bot commit's unrelated.trajectories/*debris is excluded.Both fixes reviewed and concurred by me + CIGate + PortabilityDesign.
1. Clamp
exportTimeoutunder a positive bootstrap hard capexportTimeoutwas previously clamped only below ¾·bootstrapIdleTimeout(the no-progress watchdog). If an operator sets a positiveRELAYFILE_BOOTSTRAP_TIMEOUT(hard total cap) shorter than the export sub-deadline, the parent bootstrap ctx's hard cap fires before the export's own deadline →pullRemoteFullExporttakes thectx.Err()!=nilpropagate branch instead of falling through to the resumable tree pull → same-cycle convergence defeated. Now clamps to the min of ¾·idle and ¾·hard-cap. No-op under the recommended unset (0/unbounded) config — purely defensive for the non-recommended config.2. Narrow the 429 → tree fallback to
workspace_busyThe 429 fall-through is now restricted to
Code == "workspace_busy"(the DO-busy signal in ProbeV085's prod evidence). Other 429 classes (globalrate_limited,queue_full) are not export-specific — flooding the per-file tree path wouldn't help and could worsen a global limit — so they remain visible to the caller afterdoJSONexhausts its Retry-After backoff.Tests (full
internal/mountsyncsuite green,go vet+ gofmt clean)TestReconcileFallsBackToTreeWhenExportExceedsHardBootstrapCap—BootstrapTimeout=200ms+ExportTimeout=1s→ clamped to 150ms → falls to tree before the hard cap + bootstrap completes.TestExportSnapshotOverloadedClassification— +429 rate_limitedand +429 queue_fullin the supported (retry-export, not fall-to-tree) set.TestReconcileFallsBackToTreeWhenExportWorkspaceBusystill green (workspace_busy still falls to tree).Contract
Zero cloud↔daemon contract change (still only the one optional
RELAYFILE_EXPORT_TIMEOUTenv from #223). Ships in the same operator relayfile release as #223.🤖 Generated with Claude Code