
fix(mount): clamp exportTimeout under positive bootstrap hard cap + narrow 429 fallback to workspace_busy (#223 follow-up)#224

Merged
khaliqgant merged 1 commit into main from fix/export-hardcap-429-narrow
May 31, 2026


@khaliqgant

Member

#223 follow-up — bring the pr-reviewer bot's hardening onto main

The pr-reviewer bot pushed two real hardening fixes (5219a1b) to the #223 branch after #223 was merged, so they never reached main. This brings them in cleanly — hand-merged, not cherry-picked, because the bot branched from #223's first commit (9406189) and a cherry-pick would silently revert the gemini clamp-warning log now on main. The bot commit's unrelated .trajectories/* debris is excluded.

Both fixes reviewed and concurred by me + CIGate + PortabilityDesign.

1. Clamp exportTimeout under a positive bootstrap hard cap

exportTimeout was previously clamped only below ¾·bootstrapIdleTimeout (the no-progress watchdog). If an operator sets a positive RELAYFILE_BOOTSTRAP_TIMEOUT (hard total cap) shorter than the export sub-deadline, the parent bootstrap ctx's hard cap fires before the export's own deadline → pullRemoteFullExport takes the ctx.Err()!=nil propagate branch instead of falling through to the resumable tree pull → same-cycle convergence defeated. Now clamps to the min of ¾·idle and ¾·hard-cap. No-op under the recommended unset (0/unbounded) config — purely defensive for the non-recommended config.
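
The clamp described above can be sketched as follows. This is a minimal illustration, not the actual `NewSyncer` code: the helper name `clampExportTimeout` and the 10s idle timeout are assumptions, while the 200ms hard cap, 1s export timeout, and 150ms result come from the regression test in this description.

```go
package main

import (
	"fmt"
	"time"
)

// clampExportTimeout is a hypothetical helper, not the real NewSyncer logic.
// It bounds exportTimeout by 3/4 of the idle watchdog and, when a positive
// hard cap is configured, by 3/4 of that cap too, keeping the smaller bound.
func clampExportTimeout(exportTimeout, idleTimeout, hardCap time.Duration) time.Duration {
	maxExport := idleTimeout * 3 / 4
	if hardCap > 0 {
		// A positive hard cap may be tighter than the idle watchdog.
		if capBound := hardCap * 3 / 4; maxExport <= 0 || capBound < maxExport {
			maxExport = capBound
		}
	}
	if maxExport > 0 && exportTimeout > maxExport {
		return maxExport
	}
	return exportTimeout
}

func main() {
	// Hard cap 200ms, export timeout 1s; the 10s idle timeout is assumed.
	fmt.Println(clampExportTimeout(time.Second, 10*time.Second, 200*time.Millisecond)) // 150ms
	// Hard cap unset (0): only the idle bound applies, so 1s passes through.
	fmt.Println(clampExportTimeout(time.Second, 10*time.Second, 0)) // 1s
}
```

With the hard cap left at 0 the second bound is skipped entirely, which is why the change is a no-op under the recommended config.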

2. Narrow the 429 → tree fallback to workspace_busy

The 429 fall-through is now restricted to Code == "workspace_busy" (the DO-busy signal in ProbeV085's prod evidence). Other 429 classes (global rate_limited, queue_full) are not export-specific — flooding the per-file tree path wouldn't help and could worsen a global limit — so they remain visible to the caller after doJSON exhausts its Retry-After backoff.
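
A rough sketch of the narrowed 429 branch is shown below. The `apiError` type and the `fallsBackToTree` name are hypothetical stand-ins (the real check lives in `exportSnapshotUnsupported`, whose full logic is not reproduced here), and the sketch models only the 429 case. The case-insensitive comparison follows the CodeRabbit summary of the change.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// apiError is a hypothetical stand-in for the client's HTTP error type.
type apiError struct {
	Status int
	Code   string
}

// fallsBackToTree models only the 429 branch: the export falls through to
// the tree pull when the server reports workspace_busy, and every other
// 429 class stays visible to the caller.
func fallsBackToTree(err apiError) bool {
	if err.Status != http.StatusTooManyRequests {
		return false // other statuses are outside the scope of this sketch
	}
	return strings.EqualFold(err.Code, "workspace_busy")
}

func main() {
	for _, code := range []string{"workspace_busy", "rate_limited", "queue_full"} {
		e := apiError{Status: http.StatusTooManyRequests, Code: code}
		fmt.Printf("%s -> tree fallback: %v\n", code, fallsBackToTree(e))
	}
}
```

Only the first of the three codes reports a tree fallback; `rate_limited` and `queue_full` are left for the caller to handle.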

Tests (full internal/mountsync suite green, go vet + gofmt clean)

  • TestReconcileFallsBackToTreeWhenExportExceedsHardBootstrapCap: BootstrapTimeout=200ms + ExportTimeout=1s → clamped to 150ms → falls to tree before the hard cap + bootstrap completes.
  • TestExportSnapshotOverloadedClassification — +429 rate_limited and +429 queue_full in the supported (retry-export, not fall-to-tree) set.
  • Existing TestReconcileFallsBackToTreeWhenExportWorkspaceBusy still green (workspace_busy still falls to tree).
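
The scenario the hard-cap regression test guards against can be modeled with plain `context` timeouts. This is only a sketch under stated assumptions: `exportOutcome` is a hypothetical function that simulates an export which never finishes, and it does not call the real `pullRemoteFullExport`. It shows why an export deadline longer than the parent's hard cap ends in the propagate branch, while a clamped deadline leaves the parent ctx alive for the tree pull.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// exportOutcome is a hypothetical model of the export sub-deadline running
// under the bootstrap ctx. The simulated export never completes, so the
// only question is which deadline fires first.
func exportOutcome(hardCap, exportTimeout time.Duration) string {
	parent := context.Background()
	if hardCap > 0 {
		var cancel context.CancelFunc
		parent, cancel = context.WithTimeout(parent, hardCap)
		defer cancel()
	}
	exportCtx, cancelExport := context.WithTimeout(parent, exportTimeout)
	defer cancelExport()

	<-exportCtx.Done() // wait for the export sub-deadline or the hard cap
	if parent.Err() != nil {
		// The bootstrap ctx itself is dead: nothing left to fall back with.
		return "propagate"
	}
	// Only the export sub-deadline fired: the tree pull can still run.
	return "fall back to tree"
}

func main() {
	// Unclamped: the 1s export deadline outlives the 200ms hard cap.
	fmt.Println(exportOutcome(200*time.Millisecond, time.Second))
	// Clamped to 3/4 of the hard cap: the export deadline fires first.
	fmt.Println(exportOutcome(200*time.Millisecond, 150*time.Millisecond))
}
```

The outcome depends on real timers, so the margin between the clamped deadline and the hard cap (50ms here) is what gives the fallback room to start.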

Contract

Zero cloud↔daemon contract change (still only the one optional RELAYFILE_EXPORT_TIMEOUT env from #223). Ships in the same operator relayfile release as #223.

🤖 Generated with Claude Code

… narrow 429 fallback to workspace_busy

Follow-up to #223 — brings the pr-reviewer bot's two hardening fixes (5219a1b)
onto main. They landed on the PR branch AFTER #223 was merged, so they are not
yet on main; hand-merged here (NOT cherry-picked) to preserve the gemini
clamp-warning log already on main, and excluding the bot commit's unrelated
.trajectories/* debris.

1. Hard-cap clamp: also bound exportTimeout below 3/4 of a POSITIVE
   bootstrapTimeout (RELAYFILE_BOOTSTRAP_TIMEOUT), taking the min with the
   no-progress idle-watchdog clamp. Without this, an operator-set hard cap
   shorter than the export sub-deadline cancels the parent bootstrap ctx before
   the export's own deadline fires -> pullRemoteFullExport propagates instead of
   falling through to the resumable tree pull, defeating same-cycle convergence.
   No-op under the recommended unset (0/unbounded) config; purely defensive.

2. Narrow the 429 -> tree fallback to workspace_busy specifically. Other 429
   classes (global rate limits, queue pressure) are not export-specific and
   should remain visible to the caller after retries are exhausted rather than
   flooding the per-file tree path. Matches the prod signal (429 workspace_busy).

Tests: TestReconcileFallsBackToTreeWhenExportExceedsHardBootstrapCap (hard cap
shorter than sub-deadline -> still falls to tree + bootstrap completes); +2
supported-429 classification cases (rate_limited, queue_full stay visible). Full
internal/mountsync suite green; go vet + gofmt clean. No cloud-contract change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 31, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 899bd5d3-0c7a-4444-830d-ace016f27197

📥 Commits

Reviewing files that changed from the base of the PR and between f133096 and 2b6bcac.

📒 Files selected for processing (2)
  • internal/mountsync/syncer.go
  • internal/mountsync/syncer_test.go

📝 Walkthrough


This PR refines export operation timeout management and HTTP error classification. The timeout clamping now applies a dual-window constraint (idle watchdog and bootstrap hard cap), and HTTP 429 handling is narrowed so only workspace_busy triggers unsupported fallback; other 429 codes remain retryable.

Changes

Export timeout and error classification

| Layer / File(s) | Summary |
| --- | --- |
| Export timeout clamping with hard bootstrap cap<br>internal/mountsync/syncer.go, internal/mountsync/syncer_test.go | NewSyncer computes maxExportTimeout from 3/4 of bootstrapIdleTimeout and further constrains it by 3/4 of bootstrapTimeout when positive. Log message updated for dual-window constraint. New regression test TestReconcileFallsBackToTreeWhenExportExceedsHardBootstrapCap validates that reconcile falls back to tree pull before the hard cap cancels context. |
| HTTP 429 error classification refinement<br>internal/mountsync/syncer.go, internal/mountsync/syncer_test.go | exportSnapshotUnsupported now treats HTTP 429 as unsupported only when error code equals workspace_busy (case-insensitive); rate_limited and queue_full remain transient and retryable. TestExportSnapshotOverloadedClassification extended with two additional 429 cases asserting they are supported/transient. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AgentWorkforce/relayfile#195: Extends exportSnapshotUnsupported to refine HTTP failure classification for truncated JSON, DO overload, and 413; shares the same error-classification narrowing pattern as this PR's 429 handling.
  • AgentWorkforce/relayfile#166: Introduces the bootstrap timeout hard-cap model that this PR builds upon; the dual-window clamping and associated reconcile fallback logic extend that earlier timeout architecture.

Poem

🐰 Through windows of time, two clocks align,
Bootstrap and idle, a dual design—
When export runs long, the tree path reigns true,
And workspace-busy whispers which way to pursue.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: clamping exportTimeout under bootstrap hard cap and narrowing 429 fallback to workspace_busy, with clear reference to the follow-up nature (#223 follow-up). |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing both fixes, their rationale, test coverage, and contract implications for the bootstrap timeout and 429 handling changes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request ensures that the export timeout is properly clamped based on the active bootstrap window, taking into account any hard bootstrap timeout cap, so that slow exports yield to resumable tree pulls before the context is canceled. It also refines HTTP 429 error handling to only fall back to tree pulls for "workspace_busy" errors, leaving other 429 errors visible. Feedback on this PR suggests improving the clamping log message to avoid confusing output when no hard bootstrap timeout cap is active.

Comment on lines +1127 to 1132
+	if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
 		if opts.Logger != nil {
-			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
+			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)
 		}
 		exportTimeout = maxExportTimeout
 	}


medium

When bootstrapTimeout is 0 (the default, meaning no hard cap is active), printing hard cap 0s and stating that the timeout must stay strictly under it is confusing and inaccurate.

Consider splitting the log message to only mention the hard cap when bootstrapTimeout > 0 is actually true, preserving the original clear message for the default configuration.

	if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
		if opts.Logger != nil {
			if bootstrapTimeout > 0 {
				opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)
			} else {
				opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
			}
		}
		exportTimeout = maxExportTimeout
	}

@github-actions


Relayfile Eval Review

Run: .relayfile/evals/runs/2026-05-31T07-01-46-934Z-HEAD-provider
Mode: provider
Git SHA: 054a6b0

Passed: 4 | Needs human: 0 | Reviewable: 0 | Missing output: 0 | Failed: 0 | Skipped: 0

Human Review Cases

No reviewable human-review cases captured Relayfile output.

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/mountsync/syncer.go">

<violation number="1" location="internal/mountsync/syncer.go:1129">
P3: This log message always includes a hard-cap clause, so when `BootstrapTimeout` is unset (`0`) it can print `hard cap 0s`, which is misleading. Only include hard-cap wording when `bootstrapTimeout > 0` and keep the idle-timeout-only message for the default path.</violation>
</file>


+	if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
 		if opts.Logger != nil {
-			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
+			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)


P3: This log message always includes a hard-cap clause, so when BootstrapTimeout is unset (0) it can print hard cap 0s, which is misleading. Only include hard-cap wording when bootstrapTimeout > 0 and keep the idle-timeout-only message for the default path.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/mountsync/syncer.go, line 1129:

<comment>This log message always includes a hard-cap clause, so when `BootstrapTimeout` is unset (`0`) it can print `hard cap 0s`, which is misleading. Only include hard-cap wording when `bootstrapTimeout > 0` and keep the idle-timeout-only message for the default path.</comment>

<file context>
@@ -1109,14 +1109,24 @@ func NewSyncer(client RemoteClient, opts SyncerOptions) (*Syncer, error) {
+	if maxExportTimeout > 0 && exportTimeout > maxExportTimeout {
 		if opts.Logger != nil {
-			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under bootstrapIdleTimeout %s so the export yields to the resumable tree pull before the watchdog fires)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout)
+			opts.Logger.Printf("clamping exportTimeout from %s to %s (must stay strictly under the active bootstrap window — no-progress watchdog %s, hard cap %s — so the export yields to the resumable tree pull before the bootstrap ctx is canceled)", exportTimeout, maxExportTimeout, bootstrapIdleTimeout, bootstrapTimeout)
 		}
 		exportTimeout = maxExportTimeout
</file context>

@khaliqgant khaliqgant merged commit b68790d into main May 31, 2026
9 checks passed
@khaliqgant khaliqgant deleted the fix/export-hardcap-429-narrow branch May 31, 2026 07:09
@agent-relay-code

Contributor

Reviewed PR #224 locally and made two small fixes in internal/mountsync/syncer.go: updated the ExportTimeout option comment to include the new positive BootstrapTimeout clamp behavior, and changed the new clamp log string to plain ASCII.

Validation:

  • scripts/check-contract-surface.sh passed.
  • go test ./internal/mountsync could not run because this environment has no go or gofmt binary installed.

@agent-relay-code

Contributor

pr-reviewer applied fixes — committed and pushed af71240 to this PR. The notes below describe what changed.

Reviewed PR #224 locally and made two small fixes in internal/mountsync/syncer.go: updated the ExportTimeout option comment to include the new positive BootstrapTimeout clamp behavior, and changed the new clamp log string to plain ASCII.

Validation:

  • scripts/check-contract-surface.sh passed.
  • go test ./internal/mountsync could not run because this environment has no go or gofmt binary installed.

