fix(mountsync): independent rootCtx-derived deadline for outbox/writeback flush by khaliqgant · Pull Request #270 · AgentWorkforce/relayfile

khaliqgant · 2026-06-11T11:59:11Z

Problem

The mount wraps both the mirror pull and the durable writeback/outbox push in one per-cycle context bounded by RELAYFILE_MOUNT_TIMEOUT (default 15s). On a large mounted tree (a real churn-digest box carries ~2,829 files, 1,252 of them Slack, across 40 channels), a single cloud round-trip routinely exceeds 15s, so the writeback push is cancelled mid-flight and the outbox record retries across cycles for minutes. Symptoms in prod mount logs:

mount local change failed: context deadline exceeded
mount sync cycle failed: context deadline exceeded

…and provider writebacks (e.g. a Slack reply draft) landing ~10 min late — or not at all before the ephemeral sandbox tears down (cleanup flushExitCode:1).
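The starvation described above can be reproduced in miniature. The following is a minimal sketch, not the mount's actual code: `step`, `sharedBudgetCycle`, and `pushStarved` are hypothetical names, and the durations are scaled down from seconds to milliseconds so the example runs quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// step simulates one cloud round-trip that takes d unless ctx ends first.
func step(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// sharedBudgetCycle runs the pull and then the push under ONE context, which
// is the pre-fix shape: whatever the pull consumes is gone for the push.
func sharedBudgetCycle(budget, pull, push time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()
	_ = step(ctx, pull) // the mirror pull eats into the shared budget
	return step(ctx, push)
}

// pushStarved reports whether a small push fails with DeadlineExceeded when a
// heavy pull has already used most of the shared budget.
func pushStarved() bool {
	err := sharedBudgetCycle(30*time.Millisecond, 25*time.Millisecond, 20*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("push starved by shared budget:", pushStarved())
}
```

The push itself needs only 20ms, but it inherits whatever the pull left of the 30ms budget, so it fails with the same "context deadline exceeded" error seen in the logs.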

Fix

A small outbound write must not share — and be starved by — the same budget as a full-tree mirror. bootstrapContext already solves exactly this for the heavy initial/periodic pull ("derive from rootCtx, not the per-cycle ctx, so a tiny RELAYFILE_MOUNT_TIMEOUT cannot starve it"). This applies the symmetric treatment to the outbox flush:

outboxContext() — derives the flush deadline from s.rootCtx with its own budget (default 60s, env RELAYFILE_OUTBOX_TIMEOUT). Process-shutdown cancellation via rootCtx still propagates.
flushOutboxRecords() runs the cloud upload under that deadline instead of the inbound per-cycle ctx.
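A minimal sketch of the idea, assuming a stripped-down `Syncer` with only the two fields that matter here. The `flush` and `flushWithExpiredCycle` helpers are hypothetical and exist only for illustration; the real `flushOutboxRecords` has a different signature.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Syncer is a stand-in for the real mountsync Syncer; only the two fields
// this sketch needs are modelled.
type Syncer struct {
	rootCtx            context.Context
	outboxFlushTimeout time.Duration
}

// outboxContext derives the flush deadline from rootCtx rather than from the
// inbound per-cycle ctx, so an exhausted cycle budget cannot cancel the upload.
// Cancelling rootCtx (process shutdown) still cancels the returned context.
func (s *Syncer) outboxContext() (context.Context, context.CancelFunc) {
	return context.WithTimeout(s.rootCtx, s.outboxFlushTimeout)
}

// flush simulates one outbox upload. cycleCtx is accepted only to mirror the
// call shape; the upload itself runs under the outbox deadline.
func (s *Syncer) flush(cycleCtx context.Context, upload func(context.Context) error) error {
	flushCtx, cancel := s.outboxContext()
	defer cancel()
	return upload(flushCtx)
}

// flushWithExpiredCycle hands flush an already-expired per-cycle ctx and
// returns whatever error the upload observed on its own context.
func flushWithExpiredCycle() error {
	s := &Syncer{rootCtx: context.Background(), outboxFlushTimeout: 60 * time.Second}
	cycleCtx, cancel := context.WithTimeout(context.Background(), -time.Millisecond)
	defer cancel()
	return s.flush(cycleCtx, func(ctx context.Context) error {
		return ctx.Err() // nil while the outbox deadline is still live
	})
}

func main() {
	fmt.Println("upload error:", flushWithExpiredCycle()) // prints "upload error: <nil>"
}
```

This mirrors the shape of the regression test described under Test: the per-cycle context is already dead, yet the upload sees a live context.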

Test

TestOutboxFlushUsesIndependentDeadlineNotPerCycleCtx hands the flush an already-expired per-cycle ctx and asserts the upload still completes under the rootCtx-derived deadline.

Proven red→green: without the fix the expired ctx leaks through (saw -5.78ms remaining) and the upload is cancelled; with it, the upload runs under the ~60s outbox deadline.
Full internal/mountsync suite green (no regressions); gofmt clean.

Rollout note

This reaches prod only via a Daytona snapshot rebuild (the RELAYFILE_MOUNT_VERSION pin) — merge alone does not deploy it.

Companion (cloud repo, separate PR)

Raise the cloud teardown flush wrap (sandbox-orchestrator.ts flushTimeoutSeconds, currently 20s) above the new 60s outbox budget so writebacks drain before SIGKILL.
Narrow over-broad persona mount scope (churn-digest mounts all 40 Slack channels) — pending verification that scope drives the mounted tree.

🤖 Generated with Claude Code

…eadline

The mount wraps BOTH the mirror pull and the durable writeback/outbox push in one per-cycle ctx bounded by RELAYFILE_MOUNT_TIMEOUT (default 15s). On a large mounted tree a single cloud round-trip routinely exceeds 15s, so the writeback push is cancelled mid-flight and the record sits in the outbox retrying across cycles for minutes, surfacing as repeated

mount local change failed: context deadline exceeded
mount sync cycle failed: context deadline exceeded

and provider writebacks (e.g. a Slack reply draft) landing minutes late or not at all before the ephemeral sandbox tears down.

A small outbound write must not share (and be starved by) the same budget as a full-tree mirror. This is exactly the problem bootstrapContext already solves for the heavy initial/periodic pull ("derive from rootCtx, not the per-cycle ctx, so a tiny RELAYFILE_MOUNT_TIMEOUT cannot starve it"). Apply the symmetric treatment to the outbox flush:

- New outboxContext(): derives the flush deadline from s.rootCtx with its own budget (default 60s, env RELAYFILE_OUTBOX_TIMEOUT). rootCtx cancellation (process shutdown) still propagates.
- flushOutboxRecords() now runs the cloud upload under that deadline instead of the inbound per-cycle ctx.

Test: TestOutboxFlushUsesIndependentDeadlineNotPerCycleCtx hands the flush an already-expired per-cycle ctx and asserts the upload still completes under the rootCtx-derived deadline (proven red without the fix: the expired ctx leaks through and the upload is cancelled). Full internal/mountsync suite green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai · 2026-06-11T11:59:20Z

Walkthrough

This PR isolates outbox/writeback flush operations from per-cycle mount timeouts by introducing a configurable, root context-derived deadline. The feature prevents in-flight uploads from being cancelled when the per-cycle RELAYFILE_MOUNT_TIMEOUT is exhausted, ensuring durable record flushes complete independently.

Changes

Outbox flush independent deadline

| Layer / File(s) | Summary |
|---|---|
| Timeout configuration and constant `internal/mountsync/syncer.go` | Defines `defaultOutboxFlushTimeout` constant documenting that outbox flushes use a dedicated rootCtx-derived deadline, and extends `SyncerOptions` with a new `OutboxFlushTimeout` field defaulting from `RELAYFILE_OUTBOX_TIMEOUT` environment variable. |
| Syncer storage and initialization `internal/mountsync/syncer.go` | Adds `outboxFlushTimeout` field to `Syncer` struct, resolves the configured timeout value with environment variable fallback and non-positive value handling, and wires the computed timeout into the constructed `Syncer` instance. |
| Outbox flush context derivation and usage `internal/mountsync/syncer.go` | Introduces `outboxContext(parent)` helper method that derives a timeout context from `rootCtx` (with defensive fallback to parent context) using the configured outbox flush timeout. Updates `flushOutboxRecords` to run outbox record uploads using this independent flush context instead of the inbound per-cycle context. |
| Regression test for independent deadline `internal/mountsync/syncer_test.go` | Comprehensive test `TestOutboxFlushUsesIndependentDeadlineNotPerCycleCtx` that seeds a pending outbox record via cancelled upload, then verifies that a subsequent flush called with an already-expired per-cycle context still executes the bulk upload under the independent outbox-root deadline and clears the pending records. |
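The timeout resolution summarised in the table (configured option, environment fallback, non-positive handling) might look roughly like the sketch below. Two things here are assumptions and not confirmed by the PR: that the explicit option takes precedence over the environment variable, and that `RELAYFILE_OUTBOX_TIMEOUT` is parsed as a Go duration string such as `90s`. The function name `resolveOutboxFlushTimeout` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultOutboxFlushTimeout matches the 60s default stated in the PR.
const defaultOutboxFlushTimeout = 60 * time.Second

// resolveOutboxFlushTimeout picks the flush budget. ASSUMED order: a positive
// configured value wins, then a positive, parsable env value, then the default.
// Non-positive or unparsable inputs fall through to the next source.
func resolveOutboxFlushTimeout(configured time.Duration, envValue string) time.Duration {
	if configured > 0 {
		return configured
	}
	// ASSUMPTION: the env value uses Go duration syntax ("90s", "2m").
	if d, err := time.ParseDuration(envValue); err == nil && d > 0 {
		return d
	}
	return defaultOutboxFlushTimeout
}

func main() {
	fmt.Println(resolveOutboxFlushTimeout(0, os.Getenv("RELAYFILE_OUTBOX_TIMEOUT")))
}
```

Falling back to the default on a bad value keeps a typo in the environment from producing a zero timeout, which would cancel every flush immediately.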

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

no-agent-relay-review

Poem

🐰 Timeouts that timeout? Not on my watch!
Outbox flushes now chart their own path,
Root context steady, per-cycle loose—
Writes finish strong, no more abuse! 📦✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: introducing an independent deadline for outbox/writeback flush derived from rootCtx instead of per-cycle context, which is the core fix. |
| Description check | ✅ Passed | The description provides comprehensive context about the problem (writeback cancellations on large trees), the specific fix (outboxContext with independent deadline), test coverage, and rollout notes, all directly related to the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


gemini-code-assist

Code Review

This pull request decouples the outbox flush/writeback process from the per-cycle mount timeout by introducing an independent OutboxFlushTimeout and deriving the flush context from the root context. This prevents small outbound writes from being starved or canceled by short per-cycle deadlines. A test has also been added to verify this behavior. The review feedback points out a potential panic in the defensive context-handling logic if both the root context and the parent context are nil, suggesting a fallback to context.Background().

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-11T12:00:46Z

+	root := s.rootCtx
+	if root == nil {
+		// Defensive: NewSyncer always sets rootCtx, but never derive from a nil
+		// parent — fall back to the inbound ctx so behaviour degrades to the
+		// pre-fix per-cycle deadline rather than panicking.
+		root = parent
+	}


If both s.rootCtx and parent are nil (which can happen if Syncer is constructed manually in tests and a nil context is passed), root will remain nil. Passing a nil parent context to context.WithTimeout will cause a panic. To prevent this, we should defensively fall back to context.Background() if both are nil.

Suggested change

-	root := s.rootCtx
-	if root == nil {
-		// Defensive: NewSyncer always sets rootCtx, but never derive from a nil
-		// parent — fall back to the inbound ctx so behaviour degrades to the
-		// pre-fix per-cycle deadline rather than panicking.
-		root = parent
-	}
+	root := s.rootCtx
+	if root == nil {
+		if parent != nil {
+			root = parent
+		} else {
+			root = context.Background()
+		}
+	}
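The panic this review comment refers to is easy to confirm in isolation. A minimal sketch, assuming hypothetical helper names `pickRoot` and `withTimeoutPanics`; it shows that `context.WithTimeout` panics on a nil parent and that the three-step fallback avoids it.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pickRoot mirrors the suggested fallback order: root, then parent, then
// context.Background() when both are nil.
func pickRoot(root, parent context.Context) context.Context {
	if root != nil {
		return root
	}
	if parent != nil {
		return parent
	}
	return context.Background()
}

// withTimeoutPanics reports whether context.WithTimeout panics for parent.
func withTimeoutPanics(parent context.Context) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_, cancel := context.WithTimeout(parent, time.Second)
	cancel()
	return false
}

func main() {
	fmt.Println("nil parent panics:", withTimeoutPanics(nil))               // true
	fmt.Println("fallback panics:", withTimeoutPanics(pickRoot(nil, nil))) // false
}
```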

github-actions · 2026-06-11T12:13:29Z

Relayfile Eval Review

Run: .relayfile/evals/runs/2026-06-11T12-13-29-119Z-HEAD-provider
Mode: provider
Git SHA: b17d29b

Human Review Cases

No reviewable human-review cases captured Relayfile output.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/mountsync/syncer.go`:
- Around line 1779-1783: outboxContext currently ignores the caller's ctx
(breaking FlushOutboxOnce timeout); change outboxContext to take the caller ctx
as the parent and return a derived ctx that still applies any root-cycle
timeout/values from s.rootCtx but does not bypass caller cancellation—e.g., if
s.rootCtx has a deadline, compute the earlier deadline between caller ctx and
s.rootCtx and use context.WithDeadline on the caller ctx (or otherwise compose
cancellations) so FlushOutboxOnce(ctx) and
flushOutboxRecords/flushOutboxRecordChunk calls will observe the caller
cancellation while cycle-path callers still get the root-derived timeout.


📥 Commits

Reviewing files that changed from the base of the PR and between 123190c and 6ba3f39.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (2)

internal/mountsync/syncer.go
internal/mountsync/syncer_test.go

coderabbitai · 2026-06-11T12:15:10Z

+	flushCtx, cancel := s.outboxContext(ctx)
+	defer cancel()
 	for _, chunk := range chunkOutboxRecords(due, maxWritebackBatchBytes()) {
-		if err := s.flushOutboxRecordChunk(ctx, chunk, conflicted); err != nil {
+		if err := s.flushOutboxRecordChunk(flushCtx, chunk, conflicted); err != nil {
 			return err


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

outboxContext currently bypasses explicit caller cancellation in FlushOutboxOnce.

Because outboxContext always derives from s.rootCtx when present, flushOutboxRecords no longer observes the inbound ctx cancellation. That is fine for per-cycle sync, but it can break FlushOutboxOnce(ctx) timeout intent (caller supplies a bounded ctx). Please preserve parent cancellation for one-shot flush call paths while keeping rootCtx-derived timeout for cycle-path flushes.

Suggested direction

-func (s *Syncer) flushOutboxRecords(ctx context.Context, conflicted map[string]struct{}, forceDue bool) error {
+func (s *Syncer) flushOutboxRecords(ctx context.Context, conflicted map[string]struct{}, forceDue bool) error {
 	...
-	flushCtx, cancel := s.outboxContext(ctx)
+	flushCtx, cancel := s.outboxContext(ctx, forceDue /* respectParentCancel */)
 	defer cancel()
 	...
 }

-func (s *Syncer) outboxContext(parent context.Context) (context.Context, context.CancelFunc) {
+func (s *Syncer) outboxContext(parent context.Context, respectParentCancel bool) (context.Context, context.CancelFunc) {
 	root := s.rootCtx
 	if root == nil {
 		root = parent
 	}
 	timeout := s.outboxFlushTimeout
 	if timeout <= 0 {
 		timeout = defaultOutboxFlushTimeout
 	}
-	return context.WithTimeout(root, timeout)
+	ctx, cancel := context.WithTimeout(root, timeout)
+	if !respectParentCancel || parent == nil {
+		return ctx, cancel
+	}
+	stop := context.AfterFunc(parent, cancel)
+	return ctx, func() {
+		stop()
+		cancel()
+	}
 }

Also applies to: 2854-2867
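To see how the suggested composition behaves, here is a standalone sketch built on `context.AfterFunc` (available since Go 1.21). It is an illustration of the suggested direction, not the merged code: `outboxContext` is written as a free function taking `root` explicitly, and `cancelledAfterParent` is a hypothetical probe.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// outboxContext takes its deadline from root. When respectParentCancel is
// set, cancelling parent also cancels the returned context, without parent's
// own deadline replacing the root-derived budget.
func outboxContext(root, parent context.Context, timeout time.Duration, respectParentCancel bool) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithTimeout(root, timeout)
	if !respectParentCancel || parent == nil {
		return ctx, cancel
	}
	// AfterFunc runs cancel in its own goroutine once parent is done.
	stop := context.AfterFunc(parent, cancel)
	return ctx, func() {
		stop() // detach from parent so the callback is not left registered
		cancel()
	}
}

// cancelledAfterParent reports whether the derived context ends shortly after
// the caller's context is cancelled.
func cancelledAfterParent(respectParentCancel bool) bool {
	parent, cancelParent := context.WithCancel(context.Background())
	ctx, cancel := outboxContext(context.Background(), parent, time.Minute, respectParentCancel)
	defer cancel()
	cancelParent()
	select {
	case <-ctx.Done():
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("one-shot flush follows caller:", cancelledAfterParent(true))
	fmt.Println("cycle flush follows caller:", cancelledAfterParent(false))
}
```

With the flag set, the one-shot path observes caller cancellation; without it, the cycle path keeps running under the root-derived deadline, which is the split the review asks for.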


agent-relay-code · 2026-06-11T12:31:02Z

Implemented the PR fixes locally.

Changed:

Preserved the new root-derived outbox deadline for normal sync-cycle flushes.
Made forced one-shot flushes (FlushOutboxOnce(ctx)) respect caller cancellation.
Added a regression test for canceled FlushOutboxOnce.
Added defensive context.Background() fallback if both root and parent contexts are nil.

Addressed comments

CodeRabbit inline review: outboxContext bypassed FlushOutboxOnce(ctx) cancellation. Fixed in internal/mountsync/syncer.go:1780 and internal/mountsync/syncer.go:2874; regression test added at internal/mountsync/syncer_test.go:2626.
CodeRabbit review summary: repeated the same FlushOutboxOnce(ctx) cancellation finding. Fixed in internal/mountsync/syncer.go:1780.
Gemini Code Assist: potential panic if both rootCtx and parent context are nil. Fixed with context.Background() fallback in internal/mountsync/syncer.go:2863.
CodeRabbit walkthrough/pre-merge note: docstring coverage warning. No separate code change beyond updating the outboxContext doc at internal/mountsync/syncer.go:2847; the warning was repo/check advisory, not a validated PR breakage.
github-actions relayfile eval review: reported 4 passed, 0 failed, no reviewable cases. No code change needed.

Verification

Passed locally:

go test ./internal/mountsync
go test ./...
Go binary builds for cmd/relayfile, cmd/relayfile-mount, cmd/relayfile-cli
npm ci
npm run build
npm run test
npm run typecheck
./scripts/check-contract-surface.sh
npx tsx scripts/e2e.ts --ci

The conditional workers typecheck job is skipped in this checkout because packages/server/tsconfig.json is absent. GitHub metadata currently reports PR #270 is already merged/closed, so I’m not ending with READY.

agent-relay-code · 2026-06-11T13:07:33Z

Implemented the PR review fixes in the checkout.

Changed:

internal/mountsync/syncer.go: normal sync-cycle outbox flushes keep the independent root-derived timeout, while forced FlushOutboxOnce(ctx) flushes also respect caller cancellation.
internal/mountsync/syncer.go: outboxContext now falls back to context.Background() if both root and parent contexts are nil.
internal/mountsync/syncer_test.go: added regressions for nil-context fallback and FlushOutboxOnce caller cancellation.

Addressed comments

gemini-code-assist[bot]: flagged panic when both s.rootCtx and parent context are nil. Fixed in internal/mountsync/syncer.go:2861; regression added in internal/mountsync/syncer_test.go:2118.
coderabbitai[bot]: flagged outboxContext bypassing explicit FlushOutboxOnce(ctx) cancellation. Fixed in internal/mountsync/syncer.go:1779 and internal/mountsync/syncer.go:2868; regression added in internal/mountsync/syncer_test.go:2640.
coderabbitai[bot] walkthrough comment: informational summary of the PR, no separate validated code finding beyond the inline cancellation issue above.
github-actions[bot]: relayfile eval review reported 4 passed, 0 failed, 0 needing human review; no code change needed.
agent-relay-code[bot]: listed the same local fixes needed for CodeRabbit/Gemini findings; implemented and validated in the files above.

Advisory Notes

Local .msd-autofix-* metadata references older unrelated findings outside PR #270's diff and changed files; left unchanged as out of scope.

Local verification passed:

go test ./internal/mountsync -count=1 -timeout 180s
go test ./... -count=1 -timeout 240s
Go binary builds for cmd/relayfile, cmd/relayfile-mount, cmd/relayfile-cli
npm ci
npm run build
npm run typecheck
npm test
SDK CI sequence: package npm ci, core build, SDK build, npx tsc --noEmit, SDK tests
./scripts/check-contract-surface.sh
npx tsx scripts/e2e.ts --ci

gemini-code-assist Bot reviewed Jun 11, 2026


chore: apply pr-reviewer fixes for #270

6ba3f39

coderabbitai Bot reviewed Jun 11, 2026


khaliqgant merged commit ca90b87 into main Jun 11, 2026
9 checks passed

khaliqgant deleted the fix/mount-outbox-independent-deadline branch June 11, 2026 12:27

This was referenced Jun 18, 2026

feat(mount): --push-local-once teardown drain (stop dropping last-moment writeback drafts) #304

Merged

Fix mountsync control-file upload filtering #327

Merged

khaliqgant merged 2 commits into main from fix/mount-outbox-independent-deadline
