feat(mount): seed GitHub working trees from export tar by khaliqgant · Pull Request #213 · AgentWorkforce/relayfile

khaliqgant · 2026-05-27T10:32:36Z

Summary

add a raw-tar GitHub working-tree seed path for relayfile-mount bootstrap
verify seeded files against fs/tree contentHash and fail on missing/unexpected entries
decode local checkout paths while preserving RelayFile object paths for writeback
preserve the clone sentinel events cursor and fall back cleanly when the contract is unsupported

Contract notes

requests gzip=0 for /fs/export?format=tar&decode=github-working-tree
reads .relayfile/clone.json first, with legacy meta.json fallback
accepts eventsCursor plus legacy aliases; forward-scans for the sentinel cursor until import stamps eventsCursor
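
As an illustration of the export request described in the contract notes, here is a minimal sketch of how the query string could be assembled. The helper name `buildExportURL`, the base URL, and the example head SHA are assumptions for illustration. The `pathPrefix` and `headSha` parameter names come from the contract review later in this thread, and the real client may build the request differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildExportURL is a hypothetical helper (not taken from the PR). It builds
// the raw-tar export URL with gzip=0 so the server returns an uncompressed tar.
func buildExportURL(baseURL, pathPrefix, headSHA string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("parse base url: %w", err)
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/fs/export"

	q := url.Values{}
	q.Set("format", "tar")
	q.Set("decode", "github-working-tree")
	q.Set("gzip", "0") // request a raw tar body instead of a gzip stream
	if pathPrefix != "" {
		q.Set("pathPrefix", pathPrefix)
	}
	if headSHA != "" {
		q.Set("headSha", headSHA)
	}
	u.RawQuery = q.Encode() // Encode sorts keys alphabetically
	return u.String(), nil
}

func main() {
	// Placeholder base URL and SHA, chosen only for the example.
	target, err := buildExportURL("https://relayfile.example/v1", "", "abc123")
	if err != nil {
		panic(err)
	}
	fmt.Println(target)
}
```

Using `url.Values` rather than string concatenation keeps path prefixes containing `/` or other reserved characters correctly escaped.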

Tests

go test ./internal/mountsync ./cmd/relayfile-mount
git diff --check

coderabbitai · 2026-05-27T10:32:48Z

Warning

Review limit reached

@khaliqgant, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 34 minutes and 15 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 381779d5-e98c-42a2-b9fc-385ad96711da

📥 Commits

Reviewing files that changed from the base of the PR and between ec396ef and 919ddaa.

📒 Files selected for processing (3)

internal/mountsync/http_client_test.go
internal/mountsync/syncer.go
internal/mountsync/syncer_test.go

📝 Walkthrough

This PR adds GitHub "working tree" mount support to mountsync, enabling tar-based full-tree bootstrapping, GitHub-specific path mapping, and event cursor seeding from clone manifests.

Changes

GitHub Working Tree Mount and Tar-based Seed Support

| Layer / File(s) | Summary |
| --- | --- |
| Data contracts and GitHub working tree types `internal/mountsync/syncer.go` | TreeEntry gains `Size` and `Encoding` fields; new types `GithubWorkingTreeSeedRequest` and `GithubWorkingTreeTar` added for tar-export flow; `githubWorkingTreeMount` struct introduced for path mapping; `SyncerState` extended with `GithubWorkingTreeHeadSHA` persistence. |
| HTTPClient tar export implementation and test `internal/mountsync/syncer.go`, `internal/mountsync/http_client_test.go` | `ExportGithubWorkingTreeTar` method streams tar-format exports with auth, retry, and error handling; HTTP client test validates query parameters, content-type preservation, and tar entry parsing. |
| Syncer GitHub mount detection and state initialization `internal/mountsync/syncer.go` | `NewSyncer` detects GitHub working-tree mounts from `remoteRoot`; `githubWorkingTree` field initialized on Syncer; persisted `GithubWorkingTreeHeadSHA` restored into in-memory mount state during load. |
| GitHub-aware path translation and safety helpers `internal/mountsync/syncer.go` | Implements `remoteToLocalPath`, `localPathToRemotePath`, `localRelativeToRemotePath` for GitHub mapping; adds `githubRemotePathForWorkingTreeRel` revision selection, `safeLocalPath` traversal protection, and `detectGithubWorkingTreeMount` with sentinel/meta path handling. |
| GitHub tar seed bootstrap orchestration `internal/mountsync/syncer.go` | Reads clone manifest for cursor, verifies expected tree via paginated `ListTree`, exports tar for target head SHA, applies tar files with hash verification and dirty-state preservation, performs safety checks, and commits bootstrap state with events cursor and head SHA. |
| Path resolution call site updates throughout syncer `internal/mountsync/syncer.go` | Eight call sites updated to use GitHub-aware path translation: local event handler, bootstrap/incremental hash probes, apply-remote operations, local push, read-deny cleanup, and file scanning. |
| Integration and unit test coverage `internal/mountsync/syncer_test.go`, `internal/mountsync/http_client_test.go` | Comprehensive reconcile test seeds local files from tar export and verifies cursor/head-SHA persistence; `fakeExportClient` extended with deterministic tar generation. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

AgentWorkforce/relayfile#92: Both PRs modify state.EventsCursor handling in pullRemoteFull, including early-path logic and cursor resolution behavior that interact at the same decision points.
AgentWorkforce/relayfile#185: Both PRs update bootstrap and full-reconcile flow in pullRemoteFull around WebSocket/cursor polling, with direct dependency on syncer behavior changes.

Poem

🐰 A rabbit hops through working trees so bright,
With tar-seeded roots and paths mapped just right,
GitHub's clone manifest guides every leap,
While checksums guard the promises we keep!
Thump, thump! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: adding GitHub working tree seed support via tar export. It is concise, specific, and clearly summarizes the primary objective. |
| Description check | ✅ Passed | The description is directly related to the changeset. It provides a summary of the implementation approach, contract details, and test verification, all of which align with the code changes shown in the raw summary. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

gemini-code-assist

Code Review

This pull request introduces support for seeding the local workspace using a GitHub working tree tarball export. It adds a new ExportGithubWorkingTreeTar method to the HTTP client, updates the syncer to detect GitHub working tree mounts, and implements the logic to fetch, verify, and apply the tarball seed. A critical performance issue was identified in githubRemotePathForWorkingTreeRel where unnecessary slice allocations and sorting are performed for every file during local scans, leading to $O(N^2 \log N)$ complexity.
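
To make the complexity concern concrete: if every scanned file triggers a fresh sort over all N tracked paths, a scan of N files costs roughly N × N log N. Building a lookup table once per scan reduces this to one O(N) pass plus O(1) lookups. The sketch below is illustrative only. `trackedFile`, `buildLocalIndex`, and the example paths are hypothetical and do not reflect the actual syncer types, and the tie-breaking between multiple remote objects for the same local path is deliberately left out.

```go
package main

import "fmt"

// trackedFile is a hypothetical stand-in for one entry of the syncer's
// tracked state: a RelayFile object path plus its decoded checkout path.
type trackedFile struct {
	RemotePath string
	LocalRel   string
}

// buildLocalIndex walks the tracked entries once and returns a map from
// local relative path to remote path. The first entry seen for a path wins;
// a real implementation would apply its own preference rules here.
func buildLocalIndex(files []trackedFile) map[string]string {
	index := make(map[string]string, len(files))
	for _, f := range files {
		if _, exists := index[f.LocalRel]; !exists {
			index[f.LocalRel] = f.RemotePath
		}
	}
	return index
}

func main() {
	index := buildLocalIndex([]trackedFile{
		{RemotePath: "/example/remote/README.md", LocalRel: "README.md"},
		{RemotePath: "/example/remote/cmd/main.go", LocalRel: "cmd/main.go"},
	})

	// Each lookup during the scan is now a single map access.
	for _, rel := range []string{"README.md", "cmd/main.go", "untracked.txt"} {
		remote, ok := index[rel]
		fmt.Printf("%s -> %q (tracked=%v)\n", rel, remote, ok)
	}
}
```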

github-actions · 2026-05-27T10:34:21Z

Relayfile Eval Review

Run: .relayfile/evals/runs/2026-05-27T11-00-10-150Z-HEAD-provider
Mode: provider
Git SHA: cb1d04e

Human Review Cases

No reviewable human-review cases captured Relayfile output.

khaliqgant · 2026-05-27T10:34:39Z

Server-contract review (claude-2, #1250 stage-2 server owner) — ✅ adheres

Reviewed the client against the relayfile server contract it consumes. Contract-correct on all points:

Raw tar request — gzip=0 + pathPrefix/headSha asserted in the test; defensively handles BOTH application/x-tar and (legacy) gzip content-types. ✅
contentHash verification (the linchpin) — verifies each tar entry via hashBytes(data) vs the fs/tree contentHash. This is correct by construction: the server-side contentHash (cloud slice-1, content-hash.ts) was explicitly written to MIRROR this daemon hashBytes (SHA-256 hex of the raw decoded bytes), and the decoded github-working-tree tar contains those same raw bytes — so verification MATCHES rather than always-refetching. Holds across utf-8 and base64-stored files (both stored decoded in R2). ✅
Shared-write plane preserved — local paths decode to real checkout paths while tracked state keeps the authoritative RelayFile object path (remotePath) for writeback. So agent edits still flow back to RelayFile (the multi-agent constraint). ✅
Sentinel/cursor — prefers .relayfile/clone.json (meta.json fallback), eventsCursor accepted with aliases + a forward-scan fallback, sentinel-aligned cursor preserved (not latest-after-seed). ✅ (The forward-scan fallback is fine for small clone histories; the import stamping eventsCursor will eliminate it for large histories — tracked separately.)
Integrity guards — tar-verified-count must equal tree count (else error); snapshot-delete skipped on suspected partial/empty listing (preserves local state). ✅
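
The contentHash check described above can be sketched as follows, assuming (as the review states) that the hash is the SHA-256 hex digest of the raw decoded bytes. `verifyEntry` and the shape of the `expected` map are hypothetical. The real daemon may store hashes with a different key layout or prefix.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashBytes returns the lowercase SHA-256 hex digest of data, matching the
// description of the daemon's hash in the review above.
func hashBytes(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verifyEntry is a hypothetical helper: it checks one tar entry against the
// expected content hashes (relative path -> contentHash) from the tree listing.
func verifyEntry(rel string, data []byte, expected map[string]string) error {
	want, ok := expected[rel]
	if !ok {
		return fmt.Errorf("unexpected tar entry %q", rel)
	}
	if got := hashBytes(data); got != want {
		return fmt.Errorf("hash mismatch for %q: got %s, want %s", rel, got, want)
	}
	return nil
}

func main() {
	expected := map[string]string{"README.md": hashBytes([]byte("hello\n"))}

	fmt.Println(verifyEntry("README.md", []byte("hello\n"), expected))  // matches
	fmt.Println(verifyEntry("README.md", []byte("changed"), expected))  // mismatch
	fmt.Println(verifyEntry("extra.txt", []byte("hello\n"), expected)) // unexpected
}
```

Because both sides hash the same decoded bytes, a match means the seeded file can be trusted without refetching it.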

No contract gaps. Composes with #1256 (server: raw-tar + gzip=0-coupled body ceiling) once both deploy + the consumer requests &gzip=0. LGTM from the contract side.
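
For the sentinel/cursor point, a lookup that prefers `eventsCursor` and then falls back to legacy key spellings might look like the sketch below. The function name `cursorFromManifest`, the key order, and the assumption that the cursor is a JSON string are all illustrative. The manifest schema itself is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cursorKeys lists the preferred key first, followed by legacy aliases.
// The order is an assumption made for this sketch.
var cursorKeys = []string{"eventsCursor", "events_cursor", "event_cursor", "fs_events_cursor"}

// cursorFromManifest returns the first non-empty string cursor found under
// any known key, or "" when the manifest carries none.
func cursorFromManifest(raw []byte) (string, error) {
	var fields map[string]any
	if err := json.Unmarshal(raw, &fields); err != nil {
		return "", fmt.Errorf("parse clone manifest: %w", err)
	}
	for _, key := range cursorKeys {
		if v, ok := fields[key].(string); ok && v != "" {
			return v, nil
		}
	}
	return "", nil
}

func main() {
	for _, doc := range []string{
		`{"eventsCursor":"evt_10"}`,
		`{"events_cursor":"evt_7"}`,
		`{}`,
	} {
		cursor, err := cursorFromManifest([]byte(doc))
		fmt.Printf("%s -> %q (err=%v)\n", doc, cursor, err)
	}
}
```

An empty result is the case where a caller would fall back to the forward scan mentioned above.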

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/mountsync/syncer.go`:
- Around line 5113-5135: The mapping can pick a stale object when state
temporarily contains both old and new SHAs; update
githubRemotePathForWorkingTreeRel to prefer a candidate whose remotePath suffix
matches the current head SHA (s.githubWorkingTree.HeadSHA) before falling back
to revision comparison: inside the loop over paths, when candidate==rel compute
isHeadCandidate := (s.githubWorkingTree != nil && s.githubWorkingTree.HeadSHA !=
"" && strings.HasSuffix(remotePath, "@"+s.githubWorkingTree.HeadSHA)); then
choose candidate if bestPath=="" or (isHeadCandidate && !bestIsHead) or
(isHeadCandidate==bestIsHead && revisionAdvances(bestRevision, revision)); track
bestIsHead alongside bestPath/bestRevision; ensure nil/empty HeadSHA is handled
(treat as non-head).
- Around line 2690-2695: parseGithubCloneManifest currently only recognizes
camelCase keys for cursor fields so legacy meta.json entries like events_cursor
or event_id are ignored; update parseGithubCloneManifest (and the read helper
used there) to accept snake_case variants ("events_cursor", "event_cursor",
"fs_events_cursor", "cursor", and "event_id", etc.) when populating
githubCloneManifest.EventsCursor and EventID so the legacy manifest path seeded
by readGithubCloneManifest preserves the cursor; modify the read(...) call list
used to build githubCloneManifest (and any related key lookup logic) to include
the snake_case key names alongside the existing camelCase names.
- Around line 2789-2878: The tar verification loop currently allows duplicate
entries because it only validates membership against tree and final cardinality;
fix by tracking seen entries and rejecting duplicates: introduce a local map
(e.g., seenRel := map[string]struct{}{}) before the loop that iterates
tr.Next(), and immediately after computing rel (the cleaned header path) check
if rel is already in seenRel and if so return an error (e.g., "github tar seed
contains duplicate file %q"); otherwise add rel to seenRel and continue with the
existing processing (this will catch duplicate headers for the same file before
using tree[rel] or writing files such as in the blocks that reference
meta.RemotePath, safeLocalPath, writeFileAtomic, etc.).
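
The duplicate-entry guard from the last finding can be sketched in isolation using the standard `archive/tar` package. `countTarFiles` and `buildTar` are hypothetical helpers written for this example. The real loop also verifies hashes and writes files, which is omitted here.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
	"path"
)

// countTarFiles walks a raw tar archive and returns the number of regular
// files, failing as soon as the same cleaned path appears twice.
func countTarFiles(data []byte) (int, error) {
	seenRel := map[string]struct{}{}
	tr := tar.NewReader(bytes.NewReader(data))
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return 0, fmt.Errorf("read tar: %w", err)
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // directories and other entry types are ignored here
		}
		rel := path.Clean(hdr.Name)
		if _, dup := seenRel[rel]; dup {
			return 0, fmt.Errorf("github tar seed contains duplicate file %q", rel)
		}
		seenRel[rel] = struct{}{}
	}
	return len(seenRel), nil
}

// buildTar creates a small in-memory archive for the demonstration.
func buildTar(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range names {
		body := []byte("x")
		hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(body)), Typeflag: tar.TypeReg}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	clean, _ := buildTar("a.txt", "dir/b.txt")
	fmt.Println(countTarFiles(clean))

	dup, _ := buildTar("a.txt", "./a.txt") // same file after path.Clean
	fmt.Println(countTarFiles(dup))
}
```

Checking the cleaned path rather than the raw header name means `a.txt` and `./a.txt` are treated as the same file.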

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ada55aa7-36e1-44fe-a9dd-ac76eae86814

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae090c and ec396ef.

📒 Files selected for processing (3)

internal/mountsync/http_client_test.go
internal/mountsync/syncer.go
internal/mountsync/syncer_test.go

khaliqgant · 2026-05-27T10:59:32Z

Addressed the review findings in 919ddaa:

precomputed the GitHub working-tree local-path index once per scan and use O(1) lookups during scanLocalFiles; single-event routing still builds one index per event
local-to-remote mapping now prefers the current HeadSHA object before revision tiebreaking, with a regression for old/new SHA coexistence
legacy clone manifests now accept snake_case cursor keys (events_cursor, event_cursor, fs_events_cursor, event_id)
tar seed verification now rejects duplicate entries, with regression coverage
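
The head-SHA preference rule can be illustrated with a small selection function that follows the logic spelled out in the review comment: a candidate whose remote path ends in `@<HeadSHA>` beats any non-head candidate, and revision only breaks ties within the same group. The `candidate` type, the integer `Revision` field, and the example paths are hypothetical. The actual code compares revisions through its own `revisionAdvances` helper, whose semantics are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// candidate is a hypothetical tracked object that maps to one local path.
type candidate struct {
	RemotePath string
	Revision   int // stand-in for the syncer's revision value
}

// pickRemotePath prefers the candidate pinned to headSHA, then the higher
// revision. An empty headSHA means no candidate is treated as head.
func pickRemotePath(cands []candidate, headSHA string) string {
	bestPath, bestRevision, bestIsHead := "", 0, false
	for _, c := range cands {
		isHead := headSHA != "" && strings.HasSuffix(c.RemotePath, "@"+headSHA)
		switch {
		case bestPath == "",
			isHead && !bestIsHead,
			isHead == bestIsHead && c.Revision > bestRevision:
			bestPath, bestRevision, bestIsHead = c.RemotePath, c.Revision, isHead
		}
	}
	return bestPath
}

func main() {
	cands := []candidate{
		{RemotePath: "/example/README.md@oldsha", Revision: 9},
		{RemotePath: "/example/README.md@newsha", Revision: 2},
	}
	fmt.Println(pickRemotePath(cands, "newsha")) // head object wins despite lower revision
	fmt.Println(pickRemotePath(cands, ""))       // no head known: highest revision wins
}
```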

Validation: go test ./internal/mountsync ./cmd/relayfile-mount and git diff --check pass. All review threads are resolved.

gemini-code-assist Bot reviewed May 27, 2026

Comment thread internal/mountsync/syncer.go

coderabbitai Bot reviewed May 27, 2026

Comment thread internal/mountsync/syncer.go

Comment thread internal/mountsync/syncer.go

Comment thread internal/mountsync/syncer.go Outdated

feat(mount): seed github working trees from export tar

919ddaa

khaliqgant force-pushed the codex/issue-1250-mount-tar-seed branch from ec396ef to 919ddaa on May 27, 2026 10:58

khaliqgant merged commit 67cd414 into main May 27, 2026
8 checks passed

khaliqgant deleted the codex/issue-1250-mount-tar-seed branch May 27, 2026 11:02
