feat(mount): seed GitHub working trees from export tar#213
Conversation
|
Warning Review limit reached
More reviews will be available in 34 minutes and 15 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR adds GitHub "working tree" mount support to mountsync, enabling tar-based full-tree bootstrapping, GitHub-specific path mapping, and event cursor seeding from clone manifests. ChangesGitHub Working Tree Mount and Tar-based Seed Support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for seeding the local workspace using a GitHub working tree tarball export. It adds a new ExportGithubWorkingTreeTar method to the HTTP client, updates the syncer to detect GitHub working tree mounts, and implements the logic to fetch, verify, and apply the tarball seed. A critical performance issue was identified in githubRemotePathForWorkingTreeRel where unnecessary slice allocations and sorting are performed for every file during local scans, leading to
Relayfile Eval ReviewRun: Passed: 4 | Needs human: 0 | Reviewable: 0 | Missing output: 0 | Failed: 0 | Skipped: 0 Human Review CasesNo reviewable human-review cases captured Relayfile output. |
Server-contract review (claude-2, #1250 stage-2 server owner) — ✅ adheresReviewed the client against the relayfile server contract it consumes. Contract-correct on all points:
No contract gaps. Composes with #1256 (server: raw-tar + gzip=0-coupled body ceiling) once both deploy + the consumer requests |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/mountsync/syncer.go`:
- Around line 5113-5135: The mapping can pick a stale object when state
temporarily contains both old and new SHAs; update
githubRemotePathForWorkingTreeRel to prefer a candidate whose remotePath suffix
matches the current head SHA (s.githubWorkingTree.HeadSHA) before falling back
to revision comparison: inside the loop over paths, when candidate==rel compute
isHeadCandidate := (s.githubWorkingTree != nil && s.githubWorkingTree.HeadSHA !=
"" && strings.HasSuffix(remotePath, "@"+s.githubWorkingTree.HeadSHA)); then
choose candidate if bestPath=="" or (isHeadCandidate && !bestIsHead) or
(isHeadCandidate==bestIsHead && revisionAdvances(bestRevision, revision)); track
bestIsHead alongside bestPath/bestRevision; ensure nil/empty HeadSHA is handled
(treat as non-head).
- Around line 2690-2695: parseGithubCloneManifest currently only recognizes
camelCase keys for cursor fields so legacy meta.json entries like events_cursor
or event_id are ignored; update parseGithubCloneManifest (and the read helper
used there) to accept snake_case variants ("events_cursor", "event_cursor",
"fs_events_cursor", "cursor" and "event_id", "event_id" etc.) when populating
githubCloneManifest.EventsCursor and EventID so the legacy manifest path seeded
by readGithubCloneManifest preserves the cursor; modify the read(...) call list
used to build githubCloneManifest (and any related key lookup logic) to include
the snake_case key names alongside the existing camelCase names.
- Around line 2789-2878: The tar verification loop currently allows duplicate
entries because it only validates membership against tree and final cardinality;
fix by tracking seen entries and rejecting duplicates: introduce a local map
(e.g., seenRel := map[string]struct{}{}) before the loop that iterates
tr.Next(), and immediately after computing rel (the cleaned header path) check
if rel is already in seenRel and if so return an error (e.g., "github tar seed
contains duplicate file %q"); otherwise add rel to seenRel and continue with the
existing processing (this will catch duplicate headers for the same file before
using tree[rel] or writing files such as in the blocks that reference
meta.RemotePath, safeLocalPath, writeFileAtomic, etc.).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: ada55aa7-36e1-44fe-a9dd-ac76eae86814
📒 Files selected for processing (3)
internal/mountsync/http_client_test.gointernal/mountsync/syncer.gointernal/mountsync/syncer_test.go
ec396ef to
919ddaa
Compare
|
Addressed the review findings in 919ddaa:
Validation: |
Summary
Contract notes
Tests