Skip to content

action: implement multi-arch buildkitd with insecure mode#88

Closed
adityamaru wants to merge 2 commits intomainfrom
devin/1775273254-multiarch-insecure-buildkitd
Closed

action: implement multi-arch buildkitd with insecure mode#88
adityamaru wants to merge 2 commits intomainfrom
devin/1775273254-multiarch-insecure-buildkitd

Conversation

@adityamaru
Copy link
Copy Markdown
Contributor

@adityamaru adityamaru commented Apr 4, 2026

Summary

Adds native multi-arch Docker build support by spawning a follower sandbox VM on the opposite architecture, starting buildkitd on it, and creating a multi-node buildx builder. When platforms includes both linux/amd64 and linux/arm64, the action will:

  1. Detect the non-host architecture needed for a follower
  2. Generate an ephemeral ED25519 SSH keypair (private key stays on leader in /tmp)
  3. Spawn a follower sandbox via POST /api/sandbox using BLACKSMITH_SANDBOX_TOKEN
  4. SSH into the follower, start buildkitd on TCP :1234
  5. Expose the follower's buildkitd port via the tunnel manager (localhost:8377/expose-port)
  6. Create a two-node buildx builder: host arch → local buildkitd, follower arch → tunnel address
  7. Clean up the follower sandbox in the post-action phase via DELETE /api/sandbox/{vm_id}

If follower setup fails, a warning is logged and the build falls back (though QEMU wiring is not implemented yet — builds would just fail for the non-native platform).

This is intentionally insecure — buildkitd listens without auth. A follow-up PR will add mTLS.

Depends on fa#3557 and web#5999 for BLACKSMITH_SANDBOX_TOKEN to be available in VMs.

Ref: BLA-790

Review & Testing Checklist for Human

  • Regex regression: Prettier reformatted /(.*)\s*$/ to /(.*)\ s*$/ in multiple places (lines ~720, ~745, ~792 in main.ts). \ s is NOT \s — this breaks the stderr error extraction regex. Applies to both new code AND the pre-existing "set as default builder" error handler. This needs to be fixed before merge.
  • Sandbox API contract: Verify POST /api/sandbox accepts { arch, ssh_public_key, vcpu, teardown_minutes, labels } and returns { vm_id }. Verify GET /api/sandbox/{vm_id} returns { ssh_connection_string }. Verify DELETE /api/sandbox/{vm_id} exists. These are assumptions based on the numbersmith SDK POC.
  • teardown_minutes: 0 semantics: Confirm that 0 means "no auto-teardown" rather than "teardown immediately" in the backend.
  • Tunnel manager JSON response: exposeFollowerBuildkitd does JSON.parse() on raw SSH stdout with no guard — non-JSON output (e.g. SSH banner, error message) will throw a confusing error.
  • Test plan: Deploy fa#3557 + web#5999 to staging. Run a workflow with platforms: linux/amd64,linux/arm64 using this branch of setup-docker-builder. Verify follower sandbox is created, buildkitd starts, tunnel exposes the port, buildx cluster forms, a multi-arch image is built, and the follower is cleaned up afterwards. Also verify a single-arch build (no platforms or single platform) still works without changes.

Notes

  • No unit tests added for multiarch.ts — the module does SSH/HTTP against live infrastructure so it's inherently integration-test territory.
  • vcpu: 2 for the follower is hardcoded. May want to make this configurable or match the leader's spec.
  • The QEMU fallback path is not wired — the catch block logs a warning but the build will simply fail for the non-native platform. This is acceptable for a draft but should be addressed before GA.
  • Several formatting-only changes (prettier) are mixed into the diff (e.g. line-wrapping changes in maybeShutdownBuildkitd, logBuildkitdCrashLogs).

Link to Devin session: https://app.devin.ai/sessions/4bed582243a84e75be318f407802a563
Requested by: @adityamaru


View in Codesmith

  • Auto-fix issues

Codesmith can help with this PR — just tag @codesmith or enable auto-fix issues. Settings

Add multiarch.ts module that handles:
- Multi-arch detection from platforms input
- Follower sandbox spawning via POST /api/sandbox
- Ephemeral ED25519 SSH keypair generation for inter-VM comms
- Buildkitd startup on follower via SSH exec
- Port exposure via tunnel manager (localhost:8377/expose-port)

Integrate into main.ts:
- Detect multi-arch need from platforms input
- Spawn follower sandbox on opposite arch
- Create multi-node buildx builder (host + follower)
- QEMU fallback if follower setup fails
- Cleanup follower sandbox in post-action phase

Update state-helper.ts:
- Add follower VM ID, arch, and buildkitd addr state management

Depends on BLACKSMITH_SANDBOX_TOKEN being available in VM env.
Ref: BLA-790

Co-Authored-By: maru@blacksmith.sh <adityamaru@gmail.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@linear
Copy link
Copy Markdown

linear bot commented Apr 4, 2026

BLA-790 build-push-action: design and implement a multi-arch story

Co-Authored-By: maru@blacksmith.sh <adityamaru@gmail.com>
@adityamaru adityamaru marked this pull request as ready for review April 4, 2026 21:50
@adityamaru adityamaru closed this Apr 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant