fix(ci3): scope build-instance name by repo to stop cross-repo reaping #23987
Merged
`bootstrap_ec2` terminates any existing instance sharing the target `Name` tag, to reap orphans left by a cancelled GA run on the same ref. But the name was just `<ref>_<arch>[_postfix]`, with no repo component, so `aztec-packages` and `aztec-packages-private`, which build the same tags/refs concurrently under the same OIDC role, computed identical names and reaped each other's live instances. Observed: nightly tag `v5.0.0-nightly.20260610` was built in both repos; the public run's pre-launch reap terminated the private run's in-progress arm64 release instance ~7 min in, failing that build. Prefix the instance name with the repo basename (`${GITHUB_REPOSITORY##*/}`, default `aztec-packages`). The key stays stable across re-runs within a repo, so the intended orphan cleanup still works; the change only stops the two repos from colliding. `ci.sh`'s helper `instance_name` (`shell`/`kill`/`get-ip`) is kept in sync.
ludamad
approved these changes
Jun 10, 2026
charlielye
added a commit
that referenced
this pull request
Jun 12, 2026
## Regression

The cross-repo instance-name fix (#23987) added to `ci.sh` and `ci3/bootstrap_ec2`:

```sh
repo=${GITHUB_REPOSITORY##*/}
repo=${repo:-aztec-packages}
```

Under `set -u` (set by `ci3/source`), the `##*/` expansion on an **unset** `GITHUB_REPOSITORY` aborts *before* the `:-aztec-packages` default on the next line can apply. So any local invocation fails immediately:

```
$ ./ci.sh bench
./ci.sh: line 58: GITHUB_REPOSITORY: unbound variable
```

It only bit locally: CI always has `GITHUB_REPOSITORY` set, so it passed there.

## Fix

Default first, then strip:

```sh
repo=${GITHUB_REPOSITORY:-aztec-packages}
repo=${repo##*/}
```

Applied in both `ci.sh` and `ci3/bootstrap_ec2`. Verified under `set -u` with the var unset: resolves to `aztec-packages` (and `AztecProtocol/aztec-packages` → `aztec-packages` when set), so the instance-name scheme is unchanged.
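To make the ordering issue concrete, here is a minimal sketch that can be run outside the repo. The `repo_basename` function is a hypothetical wrapper introduced only for illustration (it is not part of `ci.sh` or `ci3/bootstrap_ec2`), and the strip-first form is run in a subshell so its failure does not abort the script.

```sh
#!/usr/bin/env bash
set -u  # treat expansion of unset variables as an error, as in the scenario above

# Hypothetical wrapper (not in the repo): default first, then strip to the basename.
repo_basename() {
  local repo="${GITHUB_REPOSITORY:-aztec-packages}"
  echo "${repo##*/}"
}

# Local run: the variable is unset, so the default applies.
unset GITHUB_REPOSITORY
repo_basename   # prints aztec-packages

# CI-like run: owner/name is reduced to the name.
GITHUB_REPOSITORY=AztecProtocol/aztec-packages
repo_basename   # prints aztec-packages

# Strip-first ordering: the ##*/ expansion itself trips set -u when the variable is unset.
# The subshell contains the abort so this script can keep going.
unset GITHUB_REPOSITORY
if ! (repo=${GITHUB_REPOSITORY##*/}) 2>/dev/null; then
  echo "strip-first form fails when GITHUB_REPOSITORY is unset"
fi
```

The `:-` default is what shields the expansion from `set -u`, so it has to be applied on the first read of the variable.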
Problem
`ci3/bootstrap_ec2` terminates any existing instance that shares the target `Name` tag before launching. This intentionally reaps orphans left when a GA run is cancelled (e.g. by a new push) on the same ref. But the name was `<ref>_<arch>[_<postfix>]` with no repo component, so `aztec-packages` and `aztec-packages-private`, which build the same tags/refs concurrently under the same OIDC role, computed identical names and reaped each other's live instances.

Observed incident
Nightly tag `v5.0.0-nightly.20260610` was built in both repos. Instance `i-02e5d6a6c148ec726` (`v5_0_0-nightly_20260610_arm64_a-release`) was launched by the private repo's run at 03:06:01 UTC and terminated at 03:13:12 UTC by the public repo's run for the same tag (its pre-launch reap step), ~7 min in, failing the private build. CloudTrail confirms a `TerminateInstances` from a different `ci3-<run_id>` session, not a spot interruption.

Fix
Prefix the instance name with the repo basename (`${GITHUB_REPOSITORY##*/}`, defaulting to `aztec-packages` for local runs):
- The key now includes the repo (`<repo>_<ref>_<arch>`) and stays stable across re-runs/new-pushes of the same ref, so the intended orphan-on-cancel cleanup still works.
- Public → `aztec-packages_…` and private → `aztec-packages-private_…`, so they no longer match and can't reap each other.
- `ci.sh`'s helper `instance_name` (used by the `shell`/`kill`/`get-ip` dev commands) is kept in sync so it still resolves instances launched by a CI run for the same repo.

Notes
The `Name` tag limit is 256 chars; the longest prefixed name is ~61 chars. The reap match uses the full `Name` tag, so the cosmetic 63-char `docker_hostname` truncation doesn't affect correctness.
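
As a rough illustration of the repo-scoped naming scheme, the sketch below builds `<repo>_<ref>_<arch>[_<postfix>]` from its parts. The function name `instance_name_sketch` and its argument order are assumptions made for this example, and it takes an already-sanitized ref as input; the real logic in `ci3/bootstrap_ec2` and `ci.sh` may differ. The `AztecProtocol/aztec-packages-private` value is likewise only an assumed example of what `GITHUB_REPOSITORY` might hold for the private repo.

```sh
#!/usr/bin/env bash
set -u

# Hypothetical sketch of the naming scheme; not the repo's actual helper.
# Args: <ref> <arch> [postfix]; ref is assumed to be sanitized already.
instance_name_sketch() {
  local ref="$1"
  local arch="$2"
  local postfix="${3:-}"

  # Default first, then strip owner/, so this is safe under set -u.
  local repo="${GITHUB_REPOSITORY:-aztec-packages}"
  repo="${repo##*/}"

  local name="${repo}_${ref}_${arch}"
  if [ -n "$postfix" ]; then
    name="${name}_${postfix}"
  fi
  echo "$name"
}

# Same ref and arch in two repos now yield different names.
GITHUB_REPOSITORY=AztecProtocol/aztec-packages
instance_name_sketch v5_0_0-nightly_20260610 arm64 a-release
# prints aztec-packages_v5_0_0-nightly_20260610_arm64_a-release

GITHUB_REPOSITORY=AztecProtocol/aztec-packages-private  # assumed example value
instance_name_sketch v5_0_0-nightly_20260610 arm64 a-release
# prints aztec-packages-private_v5_0_0-nightly_20260610_arm64_a-release
```

Because the name depends only on repo, ref, arch, and postfix, a re-run in the same repo computes the same name, which is what lets the pre-launch reap still find that repo's own orphans.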