fix(ci): refresh grind launcher checkout to origin/next before launching by AztecBot · Pull Request #24039 · AztecProtocol/aztec-packages

AztecBot · 2026-06-11T21:46:50Z

Problem

The dashboard grind option always fails to SSH into the build instance:

Waiting for SSH at 3.144.255.68...
Timeout: SSH could not login to 3.144.255.68 within 60 seconds.

The instance launches fine (spot/on-demand fulfilled, IP assigned) but SSH never connects, so grind cycles through every instance type and gives up.

Root cause

CI build boxes were migrated from SSH to SSM. In ci3/bootstrap_ec2 the default is now CI_USE_SSH=0 (SSM); only shell-new forces SSH, and grind-test does not. So on current next, grind runs over SSM like the rest of CI.
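The selection rule described above can be paraphrased as a small Python sketch. The real logic lives in the ci3/bootstrap_ec2 shell script and is not shown in this PR, so the function name `pick_transport` and its shape are illustrative assumptions. Only the `CI_USE_SSH=0` default and the `shell-new` exception come from the description.

```python
from typing import Mapping


def pick_transport(ci_command: str, env: Mapping[str, str]) -> str:
    """Return "ssh" or "ssm" for a given ci.sh command (hypothetical helper)."""
    # Per the description, shell-new is the only command that forces SSH.
    if ci_command == "shell-new":
        return "ssh"
    # Everything else follows CI_USE_SSH, which now defaults to 0 (SSM).
    return "ssh" if env.get("CI_USE_SSH", "0") == "1" else "ssm"
```

Under this rule, `grind-test` with no override resolves to SSM, which is why a current checkout never touches the retired SSH path.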

But the dashboard launches grind from a long-lived checkout at REPO_PATH (the /grind handler in rk.py shells out to cd $REPO_PATH && ./ci.sh grind-test ...). That checkout had drifted to a pre-SSM commit, so grind alone still took the legacy SSH branch — launching into the retired SSH security group + build-instance key pair, whose port-22 / key-injection preconditions were torn down during the SSM lockdown. The stale checkout also explains the old AMI (ami-09d27244b23be8891) in the logs vs. current next's ami-067627aa971a1dcbb.
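One way to confirm this kind of drift on the host is to count how many commits the checkout is behind its upstream. The sketch below is a diagnostic aid, not part of this PR; the helper name `commits_behind` is hypothetical, and it assumes the remote-tracking ref `origin/next` exists and has been fetched recently.

```python
import subprocess


def commits_behind(repo_path, upstream="origin/next", run=subprocess.run):
    """Count commits reachable from `upstream` but not from HEAD in repo_path."""
    # `git rev-list --count A..B` prints the number of commits in B that are not in A.
    result = run(
        ["git", "rev-list", "--count", f"HEAD..{upstream}"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return int(result.stdout.strip())
```

A large count for the REPO_PATH checkout would be consistent with the stale AMI seen in the logs. The count is only as fresh as the last `git fetch`.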

Nothing kept REPO_PATH current: the ci3-dashboard-deploy.yml workflow only rebuilds the rkapp Flask container (and is path-filtered to ci3/dashboard/**), so changes to the ci3/ launcher scripts never refreshed it.

Fix

Refresh the launcher checkout to origin/next at grind launch time, before shelling out. This is self-healing and independent of deploys. It matches the existing design where the launcher always runs current-next orchestration scripts while the grind target commit is checked out on the remote box — so this does not restrict which branch/commit you can grind. If the refresh fails (e.g. transient network), the error is surfaced in the run log instead of silently grinding on a stale tree.
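A minimal sketch of the refresh step, under stated assumptions: the actual change to rk.py is not reproduced here, so the function name `refresh_launcher_checkout`, the choice of `git checkout --detach` (rather than, say, a hard reset), and the 120-second timeout are all illustrative. It only shows the shape described above: fetch, move to origin/next, and return any failure as a message the caller can write to the run log.

```python
import subprocess
from typing import Callable, Tuple

Runner = Callable[..., subprocess.CompletedProcess]


def refresh_launcher_checkout(
    repo_path: str, run: Runner = subprocess.run, ref: str = "next"
) -> Tuple[bool, str]:
    """Fetch origin/<ref> and move the checkout at repo_path onto it.

    Returns (ok, message). On failure the message carries git's stderr so the
    caller can surface it instead of grinding on a stale tree.
    """
    steps = (
        ["git", "fetch", "origin", ref],
        # Assumption: a detached checkout is acceptable for the launcher tree.
        ["git", "checkout", "--detach", f"origin/{ref}"],
    )
    for cmd in steps:
        try:
            result = run(cmd, cwd=repo_path, capture_output=True, text=True, timeout=120)
        except (subprocess.TimeoutExpired, OSError) as exc:
            return False, f"{' '.join(cmd)} failed: {exc}"
        if result.returncode != 0:
            return False, f"{' '.join(cmd)} failed: {result.stderr.strip()}"
    return True, f"launcher checkout refreshed to origin/{ref}"
```

A caller would run this before shelling out to `./ci.sh grind-test ...` and abort the launch when the first element of the result is false.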

Testing

python3 -m py_compile ci3/dashboard/rk.py passes. The behavior change is host-side (requires the dashboard's REPO_PATH checkout) and can't be exercised in unit CI; it will take effect on the next dashboard deploy. The immediate one-time unblock is still to refresh REPO_PATH on ci.aztec-labs.com and restart rkapp.

Created by claudebox · group: slackbot

AztecBot · 2026-06-11T22:32:32Z

Flakey Tests

🤖 says: This CI run detected 1 test that failed, but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/3a45cda231c3747c): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/multiple_validators_sentinel.parallel.test.ts "collects attestations for all validators on a node" (400s) (code: 0) group:e2e-p2p-epoch-flakes

fix(ci): refresh grind launcher checkout to origin/next before launching

3b462ff

AztecBot added labels Jun 11, 2026: ci-draft (Run CI on draft PRs), ci-no-fail-fast (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure), claudebox (Owned by claudebox; it can push to this PR)

ludamad approved these changes Jun 11, 2026


spalladino marked this pull request as ready for review June 11, 2026 21:58

spalladino requested a review from charlielye as a code owner June 11, 2026 21:58

spalladino enabled auto-merge June 11, 2026 21:58

spalladino added this pull request to the merge queue Jun 11, 2026

Merged via the queue into next with commit fa06339 Jun 11, 2026
51 of 55 checks passed

spalladino deleted the cb/grind-refresh-launcher-checkout branch June 11, 2026 22:55

spalladino merged 1 commit into next from cb/grind-refresh-launcher-checkout
