test: flag e2e_ha_full afterAll hook timeout as flake under pipelining#23524

Merged
alexghr merged 1 commit into merge-train/spartan from cb/22301e7d0daa
May 23, 2026

@AztecBot

Collaborator

Summary

Add a .test_patterns.yml entry that flags e2e_ha_full.test.ts as a flake when it fails with Exceeded timeout of 1200000 ms for a hook (the afterAll teardown). This failure dequeued PR #23344 from the merge queue at 2026-05-22T22:07:25Z.

Investigation

Jest output:

FAIL src/composed/ha/e2e_ha_full.test.ts (1519.394 s)
  ● Test suite failed to run

    thrown: "Exceeded timeout of 1200000 ms for a hook.

     279 |   });
     280 |
    > 281 |   afterAll(async () => {
         |   ^

The teardown of the HA full test exceeds the 20-minute hook timeout under pipelining. The suite was migrated to pipelining in #23463; this is a sibling flake of #23501 and #23515.

Why a new entry?

The existing e2e_ha_full entries match per-test assertion lines (✕ should coordinate governance voting..., ✕ should distribute work across multiple HA nodes). The afterAll-timeout failure mode surfaces as Test suite failed to run with no per-test assertion line, so neither existing entry catches it and the failure escalates to FAILED.
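
The gap can be illustrated with a minimal sketch. It assumes the existing entries behave like plain regexes over the log; the two `existingPatterns` below are hypothetical stand-ins built from the assertion-line prefixes quoted above, not the literal contents of .test_patterns.yml. The third pattern is the `error_regex` from the test plan.

```typescript
// Hypothetical stand-ins for the existing per-test patterns; the real
// .test_patterns.yml entries may be worded differently.
const existingPatterns: RegExp[] = [
  /✕ should coordinate governance voting/,
  /✕ should distribute work across multiple HA nodes/,
];

// The pattern added by this PR (copied from the test plan).
const hookTimeoutPattern = /Exceeded timeout of [0-9]+ ms for a hook/;

// Suite-level failure output, copied from the Jest log above.
const jestOutput = [
  "FAIL src/composed/ha/e2e_ha_full.test.ts (1519.394 s)",
  "  ● Test suite failed to run",
  '    thrown: "Exceeded timeout of 1200000 ms for a hook.',
].join("\n");

// True if at least one pattern matches somewhere in the log.
function matchesAny(patterns: RegExp[], log: string): boolean {
  return patterns.some((p) => p.test(log));
}

const caughtByExisting = matchesAny(existingPatterns, jestOutput);
const caughtByNew = hookTimeoutPattern.test(jestOutput);
console.log({ caughtByExisting, caughtByNew });
```

Under these assumptions the per-test patterns find nothing to match, because the suite-level failure never prints a ✕ line, while the hook-timeout pattern matches.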

Follow-up

The proper fix is to tighten the HA full afterAll teardown so that it does not block on services that have already hung. This patch is only a quick unblock; palla, the author of the pipelining migration (#23463), is set as the owner.
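
One possible shape for such a teardown, as a rough sketch only: stop every service in parallel and give each one its own deadline, so a single hung service cannot consume the whole hook timeout. The `Stoppable` interface, `stopWithDeadline`, and `teardownAll` are hypothetical names for illustration and are not taken from the actual test code.

```typescript
// Hypothetical shape of a stoppable service; not the real node API.
interface Stoppable {
  name: string;
  stop(): Promise<void>;
}

type Outcome = "stopped" | "failed" | "timeout";

// Resolves with "timeout" if stop() has not settled within deadlineMs.
async function stopWithDeadline(svc: Stoppable, deadlineMs: number): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<Outcome>((resolve) => {
    timer = setTimeout(() => resolve("timeout"), deadlineMs);
  });
  // Map both success and rejection to an outcome so the race never throws.
  const stopped: Promise<Outcome> = svc.stop().then(
    (): Outcome => "stopped",
    (): Outcome => "failed",
  );
  try {
    return await Promise.race([stopped, deadline]);
  } finally {
    // Clear the timer so it does not keep the process alive after a fast stop.
    clearTimeout(timer);
  }
}

// Stops all services in parallel; total time is bounded by roughly one deadline.
async function teardownAll(services: Stoppable[], deadlineMs: number): Promise<Record<string, Outcome>> {
  const outcomes = await Promise.all(services.map((s) => stopWithDeadline(s, deadlineMs)));
  const result: Record<string, Outcome> = {};
  services.forEach((s, i) => {
    result[s.name] = outcomes[i];
  });
  return result;
}
```

An afterAll hook could then call something like `await teardownAll(nodes, 30_000)` and log the services reported as "timeout" instead of waiting on them indefinitely; the 30-second value is an arbitrary example.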

Test plan

  • error_regex: "Exceeded timeout of [0-9]+ ms for a hook" matches the observed Exceeded timeout of 1200000 ms for a hook line from the failure
  • CI continues to run src/composed/ha/e2e_ha_full.test.ts; the afterAll timeout now alerts palla on #aztec3-ci instead of failing the build

Created by claudebox · group: slackbot

AztecBot added the claudebox label (Owned by claudebox. it can push to this PR.) on May 23, 2026
alexghr marked this pull request as ready for review on May 23, 2026 07:52
alexghr enabled auto-merge (squash) on May 23, 2026 07:52
alexghr merged commit afbeb1a into merge-train/spartan on May 23, 2026
39 of 50 checks passed
alexghr deleted the cb/22301e7d0daa branch on May 23, 2026 07:52
PhilWindle pushed a commit that referenced this pull request May 24, 2026
Dequeued from merge-train/spartan again:
<http://ci.aztec-labs.com/136431da99834194>.

The HA full suite keeps failing under proposer pipelining with shifting
symptoms. In this run the dashboard log shows recurring
`validator:proposal-handler Timed out waiting for block with archive
matching checkpoint proposal` warnings (slot 98, 115, …) and an `Error
building checkpoint at slot 127: already proposed block for slot 127
index 0` on HA-4 — i.e. the 5 HA peers race on the same proposal. The
bundled #23539 (parallel peer teardown) and #23524 (afterAll hook
timeout) entries did not catch this run because jest's per-test summary
was not reached within the dashboard log capture.

This PR adds a broad regex-only entry under `.test_patterns.yml` to flag
any failure of `yarn-project/end-to-end/scripts/run_test.sh ha
src/composed/ha/e2e_ha_full.test.ts` as a flake. Owner: @PaLLa, matching
the existing pipelining-flavoured entries for this suite.

The intent is to unblock the merge queue while the HA pipelining
stabilisation work continues; narrow the regex (or add a real fix) once
the failure modes settle down.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/d394ef6145e749ff) ·
group: `slackbot`*
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
refactor(p2p): merge FastTxCollection into TxCollection with sequential
pipeline (AztecProtocol#23245)
refactor(publisher): bundle-level simulate; drop per-action enqueue sims
(AztecProtocol#23165)
refactor(stdlib): remove deprecated RevertCode/TxExecutionResult aliases
(AztecProtocol#23249)
test(e2e): fix race in 'proposer invalidates multiple checkpoints'
(AztecProtocol#23259)
fix: clean up old jobs regardless of pending status (AztecProtocol#23260)
refactor(p2p): remove unused sendBatchRequest (AztecProtocol#23273)
chore(p2p): remove proposal_tx_collector leftovers (AztecProtocol#23276)
feat: slash truncated checkpoint proposals (AztecProtocol#23250)
refactor: remove unused map in attestation pool (AztecProtocol#23284)
chore(p2p): assert last block in checkpoint proposal is correct (AztecProtocol#23274)
refactor(l1-tx-utils): use DateProvider for fail-fast timeout check
(AztecProtocol#23257)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23277)
test(e2e): fix race in broadcasted_invalid_block_proposal_slash under
pipelining (AztecProtocol#23302)
fix(archiver): atomic getter for L2 tips (AztecProtocol#23295)
fix(sequencer): use targetSlot in tryVoteWhenEscapeHatchOpen under
pipelining (AztecProtocol#23296)
fix(world-state): make fork close idempotent for pruned forks (AztecProtocol#23298)
test(e2e): migrate passing tests to proposer pipelining (AztecProtocol#23275)
chore: update dashboard (AztecProtocol#23312)
chore: Revert "feat(sandbox): support proposer pipelining in local
network" (AztecProtocol#23313)
test: slash on bad attestation (AztecProtocol#23184)
feat(slasher): per-slot data-withholding watcher (A-523, A-525) (AztecProtocol#23116)
test(e2e): enable pipelining on e2e debug trace (AztecProtocol#23301)
test(e2e): enable pipelining on l1-to-l2 test (AztecProtocol#23300)
test(e2e): switch fee_settings to organic fee bumps under pipelining
(AztecProtocol#23303)
fix(ci): retry sqlite3mc-wasm download on transient DNS/TLS failures
(AztecProtocol#23333)
test(e2e): wait for real oracle rotation in fee_settings inflate helper
(AztecProtocol#23334)
test(e2e): anchor e2e_amm PXE to checkpointed tip under pipelining
(AztecProtocol#23336)
fix(spartan-bench): tolerate older node images in SlasherConfig schema
(AztecProtocol#23351)
fix: interrupt prover jobs in stop (AztecProtocol#23358)
test(e2e): enable pipelining on bot, fees, and avm simulator tests
(AztecProtocol#23329)
feat(sentinel): end-of-epoch evaluation with re-execution outcomes
(AztecProtocol#23286)
feat: slash for invalid checkpoint proposals (AztecProtocol#23270)
fix: fork closure in epoch proving jobs (AztecProtocol#23390)
fix(slasher): anchor watcher scans at archiver synced L2 slot (AztecProtocol#23394)
fix: avoid npm uplink for aztec-up local publishes (AztecProtocol#23396)
test(e2e): ignore benign 'Insufficient valid txs' block-build-failed in
epochs tests (AztecProtocol#23424)
chore: refactor weekly proving test wait (AztecProtocol#23395)
refactor: add fifo set (AztecProtocol#23271)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23327)
fix(p2p): validate BLOCK_TXS in BatchTxRequester (AztecProtocol#23371)
chore(p2p): simplify IBatchRequestTxValidator (AztecProtocol#23373)
feat(sequencer): AutomineSequencer for single-sequencer e2e tests
(AztecProtocol#23354)
fix(prover): wait for previous epoch to be proven (AztecProtocol#23458)
chore: collocate provers (AztecProtocol#23439)
chore: rm staging-ignition (AztecProtocol#23440)
chore: rm unused networks (AztecProtocol#23441)
test(e2e): migrate block_building, multi_validator_node,
publisher_funding, invalid_checkpoint_proposal to pipelining (AztecProtocol#23414)
fix(archiver): reconcile local blocks with L1 checkpoints by block
number (AztecProtocol#23461)
feat: Updated slash conditions on block proposals (AztecProtocol#23466)
test(e2e): migrate HA full test to pipelining (AztecProtocol#23463)
chore: update resource profiles (AztecProtocol#23442)
chore: update debug log levels (AztecProtocol#23456)
test: fix flaky sentinel_status_slash by asserting the fault on the
checkpoint slot (AztecProtocol#23483)
feat(slasher): slash checkpoint equivocation between P2P and L1 (A-980)
(AztecProtocol#23436)
refactor(slasher): rename ATTESTED_DESCENDANT_OF_INVALID ->
PROPOSED_DESCENDANT_OF_CHECKPOINT_WITH_INVALID_ATTESTATIONS (AztecProtocol#23468)
fix: reject block proposals in poisoned slots (AztecProtocol#23411)
fix: retry nargo dep + solc downloads to survive transient DNS drops
(AztecProtocol#23490)
fix: enrich json-rpc tracing (AztecProtocol#23412)
feat: add trace export controls (AztecProtocol#23413)
test(e2e): assert no equivocation offenses in HA full test (AztecProtocol#23496)
test: cover invalid checkpoint proposal slashing (AztecProtocol#23503)
test(e2e): migrate more e2e suites to proposer pipelining (AztecProtocol#23482)
test: flag e2e_slashing_attested_invalid_proposal as flake under
pipelining (AztecProtocol#23501)
test: flag e2e_p2p_duplicate_proposal_slash as flake under pipelining
(AztecProtocol#23515)
test(e2e): require cross-observer agreement on sentinel fault slot
(AztecProtocol#23513)
test: flag e2e_ha_full afterAll hook timeout as flake under pipelining
(AztecProtocol#23524)
fix(e2e): propagate l1ContractsArgs into node config so archiver matches
L1 (AztecProtocol#23514)
test: flag e2e_multi_validator_node_key_store P2P tx-dropped failure as
flake (AztecProtocol#23528)
test(cheat-codes): retry warpL2TimeAtLeastTo in-current-slot test on L1
race (AztecProtocol#23533)
test(e2e_ha_full): parallel HA peer node teardown with per-node deadline
(AztecProtocol#23539)
test: flag e2e_ha_full as flake under HA pipelining (AztecProtocol#23541)
test(ci): skip e2e_ha_full entirely on merge-train/spartan (AztecProtocol#23542)
test(ci): skip e2e_multi_validator_node_key_store entirely on
merge-train/spartan (AztecProtocol#23544)
END_COMMIT_OVERRIDE
Labels

ci-skip claudebox Owned by claudebox. it can push to this PR.
