fix(e2e): drop removed enforceTimeTable option from optimistic proving test #23976
Merged
AztecBot merged 3 commits intoJun 10, 2026
spalladino approved these changes on Jun 9, 2026
…g test PR #23821 made timetable enforcement unconditional and removed the enforceTimeTable option from EpochsTestOpts, but epochs_optimistic_proving.parallel.test.ts still passed it at six call sites, breaking the yarn-project tsgo type-check on merge-train/spartan-v5. Each site already sets a concrete blockDurationMs, so removing the now default-enforced option is behavior-preserving.
80c489a to
c109cfa
…des' Temporarily skip the HA work-distribution test, which fails under the always-enforced timetable from #23821 (sequencer misses slots: BlockOrCheckpointSlotExpiredError / no_blocks_built / Fork not found). To be re-enabled after the HA block-building interaction is fixed.
Skip the whole HA Full Setup suite rather than the single test: it fails under the always-enforced timetable from #23821 (sequencer misses slots: BlockOrCheckpointSlotExpiredError / no_blocks_built / Fork not found). To be re-enabled after the HA block-building interaction is fixed.
PhilWindle pushed a commit that referenced this pull request on Jun 11, 2026
…g anvil (#23979)

Fixes the flaky HA full suite (`e2e_ha_full`) seen in http://ci.aztec-labs.com/8e1e980c4886df0d, where "should distribute work across multiple HA nodes" timed out awaiting a trigger tx. Also re-enables the suite, which #23976 had skipped.

## Root cause

The HA compose suite was the only block-building suite running against an L1 with no self-advancing clock. Its anvil container ran in automine with no `--block-time`, and being external, it was excluded from the `TestDateProvider` sync that locally-spawned anvils get. L1 chain time only moved when something mined, while the shared sequencer clock free-ran.

#23821 removed the `AnvilTestWatcher` that used to couple the two clocks in this mode and replaced it with per-iteration nudges in the test (clock warp + blind `mine(8)`). Two consequences, both visible in the failed run's logs:

- The `mine(8)` overshoot put L1 ~1.5 slots ahead of the test clock, so each iteration's first propose raced its slot boundary and was silently dropped, followed by a prune that destroyed the pipelined builders' forks (`Fork not found` on all surviving nodes). This race was lost in passing runs too.
- Recovery then required the proposers' archiver-sync gate to clear, but the gate's deadline runs on the free-running test clock while nothing mines L1 during the test's `waitForTx`: `Archiver did not sync L1 past slot 109 before slot 110 expired, discarding pipelined work`, repeated until the jest timeout. Whether a run passed or failed came down to seconds of margin on this gate.

## Fix

Stop emulating L1 time in the test and run the suite in the same regime as every other block-building e2e (e.g. `e2e_epochs`):

- Drop the anvil container and `ETHEREUM_HOSTS` from the HA compose file. With no external L1 configured, `setup()` spawns anvil in-proc with interval mining (`--block-time = ethereumSlotDuration`) and keeps the `TestDateProvider` snapped to L1 block timestamps via the existing stdout listener. The sibling web3signer compose suite already works this way.
- Add `automineL1Setup: true` so L1 contract deployment runs under temporary automine before interval mining starts.
- Delete all time scaffolding from the test (clock warps, cheat-mining heartbeats, archiver sync nudges). Tests submit a tx and wait, in real time.

No assertions change. No production code changes: with a self-advancing L1, the sequencer and publisher behave exactly as on a real network.

## Parallelization

The suite file is renamed to `e2e_ha_full.parallel.test.ts`, so CI runs each of its 8 tests as an isolated job in its own compose stack instead of one 15+ minute serial job:

- `bootstrap.sh` expands the HA suite per test name (same mechanism as the existing `.parallel` simple tests).
- `run_test.sh` forwards the test name into the compose stack and namespaces the docker compose project per test so concurrent jobs on one host don't collide.
- `sendTriggerTx` now starts the HA sequencers idempotently, since under per-test isolation the governance/reload/distribute tests run without the first test (previously the only caller of `startHASequencers`).
- Three clock-skew test titles contained parentheses, which jest's `--testNamePattern` interprets as regex groups (the filter would silently match nothing); they are retitled.

## Teardown fix (follow-up to the first CI round)

The first CI round passed every test body but three jobs (produce-blocks, governance, reload) hung in `afterAll` until the job timeout. Two compounding causes, both fixed here:

- `afterAll` reset the shared `TestDateProvider` *before* stopping nodes. The reset rewinds the clock from chain time to wall time (minutes apart after the automine deploy burst), so vote submissions armed against the rewound clock pushed sequencer stops out by that gap. The old 30s abandon-race then gave up, and the abandoned nodes outlived the jest environment, keeping the worker alive until the CI timeout (jest runs without `forceExit`). `afterAll` now stops sequencers first, awaits every node stop fully, and resets the clock last. These three jobs are the ones whose tests end with sequencers still running; the distribute test (which stops nodes in-test, before any reset) passed for the same reason.
- Ports #23990 from `merge-train/spartan` (not previously on the v5 line): `CheckpointProposalJob.interrupt()` now propagates to the publisher, cancelling the `sendRequestsAt` slot-deadline sleep on sequencer stop, so a pending vote submission can never block shutdown. The original PR's `e2e_ha_full` teardown changes are superseded by the rework above and were not ported.

## Verification

- Three full local runs of the suite via `run_test.sh ha` (all 8 tests each): green in 255s / 254s / 268s of jest time (the old warp-based suite ran 10+ minutes), with zero occurrences of the old failure signatures (`Fork not found`, `Archiver did not sync`, `discarding pipelined work`). Passing runs of the old code showed 12+ `Fork not found` errors even when green.
- One per-test CI-style run (`run_test.sh ha <file> "should distribute work across multiple HA nodes"`): the originally flaky test passes standalone in its own compose stack (7 skipped, 1 passed), exercising the full `TEST_NAME` plumbing.
- `yarn build`, `yarn format`, `yarn lint` clean; `sequencer-client` unit tests pass (back to the pre-change suite after the revert).
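The parentheses problem mentioned under Parallelization can be reproduced outside jest. The following is a minimal sketch, assuming the test-name filter is compiled as a case-insensitive regular expression; the test title used here is hypothetical, since the real clock-skew titles are not shown in this commit message.

```typescript
// Hypothetical test title: the real clock-skew titles are not shown in the PR.
const title = "handles clock skew (node ahead)";

// Assumption: the runner compiles the name filter as a case-insensitive regex.
function matchesPattern(pattern: string, testName: string): boolean {
  return new RegExp(pattern, "i").test(testName);
}

// Escape regex metacharacters so a pattern matches its text literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped, "(node ahead)" is a capture group, so the pattern looks for
// "handles clock skew node ahead" and never matches the literal parentheses.
const rawMatch = matchesPattern(title, title);

// Escaping the pattern restores a literal match.
const escapedMatch = matchesPattern(escapeRegExp(title), title);

// Retitling without parentheses (the route this commit takes) also matches.
const retitled = "handles clock skew with node ahead";
const retitledMatch = matchesPattern(retitled, retitled);

console.log({ rawMatch, escapedMatch, retitledMatch });
```

Retitling avoids having to escape the name at every layer that forwards it (shell script, compose environment, jest), which is presumably why the titles were changed rather than the filter.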
## Problem

CI on `merge-train/spartan-v5` (commit 609014a, log) failed in the `yarn-project` build at the `yarn tsgo -b --emitDeclarationOnly` step, with the same error also at lines 366, 473, 558, 646, 780.

## Root cause

PR #23821 (always enforce timetable with concrete block duration) made timetable enforcement unconditional and removed the `enforceTimeTable` option from `EpochsTestOpts` / `SetupOptions`, deleting ~30 `enforceTimeTable: true` call sites. `epochs_optimistic_proving.parallel.test.ts` landed on the v5 line separately and still passed `enforceTimeTable: true` at six sites, so it no longer type-checks.

## Fix

- Remove the six `enforceTimeTable: true` properties. Each call site already sets a concrete `blockDurationMs: 8000`, so the change is behavior-preserving: it is the same deletion the PR applied to every other e2e test. Verified in CI: `yarn-project` now type-checks and `epochs_optimistic_proving.parallel.test.ts` passes.
- `it.skip` the HA test `should distribute work across multiple HA nodes` in `composed/ha/e2e_ha_full.test.ts`, which fails under the always-enforced timetable (sequencer misses slots: `BlockOrCheckpointSlotExpiredError` / `no_blocks_built` / `Fork not found`). Skipped at Santiago's request, to be re-enabled after the HA block-building interaction with refactor(sequencer)!: always enforce timetable with concrete block duration #23821 is fixed.
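To illustrate why a stale `enforceTimeTable: true` breaks the type-check, here is a minimal sketch. `EpochsTestOptsSketch` and `describeSetup` are hypothetical stand-ins, not the real `EpochsTestOpts` or setup code; the only detail taken from this PR is that the option no longer exists and that call sites set `blockDurationMs: 8000`.

```typescript
// Hypothetical stand-in for the options type after #23821:
// enforceTimeTable has been removed, enforcement is unconditional.
interface EpochsTestOptsSketch {
  blockDurationMs: number;
}

// Hypothetical helper that stands in for the real test setup.
function describeSetup(opts: EpochsTestOptsSketch): string {
  return `timetable enforced, blockDurationMs=${opts.blockDurationMs}`;
}

// Before the fix: the object literal still carries the removed option, which
// TypeScript rejects as an excess property. The directive below suppresses
// that error so the sketch compiles.
// @ts-expect-error enforceTimeTable is not a known property of the options type
const before = describeSetup({ blockDurationMs: 8000, enforceTimeTable: true });

// After the fix: the property is simply deleted.
const after = describeSetup({ blockDurationMs: 8000 });

// Both calls configure the same thing, which is why the deletion is
// behavior-preserving in this sketch.
console.log(before === after);
```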