fix(e2e): propagate l1ContractsArgs into node config so archiver matches L1 #23514
Merged
spalladino approved these changes on May 23, 2026
In `setup()`, `opts.l1ContractsArgs` lands on the deployed rollup contract but
never makes it back into the node config that the archiver factory reads. The
archiver's `l1Constants` therefore uses the default `aztecEpochDuration` (32)
even when a test explicitly deploys the rollup with a smaller value.
This breaks any test that overrides `aztecEpochDuration` via `l1ContractsArgs`
and relies on `EpochTestSettler` to keep the chain alive under the resulting
tight `aztecProofSubmissionEpochs` window. `EpochMonitor.work()` reads the
contract's `epochDuration=4` and decides `epochToProve=0` for block 1 at slot
2; the archiver's `isEpochComplete(0)`, however, reads its own
`epochDuration=32`, computes `endSlot=31`, and returns false until L1 wall-time
crosses slot 31. The settler stays silent, no checkpoint is ever marked as
proven, and the chain prunes when the 144s deadline passes mid-setup.
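The disagreement described above comes down to slot-range arithmetic. The following is a minimal sketch, assuming epochs are contiguous runs of `epochDuration` slots starting at slot 0; `slotRangeForEpoch`, `epochAtSlot`, `contractView`, and `archiverView` are hypothetical names used for illustration, not the actual archiver or stdlib helpers.

```typescript
// Hypothetical, simplified stand-ins; not the real archiver or stdlib code.
interface EpochConstants {
  epochDuration: number; // number of slots per epoch
}

/** First and last slot (inclusive) of an epoch. */
function slotRangeForEpoch(epoch: number, constants: EpochConstants): [number, number] {
  const startSlot = epoch * constants.epochDuration;
  return [startSlot, startSlot + constants.epochDuration - 1];
}

/** Epoch that contains a given slot. */
function epochAtSlot(slot: number, constants: EpochConstants): number {
  return Math.floor(slot / constants.epochDuration);
}

const contractView: EpochConstants = { epochDuration: 4 }; // value the rollup was deployed with
const archiverView: EpochConstants = { epochDuration: 32 }; // default left in the node config

// Both views place slot 2 in epoch 0 ...
const epochForSlot2 = epochAtSlot(2, contractView); // 0

// ... but they disagree on where epoch 0 ends.
const contractRange = slotRangeForEpoch(0, contractView); // [0, 3]
const archiverRange = slotRangeForEpoch(0, archiverView); // [0, 31]

console.log({ epochForSlot2, contractRange, archiverRange });
```

With the two views out of sync, the monitor considers epoch 0 provable once slot 3 has passed, while the archiver keeps waiting for slot 31.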
Observed in `e2e_cross_chain_messaging/l1_to_l2.parallel.test.ts` "non-
registered portal, public, repeatedly" on merge-train/spartan (CI
1779463097551535) — Aztec CI failure, 8 checkpoints rolled back to block 0,
in-flight L1→L2 tx dropped, `advanceBlock` threw "Failed to advance block 8".
Confirmed by instrumenting `archiver.isEpochComplete`:
```
isEpochComplete[probe]: timestamp-fail {
epochNumber: 0,
checkpointedSlot: 0, # archiver's l2 tip was at slot 2 — the 'checkpointed'
# block-data lookup with the wrong epochDuration ended
# up reading the synthetic genesis block (slot 0).
endSlot: 31, # << wrong: rollup contract says endSlot=3
l1Timestamp: 1779484115,
endTimestamp: 1779484479,
diff: 363,
}
```
After this fix, `EpochMonitor`'s "Epoch X is ready to be proven" fires every
epoch and `epoch-settler` writes `Settling epoch N with blocks ...` /
`Proven tip moved: A -> B` well before the proof window expires. The failing
test passes locally with the diagnostic still in place (`PASS` in 441s, four
`Settling epoch` lines for epochs 0..3 before the first `advanceBlock` is
called).
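A minimal sketch of the kind of propagation this fix performs, assuming trimmed-down config shapes. `NodeConfig`, `applyDefinedOverrides`, and the sample values are hypothetical and only illustrate the approach; the real change lives in `setup()` and operates on the full node config. Fields set to `undefined` are skipped, which is the detail the PR description calls out as necessary to avoid clobbering `dataDirectory`.

```typescript
// Hypothetical, trimmed-down shape; the real node config has many more fields.
interface NodeConfig {
  aztecEpochDuration: number;
  aztecProofSubmissionEpochs: number;
  dataDirectory?: string | undefined;
}

/** Copies every defined field of `overrides` onto `config`; fields set to `undefined` are skipped. */
function applyDefinedOverrides<T extends object>(
  config: T,
  overrides: { [K in keyof T]?: T[K] | undefined },
): T {
  for (const key of Object.keys(overrides) as (keyof T)[]) {
    const value = overrides[key];
    if (value !== undefined) {
      config[key] = value as T[keyof T];
    }
  }
  return config;
}

// Overrides as a test might pass them; `dataDirectory` is present but undefined.
const l1ContractsArgs = {
  aztecEpochDuration: 4,
  aztecProofSubmissionEpochs: 2,
  dataDirectory: undefined,
};

const makeConfig = (): NodeConfig => ({
  aztecEpochDuration: 32,
  aztecProofSubmissionEpochs: 1,
  dataDirectory: "/tmp/e2e-data",
});

// A blind Object.assign copies the undefined value and wipes dataDirectory.
const clobbered = Object.assign(makeConfig(), l1ContractsArgs);

// The field-by-field copy keeps dataDirectory and still applies the overrides.
const patched = applyDefinedOverrides(makeConfig(), l1ContractsArgs);

console.log({ clobbered, patched });
```

The contrast with `Object.assign` is the point of the sketch: both calls apply the epoch overrides, but only the filtered copy leaves previously assigned fields intact.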
Diagnostic patch used to identify the gate:
https://gist.github.com/AztecBot/a51adb514b9e374fb993212cd8a4beca
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request on Jun 4, 2026
Why
`e2e_cross_chain_messaging/l1_to_l2.parallel.test.ts` "consumed from public repeatedly" failed in the merge-train (http://ci.aztec-labs.com/1779463097551535) because the chain pruned all 8 pending checkpoints to block 0 at the `aztecProofSubmissionEpochs=2` deadline (L1 genesis + 144 s), dropping the in-flight L1→L2 tx and the subsequent `advanceBlock` noop. The `EpochTestSettler` safety net never fired: 124 s of polling, zero handler invocations.

Built yarn-project locally (after working around a sandbox proxy/DNS quirk that was blocking the `noir` build), ran the failing test under `LOG_LEVEL='info; trace:prover-node:epoch-monitor'`, instrumented `archiver.isEpochComplete`, and found the actual root cause.

What's actually broken
In `setup()`, the per-test `opts.l1ContractsArgs` is spread into the `deployAztecL1Contracts` call (so it lands on the deployed rollup) but never written back to the node config. The archiver factory then constructs its `l1Constants` from `config.aztecEpochDuration` (default 32) even though the rollup was deployed with the override (4).

That puts the EpochMonitor and the archiver on different `epochDuration`s:
- `EpochTestSettler`'s `EpochMonitor` reads `epochDuration` from `RollupCheatCodes.getConfig()` (rollup contract): `4`. It computes `epochToProve=0` for block 1 at slot 2, which is correct.
- `archiver.isEpochComplete(0)` uses its own l1Constants `epochDuration=32`. `getSlotRangeForEpoch(0)` → `endSlot=31`. The checkpointed L2 tip's slot is `2` (or even `0` when the bad endSlot makes the resolver fall through to the genesis branch), so `slot < endSlot`. It then falls through to the L1-timestamp check, which won't be satisfied for another 363 s.

`isEpochComplete(0)` permanently returns `false` for this run, `handleEpochReadyToProve` never fires, `markAsProven` is never called, and the proof window expires before any test code reaches `advanceBlock` (which is the only other path that calls `markAsProven` for this test).

Smoking-gun probe output (with temporary instrumentation):
What's in the PR
A single small change in `yarn-project/end-to-end/src/fixtures/setup.ts`: after the L1 deploy and `Object.assign(config, l1ContractAddresses)`, copy every defined field from `opts.l1ContractsArgs` onto `config`. Now any field a test overrides via `l1ContractsArgs` (`aztecEpochDuration: 4`, `aztecProofSubmissionEpochs: 2`, etc.) is reflected in the node config that the archiver later reads.

The `undefined`-filtering matters: `P2PNetworkTest` builds `deployL1ContractsArgs` by spreading a partial `AztecNodeConfig` (`...initialValidatorConfig` in `p2p_network.ts:133`), which leaves `dataDirectory` (and other unset node-config fields) at `undefined` inside the value. A blind `Object.assign(config, opts.l1ContractsArgs)` would then clobber the temp `dataDirectory` that `setup()` had already assigned earlier in this same function and crash `setupSharedBlobStorage`. The first revision of this PR had that bug; CI caught it in `e2e_p2p/add_rollup.test.ts` with `TypeError: The "path" argument must be of type string. Received undefined` at `setup.ts:110` → `setupSharedBlobStorage` (`setup.ts:505`). Iterating field-by-field and skipping `undefined` fixes that.

Verification
Both tests pass locally with the refined fix.
- `l1_to_l2.parallel.test.ts` (original failure), where the settler fires on schedule, well before the first `advanceBlock` call:
- `e2e_p2p/add_rollup.test.ts` (the regression CI surfaced on the prior PR revision):

Independent before/after archiver probe also confirms the divergence is gone:

| | `[archiver-factory probe]` | … `Advanced outbox to epoch N`) |
| --- | --- | --- |
| Before (… `setup.ts`) | `config.aztecEpochDuration=32 contract.epochDuration=4` | `advanceBlock → watcher.markAsProven()` |
| After | `config.aztecEpochDuration=4 contract.epochDuration=4` | `epoch 0`, `epoch 1`, … fire before any test-side `markAsProven` runs |

Reconsidering the original PR
The first revision of this PR rewrote `EpochTestSettler` to poll `rollup.getTips()` directly and bypass `EpochMonitor` + the L2BlockSource entirely. That works, and is a defensible robustness change in its own right, but it doesn't address the underlying bug, which is a config drift between the L1 deployment args and the node config in `setup()`. With the root-cause fix in place the existing settler functions correctly, so the contract-direct rewrite isn't needed.

Worth doing as a separate follow-up if we want to harden the archiver itself: `yarn-project/archiver/src/factory.ts` reads `epochDuration` / `slotDuration` from `config` even though `proofSubmissionEpochs`, `l1GenesisTime`, etc. are already fetched from the rollup contract. Switching those two over removes the config-drift class of bug entirely. The PR keeps that scope out since the test-fixture fix is enough to close the bug, and the production node has no path that lets the two get out of sync (env-var driven deploy + env-var driven node config).

Other affected tests
A grep for `l1ContractsArgs` shows only a handful of tests pass per-test L1 overrides (everything else relies on `getL1ContractsConfigEnvVars()` defaults). Of those, `e2e_cross_chain_messaging/l1_to_l2.parallel.test.ts` is the canary because it combines a tight `aztecProofSubmissionEpochs=2` / `aztecEpochDuration=4` window with a 140+ s setup. Other tests either pad the proof window high (`aztecProofSubmissionEpochs: 1024` / `640`) or rely on `AUTOMINE_E2E_OPTS` + manual `cheatCodes.rollup.markAsProven()`. With this fix every test that overrides L1 config via `l1ContractsArgs` now has a consistent node config.

Diagnostic gist
Patch used to identify the gate: https://gist.github.com/AztecBot/a51adb514b9e374fb993212cd8a4beca