fix(archiver): reconcile local blocks with L1 checkpoints by block number #23461

Merged
spalladino merged 1 commit into merge-train/spartan from spl/archiver-prune-mismatching-by-block-number on May 21, 2026

@spalladino

Motivation

When the sequencer locally proposed block N at slot S1 and L1 mined a different block N at slot S2 (same block number, different slot), the archiver's reconciliation in pruneMismatchingLocalBlocks joined local and checkpoint blocks by slot and missed the conflict. The local block was never pruned, and the subsequent re-apply of L1's block N threw "Contract instance ... already exists, cannot add again at block N". The archiver's serial queue then retried in a loop and wedged the data store. This is the cause of the e2e flake at yarn-project/end-to-end/src/composed/ha/e2e_ha_full.test.ts:693.
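To make the failure mode concrete, here is a minimal sketch of the two comparison strategies. It is not the archiver's code: the BlockRef shape and both function names are hypothetical, and blocks are reduced to a number, a slot, and a hash.

```typescript
// Hypothetical, simplified block shape; the real archiver types differ.
interface BlockRef {
  blockNumber: number;
  slot: number;
  hash: string;
}

// Model of the old behavior: a local block is only compared against the
// checkpoint if it shares a slot with one of the checkpoint's blocks.
function conflictsScopedBySlot(local: BlockRef[], checkpoint: BlockRef[]): BlockRef[] {
  const slots = new Set<number>(checkpoint.map(b => b.slot));
  const hashByNumber = new Map<number, string>();
  checkpoint.forEach(b => hashByNumber.set(b.blockNumber, b.hash));
  return local.filter(l => {
    const published = hashByNumber.get(l.blockNumber);
    return slots.has(l.slot) && published !== undefined && published !== l.hash;
  });
}

// Model of the fixed behavior: the slot is ignored, and any local block whose
// number matches a checkpoint block with a different hash is a conflict.
function conflictsByBlockNumber(local: BlockRef[], checkpoint: BlockRef[]): BlockRef[] {
  const hashByNumber = new Map<number, string>();
  checkpoint.forEach(b => hashByNumber.set(b.blockNumber, b.hash));
  return local.filter(l => {
    const published = hashByNumber.get(l.blockNumber);
    return published !== undefined && published !== l.hash;
  });
}
```

With a local block 5 at slot 10 and an L1 block 5 at slot 11, the slot-scoped version reports nothing, while the number-keyed version flags block 5.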

Approach

pruneMismatchingLocalBlocks now keys conflict detection on blockNumber only, so a same-number-different-slot block is correctly detected as a conflict and pruned. The per-checkpoint trailing prune (which handles "local has extra blocks at the same slot as the published checkpoint") stays scoped by slot so speculative blocks at later slots that belong to a pending proposed checkpoint are preserved — this keeps the pipelining model in promoteProposedToCheckpointed intact. After any prune, pending proposed checkpoints from the offending checkpoint number onwards are evicted so they do not dangle while referencing removed blocks.
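The two rules described above can be sketched as a single classification pass. This is an illustration under stated assumptions, not the actual pruneMismatchingLocalBlocks implementation: the types, the findPruneCandidates name, and the hash-based equality check are all hypothetical.

```typescript
// Hypothetical, simplified shapes; the real archiver types differ.
interface LocalBlock {
  blockNumber: number;
  slot: number;
  hash: string;
}

interface PublishedCheckpoint {
  slot: number;
  blocks: LocalBlock[];
}

interface PruneCandidates {
  conflicting: number[]; // same block number as a published block, different hash
  trailing: number[]; // extra local blocks past the checkpoint, at the checkpoint's slot
}

function findPruneCandidates(local: LocalBlock[], checkpoint: PublishedCheckpoint): PruneCandidates {
  const publishedHash = new Map<number, string>();
  let lastPublished = -Infinity;
  for (const b of checkpoint.blocks) {
    publishedHash.set(b.blockNumber, b.hash);
    lastPublished = Math.max(lastPublished, b.blockNumber);
  }

  const result: PruneCandidates = { conflicting: [], trailing: [] };
  for (const l of local) {
    const hash = publishedHash.get(l.blockNumber);
    if (hash !== undefined) {
      // Conflict detection is keyed on block number only; the slot is ignored.
      if (hash !== l.hash) {
        result.conflicting.push(l.blockNumber);
      }
    } else if (l.blockNumber > lastPublished && l.slot === checkpoint.slot) {
      // Trailing prune stays scoped by slot, so speculative blocks at later
      // slots are left alone by this rule.
      result.trailing.push(l.blockNumber);
    }
  }
  return result;
}
```

In this sketch a speculative block at a later slot never lands in the trailing list, which mirrors the stated goal of preserving blocks that may belong to a pending proposed checkpoint. What the archiver then removes once a conflict is found is outside the scope of the sketch.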

Changes

  • archiver: pruneMismatchingLocalBlocks joins by blockNumber only; trailing slot-scoped prune unchanged; new evictProposedCheckpointsForPrunedBlocks helper clears stale pending proposed checkpoints after a prune.
  • archiver (tests): four new regression tests in data_store_updater.test.ts covering the e2e_ha scenario, the prune-by-number path without contract data, pipelining preservation, and eviction of higher-numbered proposed checkpoints chaining off pruned blocks.
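As a rough illustration of the eviction step, the sketch below drops every pending proposed checkpoint from a given checkpoint number onwards. It is not the evictProposedCheckpointsForPrunedBlocks helper itself: the PendingProposedCheckpoint shape, the Map-based store, and the evictProposedFrom name are assumptions made for the example.

```typescript
// Hypothetical shape of a pending proposed checkpoint; the real type differs.
interface PendingProposedCheckpoint {
  checkpointNumber: number;
  blockNumbers: number[];
}

// Removes all pending proposed checkpoints numbered fromCheckpointNumber or
// higher, and returns the evicted checkpoint numbers in ascending order.
function evictProposedFrom(
  pending: Map<number, PendingProposedCheckpoint>,
  fromCheckpointNumber: number,
): number[] {
  const evicted: number[] = [];
  // Snapshot the keys first so entries can be deleted while looping.
  for (const checkpointNumber of Array.from(pending.keys())) {
    if (checkpointNumber >= fromCheckpointNumber) {
      pending.delete(checkpointNumber);
      evicted.push(checkpointNumber);
    }
  }
  return evicted.sort((a, b) => a - b);
}
```

Evicting by checkpoint number rather than inspecting each entry's blocks keeps the sketch simple: anything at or after the offending checkpoint is assumed to chain off the pruned blocks.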

…mber, not slot

When the sequencer locally proposed block N at slot S1 and L1 mined a
different block N at slot S2, pruneMismatchingLocalBlocks filtered by
slot and missed the conflict, so the local block (and its contract data)
was never pruned. The subsequent re-apply of L1's block N then threw
"Contract instance ... already exists" and the archiver's serial queue
got stuck retrying.

Fixed by reconciling on blockNumber. The per-checkpoint trailing prune
remains scoped by slot so speculative blocks at later slots (which may
belong to a pending proposed checkpoint) are preserved. Also evict
pending proposed checkpoints that referenced pruned blocks so they do
not dangle after a conflict prune.

Regression tests added in data_store_updater.test.ts.
@spalladino force-pushed the spl/archiver-prune-mismatching-by-block-number branch from c14aaa6 to 50d22cd on May 21, 2026 10:28
@spalladino changed the title from "fix(archiver): reconcile local blocks with L1 checkpoints by block number, not slot" to "fix(archiver): reconcile local blocks with L1 checkpoints by block number" on May 21, 2026
@PhilWindle enabled auto-merge (squash) on May 21, 2026 10:48
@spalladino disabled auto-merge on May 21, 2026 15:59
@spalladino merged commit 65989b5 into merge-train/spartan on May 21, 2026
17 checks passed
@spalladino deleted the spl/archiver-prune-mismatching-by-block-number branch on May 21, 2026 15:59
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
refactor(p2p): merge FastTxCollection into TxCollection with sequential
pipeline (AztecProtocol#23245)
refactor(publisher): bundle-level simulate; drop per-action enqueue sims
(AztecProtocol#23165)
refactor(stdlib): remove deprecated RevertCode/TxExecutionResult aliases
(AztecProtocol#23249)
test(e2e): fix race in 'proposer invalidates multiple checkpoints'
(AztecProtocol#23259)
fix: clean up old jobs regardless of pending status (AztecProtocol#23260)
refactor(p2p): remove unused sendBatchRequest (AztecProtocol#23273)
chore(p2p): remove proposal_tx_collector leftovers (AztecProtocol#23276)
feat: slash truncated checkpoint proposals (AztecProtocol#23250)
refactor: remove unused map in attestation pool (AztecProtocol#23284)
chore(p2p): assert last block in checkpoint proposal is correct (AztecProtocol#23274)
refactor(l1-tx-utils): use DateProvider for fail-fast timeout check
(AztecProtocol#23257)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23277)
test(e2e): fix race in broadcasted_invalid_block_proposal_slash under
pipelining (AztecProtocol#23302)
fix(archiver): atomic getter for L2 tips (AztecProtocol#23295)
fix(sequencer): use targetSlot in tryVoteWhenEscapeHatchOpen under
pipelining (AztecProtocol#23296)
fix(world-state): make fork close idempotent for pruned forks (AztecProtocol#23298)
test(e2e): migrate passing tests to proposer pipelining (AztecProtocol#23275)
chore: update dashboard (AztecProtocol#23312)
chore: Revert "feat(sandbox): support proposer pipelining in local
network" (AztecProtocol#23313)
test: slash on bad attestation (AztecProtocol#23184)
feat(slasher): per-slot data-withholding watcher (A-523, A-525) (AztecProtocol#23116)
test(e2e): enable pipelining on e2e debug trace (AztecProtocol#23301)
test(e2e): enable pipelining on l1-to-l2 test (AztecProtocol#23300)
test(e2e): switch fee_settings to organic fee bumps under pipelining
(AztecProtocol#23303)
fix(ci): retry sqlite3mc-wasm download on transient DNS/TLS failures
(AztecProtocol#23333)
test(e2e): wait for real oracle rotation in fee_settings inflate helper
(AztecProtocol#23334)
test(e2e): anchor e2e_amm PXE to checkpointed tip under pipelining
(AztecProtocol#23336)
fix(spartan-bench): tolerate older node images in SlasherConfig schema
(AztecProtocol#23351)
fix: interrupt prover jobs in stop (AztecProtocol#23358)
test(e2e): enable pipelining on bot, fees, and avm simulator tests
(AztecProtocol#23329)
feat(sentinel): end-of-epoch evaluation with re-execution outcomes
(AztecProtocol#23286)
feat: slash for invalid checkpoint proposals (AztecProtocol#23270)
fix: fork closure in epoch proving jobs (AztecProtocol#23390)
fix(slasher): anchor watcher scans at archiver synced L2 slot (AztecProtocol#23394)
fix: avoid npm uplink for aztec-up local publishes (AztecProtocol#23396)
test(e2e): ignore benign 'Insufficient valid txs' block-build-failed in
epochs tests (AztecProtocol#23424)
chore: refactor weekly proving test wait (AztecProtocol#23395)
refactor: add fifo set (AztecProtocol#23271)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23327)
fix(p2p): validate BLOCK_TXS in BatchTxRequester (AztecProtocol#23371)
chore(p2p): simplify IBatchRequestTxValidator (AztecProtocol#23373)
feat(sequencer): AutomineSequencer for single-sequencer e2e tests
(AztecProtocol#23354)
fix(prover): wait for previous epoch to be proven (AztecProtocol#23458)
chore: collocate provers (AztecProtocol#23439)
chore: rm staging-ignition (AztecProtocol#23440)
chore: rm unused networks (AztecProtocol#23441)
test(e2e): migrate block_building, multi_validator_node,
publisher_funding, invalid_checkpoint_proposal to pipelining (AztecProtocol#23414)
fix(archiver): reconcile local blocks with L1 checkpoints by block
number (AztecProtocol#23461)
feat: Updated slash conditions on block proposals (AztecProtocol#23466)
test(e2e): migrate HA full test to pipelining (AztecProtocol#23463)
chore: update resource profiles (AztecProtocol#23442)
chore: update debug log levels (AztecProtocol#23456)
test: fix flaky sentinel_status_slash by asserting the fault on the
checkpoint slot (AztecProtocol#23483)
feat(slasher): slash checkpoint equivocation between P2P and L1 (A-980)
(AztecProtocol#23436)
refactor(slasher): rename ATTESTED_DESCENDANT_OF_INVALID ->
PROPOSED_DESCENDANT_OF_CHECKPOINT_WITH_INVALID_ATTESTATIONS (AztecProtocol#23468)
fix: reject block proposals in poisoned slots (AztecProtocol#23411)
fix: retry nargo dep + solc downloads to survive transient DNS drops
(AztecProtocol#23490)
fix: enrich json-rpc tracing (AztecProtocol#23412)
feat: add trace export controls (AztecProtocol#23413)
test(e2e): assert no equivocation offenses in HA full test (AztecProtocol#23496)
test: cover invalid checkpoint proposal slashing (AztecProtocol#23503)
test(e2e): migrate more e2e suites to proposer pipelining (AztecProtocol#23482)
test: flag e2e_slashing_attested_invalid_proposal as flake under
pipelining (AztecProtocol#23501)
test: flag e2e_p2p_duplicate_proposal_slash as flake under pipelining
(AztecProtocol#23515)
test(e2e): require cross-observer agreement on sentinel fault slot
(AztecProtocol#23513)
test: flag e2e_ha_full afterAll hook timeout as flake under pipelining
(AztecProtocol#23524)
fix(e2e): propagate l1ContractsArgs into node config so archiver matches
L1 (AztecProtocol#23514)
test: flag e2e_multi_validator_node_key_store P2P tx-dropped failure as
flake (AztecProtocol#23528)
test(cheat-codes): retry warpL2TimeAtLeastTo in-current-slot test on L1
race (AztecProtocol#23533)
test(e2e_ha_full): parallel HA peer node teardown with per-node deadline
(AztecProtocol#23539)
test: flag e2e_ha_full as flake under HA pipelining (AztecProtocol#23541)
test(ci): skip e2e_ha_full entirely on merge-train/spartan (AztecProtocol#23542)
test(ci): skip e2e_multi_validator_node_key_store entirely on
merge-train/spartan (AztecProtocol#23544)
END_COMMIT_OVERRIDE