fix(archiver): atomic getter for L2 tips#23295

Merged
PhilWindle merged 1 commit into
merge-train/spartan from
spl/fix-get-l2-tips
May 15, 2026

@spalladino spalladino commented May 14, 2026

Prevents the archiver from reporting invalid L2 tips by querying all chain tips within a single db transaction. Moves the responsibility of assembling the tip data to the block store itself to minimize the number of db queries. Clamps the proven and finalized tips so that even an incorrect L1 sync still yields finalized <= proven <= checkpointed, and adds explicit assertions that the tips are ordered.

Also adds a guard in the tips store that prevents deleting block hashes that a chain tip still references, instead of assuming that the finalized chain tip is always the oldest one.

This should catch errors where the block stream breaks due to a finalized chain tip running ahead of a proven chain tip.

Note that this PR does NOT enforce ordering at the L2Tips struct itself, since consumers (i.e. the ones that report the "local" chain tips) may break this contract (see A-1061). This PR is a simpler alternative to #22964. Fixes A-1018.
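
For illustration only, the clamping and ordering check described above could look roughly like the sketch below. It is not the code from this PR: `TipNumbers`, `clampTips`, and `assertTipsOrdered` are hypothetical names, and tips are reduced to bare block numbers, whereas the real tips also carry hashes and are read inside one db transaction.

```typescript
// Hypothetical, simplified shape: block numbers only, no hashes.
type TipNumbers = { checkpointed: number; proven: number; finalized: number };

/** Clamps proven and finalized so that finalized <= proven <= checkpointed holds. */
function clampTips(raw: TipNumbers): TipNumbers {
  // A proven tip can never be ahead of the checkpointed tip.
  const proven = Math.min(raw.proven, raw.checkpointed);
  // A finalized tip can never be ahead of the (already clamped) proven tip.
  const finalized = Math.min(raw.finalized, proven);
  return { checkpointed: raw.checkpointed, proven, finalized };
}

/** Throws if the tips violate finalized <= proven <= checkpointed. */
function assertTipsOrdered(tips: TipNumbers): void {
  const ordered = tips.finalized <= tips.proven && tips.proven <= tips.checkpointed;
  if (!ordered) {
    throw new Error(`L2 tips out of order: ${JSON.stringify(tips)}`);
  }
}

// Example: an incorrect L1 sync reports finalized ahead of proven ahead of checkpointed.
const clamped = clampTips({ checkpointed: 10, proven: 12, finalized: 15 });
assertTipsOrdered(clamped); // does not throw after clamping
```

Clamping before asserting means a bad L1 sync degrades to a conservative but consistent view, while the assertion still catches any ordering bug introduced elsewhere.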

@AztecBot

Flakey Tests

🤖 says: This CI run detected 1 test that failed but was tolerated due to a .test_patterns.yml entry.

FLAKED (c207c9d8c62bb353, http://ci.aztec-labs.com/c207c9d8c62bb353): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/multiple_validators_sentinel.parallel.test.ts "collects attestations for validators in proposer node when block is not published" (272s) (code: 0) group:e2e-p2p-epoch-flakes

@PhilWindle PhilWindle merged commit e0f2339 into merge-train/spartan May 15, 2026
17 checks passed
@PhilWindle PhilWindle deleted the spl/fix-get-l2-tips branch May 15, 2026 08:29
This was referenced May 15, 2026
AztecBot added a commit that referenced this pull request May 22, 2026
)

Backport of only the l2_tips_store_base.ts change from PR #23295 (the
rest of that PR restructures archiver tip loading and is intentionally
not included). v4-next does not yet have the `proposedCheckpoint` tag,
so that tip is dropped from the live-tip set; the remaining proposed /
checkpointed / proven tips still cap the deletion bound so the block
stream cannot orphan a live tip's block hash, block-to-checkpoint
mapping, or enclosing checkpoint when the finalized tip drifts ahead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
spypsy added a commit that referenced this pull request May 22, 2026
spypsy added a commit that referenced this pull request May 27, 2026
…deleting live tips (#23295) (#23505)

Backport to `v4` of the
`yarn-project/stdlib/src/block/l2_block_stream/l2_tips_store_base.ts`
changes from #23295, per @spalladino's note in [this
thread](https://aztec-labs.slack.com/archives/C0AU8BULZHC/p1779448199476059).
The rest of #23295 (`yarn-project/archiver/src/store/block_store.ts`,
`yarn-project/archiver/src/store/l2_tips_cache.ts`) restructures how the
archiver loads chain tips and is intentionally **not** included.

The `l2_tips_store_base.ts` file on `v4` and `v4-next` is identical
(same blob SHA), so the patch is the same in both branches.

## What this fixes

When `handleChainFinalized` ran, it unconditionally deleted block hashes
/ block-to-checkpoint mappings / checkpoints below the finalized block
number. The implicit assumption was that `finalized` is always the
oldest live tip — but we have hit bugs where that isn't the case.
Deleting data that a still-live tip references leaves subsequent
`getBlockId` / `getCheckpointId` lookups dangling and locks the block
stream into an error loop.

This patch caps every deletion bound at the lowest live tip (proposed /
checkpointed / proven) so finalized can drift ahead without orphaning
data belonging to another tip.
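
As a rough sketch of the capping rule described above (not the actual `handleChainFinalized` code): `cappedDeletionBound` and its plain-number arguments are hypothetical, and the real implementation works with `BlockNumber` / `CheckpointNumber` values obtained through `getTip`.

```typescript
// Live tips considered on v4; hypothetical constant name.
const LIVE_TAGS = ['proposed', 'checkpointed', 'proven'] as const;
type LiveTag = (typeof LIVE_TAGS)[number];

/**
 * Returns the block number below which data may be deleted.
 * The bound is the finalized block number, lowered to the lowest live tip
 * whenever finalized has drifted ahead of one.
 */
function cappedDeletionBound(finalized: number, tips: Partial<Record<LiveTag, number>>): number {
  let bound = finalized;
  for (const tag of LIVE_TAGS) {
    const tip = tips[tag];
    // Tips that are not set do not constrain the bound.
    if (tip !== undefined && tip < bound) {
      bound = tip;
    }
  }
  return bound;
}

// Finalized (20) has drifted ahead of proven (15): deletion stops at 15.
const driftedBound = cappedDeletionBound(20, { proposed: 30, checkpointed: 25, proven: 15 });
// Normal case: finalized (10) is the oldest tip, so it stays the bound.
const normalBound = cappedDeletionBound(10, { proposed: 30, checkpointed: 25, proven: 15 });
```

In this sketch the same capped bound would then be passed to each of the deletion helpers, so no tip's block hash or checkpoint is removed while that tip is still live.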

## Self-containment

The patch only touches `handleChainFinalized` inside `L2TipsStoreBase`.
It uses methods that already exist on the base class (`getTip`,
`getCheckpointNumberForBlock`, `deleteBlockHashesBefore`,
`deleteBlockToCheckpointBefore`, `deleteCheckpointsBefore`) and types
that are already imported in the file (`BlockNumber`,
`CheckpointNumber`). No other file in #23295 is required for this change
to compile or behave correctly.

## Difference from the original PR

`v4`'s `L2BlockTag` union does not yet include `'proposedCheckpoint'`
(that tag was added together with the archiver restructuring that this
backport excludes), so the live-tip set here is `['proposed',
'checkpointed', 'proven']` instead of the four-tip set used on `next`.
The behaviour — cap deletion bound at the lowest live tip — is
unchanged.

Original PR: #23295
@aminsammara aminsammara mentioned this pull request Jun 3, 2026
1 task
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
refactor(p2p): merge FastTxCollection into TxCollection with sequential
pipeline (AztecProtocol#23245)
refactor(publisher): bundle-level simulate; drop per-action enqueue sims
(AztecProtocol#23165)
refactor(stdlib): remove deprecated RevertCode/TxExecutionResult aliases
(AztecProtocol#23249)
test(e2e): fix race in 'proposer invalidates multiple checkpoints'
(AztecProtocol#23259)
fix: clean up old jobs regardless of pending status (AztecProtocol#23260)
refactor(p2p): remove unused sendBatchRequest (AztecProtocol#23273)
chore(p2p): remove proposal_tx_collector leftovers (AztecProtocol#23276)
feat: slash truncated checkpoint proposals (AztecProtocol#23250)
refactor: remove unused map in attestation pool (AztecProtocol#23284)
chore(p2p): assert last block in checkpoint proposal is correct (AztecProtocol#23274)
refactor(l1-tx-utils): use DateProvider for fail-fast timeout check
(AztecProtocol#23257)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23277)
test(e2e): fix race in broadcasted_invalid_block_proposal_slash under
pipelining (AztecProtocol#23302)
fix(archiver): atomic getter for L2 tips (AztecProtocol#23295)
fix(sequencer): use targetSlot in tryVoteWhenEscapeHatchOpen under
pipelining (AztecProtocol#23296)
fix(world-state): make fork close idempotent for pruned forks (AztecProtocol#23298)
test(e2e): migrate passing tests to proposer pipelining (AztecProtocol#23275)
chore: update dashboard (AztecProtocol#23312)
chore: Revert "feat(sandbox): support proposer pipelining in local
network" (AztecProtocol#23313)
test: slash on bad attestation (AztecProtocol#23184)
feat(slasher): per-slot data-withholding watcher (A-523, A-525) (AztecProtocol#23116)
test(e2e): enable pipelining on e2e debug trace (AztecProtocol#23301)
test(e2e): enable pipelining on l1-to-l2 test (AztecProtocol#23300)
test(e2e): switch fee_settings to organic fee bumps under pipelining
(AztecProtocol#23303)
fix(ci): retry sqlite3mc-wasm download on transient DNS/TLS failures
(AztecProtocol#23333)
test(e2e): wait for real oracle rotation in fee_settings inflate helper
(AztecProtocol#23334)
test(e2e): anchor e2e_amm PXE to checkpointed tip under pipelining
(AztecProtocol#23336)
fix(spartan-bench): tolerate older node images in SlasherConfig schema
(AztecProtocol#23351)
fix: interrupt prover jobs in stop (AztecProtocol#23358)
test(e2e): enable pipelining on bot, fees, and avm simulator tests
(AztecProtocol#23329)
feat(sentinel): end-of-epoch evaluation with re-execution outcomes
(AztecProtocol#23286)
feat: slash for invalid checkpoint proposals (AztecProtocol#23270)
fix: fork closure in epoch proving jobs (AztecProtocol#23390)
fix(slasher): anchor watcher scans at archiver synced L2 slot (AztecProtocol#23394)
fix: avoid npm uplink for aztec-up local publishes (AztecProtocol#23396)
test(e2e): ignore benign 'Insufficient valid txs' block-build-failed in
epochs tests (AztecProtocol#23424)
chore: refactor weekly proving test wait (AztecProtocol#23395)
refactor: add fifo set (AztecProtocol#23271)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23327)
fix(p2p): validate BLOCK_TXS in BatchTxRequester (AztecProtocol#23371)
chore(p2p): simplify IBatchRequestTxValidator (AztecProtocol#23373)
feat(sequencer): AutomineSequencer for single-sequencer e2e tests
(AztecProtocol#23354)
fix(prover): wait for previous epoch to be proven (AztecProtocol#23458)
chore: collocate provers (AztecProtocol#23439)
chore: rm staging-ignition (AztecProtocol#23440)
chore: rm unused networks (AztecProtocol#23441)
test(e2e): migrate block_building, multi_validator_node,
publisher_funding, invalid_checkpoint_proposal to pipelining (AztecProtocol#23414)
fix(archiver): reconcile local blocks with L1 checkpoints by block
number (AztecProtocol#23461)
feat: Updated slash conditions on block proposals (AztecProtocol#23466)
test(e2e): migrate HA full test to pipelining (AztecProtocol#23463)
chore: update resource profiles (AztecProtocol#23442)
chore: update debug log levels (AztecProtocol#23456)
test: fix flaky sentinel_status_slash by asserting the fault on the
checkpoint slot (AztecProtocol#23483)
feat(slasher): slash checkpoint equivocation between P2P and L1 (A-980)
(AztecProtocol#23436)
refactor(slasher): rename ATTESTED_DESCENDANT_OF_INVALID ->
PROPOSED_DESCENDANT_OF_CHECKPOINT_WITH_INVALID_ATTESTATIONS (AztecProtocol#23468)
fix: reject block proposals in poisoned slots (AztecProtocol#23411)
fix: retry nargo dep + solc downloads to survive transient DNS drops
(AztecProtocol#23490)
fix: enrich json-rpc tracing (AztecProtocol#23412)
feat: add trace export controls (AztecProtocol#23413)
test(e2e): assert no equivocation offenses in HA full test (AztecProtocol#23496)
test: cover invalid checkpoint proposal slashing (AztecProtocol#23503)
test(e2e): migrate more e2e suites to proposer pipelining (AztecProtocol#23482)
test: flag e2e_slashing_attested_invalid_proposal as flake under
pipelining (AztecProtocol#23501)
test: flag e2e_p2p_duplicate_proposal_slash as flake under pipelining
(AztecProtocol#23515)
test(e2e): require cross-observer agreement on sentinel fault slot
(AztecProtocol#23513)
test: flag e2e_ha_full afterAll hook timeout as flake under pipelining
(AztecProtocol#23524)
fix(e2e): propagate l1ContractsArgs into node config so archiver matches
L1 (AztecProtocol#23514)
test: flag e2e_multi_validator_node_key_store P2P tx-dropped failure as
flake (AztecProtocol#23528)
test(cheat-codes): retry warpL2TimeAtLeastTo in-current-slot test on L1
race (AztecProtocol#23533)
test(e2e_ha_full): parallel HA peer node teardown with per-node deadline
(AztecProtocol#23539)
test: flag e2e_ha_full as flake under HA pipelining (AztecProtocol#23541)
test(ci): skip e2e_ha_full entirely on merge-train/spartan (AztecProtocol#23542)
test(ci): skip e2e_multi_validator_node_key_store entirely on
merge-train/spartan (AztecProtocol#23544)
END_COMMIT_OVERRIDE
aminsammara added a commit that referenced this pull request Jun 4, 2026
## Summary

Patch release `v4.3.1` on `v4`. Supersedes the auto-opened
release-please PR #20979, whose changelog was missing the `stdlib:`
entry and listed the `prover:` fix twice.

Bumps `.release-please-manifest.json` to 4.3.1 and prepends a `##
[4.3.1]` entry to `CHANGELOG.md` with the two PRs landed since v4.3.0:

- **#23457** — `fix(prover): wait for previous epoch to be proven`
- **#23505** — `chore(backport-to-v4): fix(stdlib): guard tip-store
finalize against deleting live tips` (backport of #23295)

## Test plan

- [ ] CI green
aminsammara added a commit that referenced this pull request Jun 4, 2026
## Summary

Adds the third bug fix that landed on `v4` (via backport PR #23851 of
#23470) to the existing `## [4.3.1]` `CHANGELOG.md` entry so the section
reflects what v4.3.1 actually ships.

Single-line change in `CHANGELOG.md` only — no version-file or manifest
changes. Manifest stays at 4.3.1.

## What v4.3.1 contains after this lands

- `prover: wait for previous epoch to be proven` (#23457)
- `stdlib: guard tip-store finalize against deleting live tips`
(backport of #23295, via #23505)
- `release-image: re-stamp aztec_version into released noir-contracts
artifacts` (backport of #23470, via #23851) — **new entry, fixes the
v4.3.1 ci-compat-e2e failure**

## After merging

Force-move the `v4.3.1` tag to the merge commit of this PR (safe —
nothing has been published as 4.3.1 yet: npm has only 4.3.0, Docker Hub
returns 404 for `aztec:4.3.1`, GitHub Release for v4.3.1 doesn't exist).

## Test plan

- [ ] Visual check that the `## [4.3.1]` section in `CHANGELOG.md` lists
all three fixes

3 participants