
fix: make world-state hash queries reorg-aware to close getWorldState race #23677

Merged
PhilWindle merged 3 commits into merge-train/spartan from cb/world-state-hash-query-reorg-aware
May 29, 2026


@AztecBot

Collaborator

Problem

AztecNodeService.getWorldState({ hash }) can intermittently throw

Block hash 0x.. not found in world state at block number N
(world state has no hash at that index ...). If the node API has been queried
with anchor block hash possibly a reorg has occurred.

even when no reorg happened. This surfaced as a flake in the oxide / e2e_frozen_notes_refund e2e tests while proving a tx that reads existing notes (the private-kernel reset resolves a read request anchored on an early block).

Root cause

getWorldState did its world-state sync via #syncWorldState(), which only synced to the archiver's latest height and ignored any requested block hash. It then:

  1. resolved the requested hash to a block number against the archiver, and
  2. took getSnapshot(N) and read the ARCHIVE leaf at index N to double-check the hash.

The synchronizer reports syncedTo >= N as soon as the height advances, but the archive-tree commit for block N may not yet be visible in the snapshot view. A snapshot taken in that window has populated note/nullifier trees but a half-written archive tree, so getLeafValue(ARCHIVE, N) returns undefined and the double-check throws. The sync-status height and the archive-tree write were not barriered against each other.
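The window described above can be illustrated with a deliberately simplified, synchronous model. Every name in it (ToyWorldState, advanceHeight, commitArchiveLeaf, doubleCheck) is a hypothetical stand-in rather than the real world-state API; the only thing it borrows from the description is the ordering: the reported height advances before the ARCHIVE leaf is visible to a snapshot.

```typescript
// Toy model of the race. All names are hypothetical stand-ins, not the
// real world-state classes.
type Hash = string;

class ToyWorldState {
  // Height reported by the synchronizer.
  syncedTo = 0;
  // Committed ARCHIVE leaves, keyed by block number (leaf index).
  private archive: Record<number, Hash> = {};

  // Step 1 of applying a block: the reported height advances.
  advanceHeight(blockNumber: number): void {
    this.syncedTo = blockNumber;
  }

  // Step 2: the archive leaf becomes visible to snapshot reads.
  commitArchiveLeaf(blockNumber: number, hash: Hash): void {
    this.archive[blockNumber] = hash;
  }

  // Stand-in for getSnapshot(N) followed by getLeafValue(ARCHIVE, N).
  readArchiveLeaf(blockNumber: number): Hash | undefined {
    return this.archive[blockNumber];
  }
}

// The pre-fix double-check: trust syncedTo, then read the archive leaf.
function doubleCheck(ws: ToyWorldState, blockNumber: number, hash: Hash): 'ok' | 'throws' {
  if (ws.syncedTo < blockNumber) return 'throws';
  return ws.readArchiveLeaf(blockNumber) === hash ? 'ok' : 'throws';
}

const ws = new ToyWorldState();
ws.advanceHeight(5);
const inWindow = doubleCheck(ws, 5, '0xabc'); // height says 5, leaf not yet visible
ws.commitArchiveLeaf(5, '0xabc');
const afterCommit = doubleCheck(ws, 5, '0xabc'); // leaf is now visible
console.log(inWindow, afterCommit); // prints "throws ok"
```

No reorg occurs anywhere in this model, yet the first check fails, which matches the spurious error reported in the Problem section.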

Fix

Make the hash-anchored path reorg-aware, as the existing synchronizer plumbing already supports (syncImmediate(targetBlockNumber, blockHash)):

  • When the request carries a block hash, resolve it against the archiver up front (fails fast with the existing clear reorg error if the hash is unknown), then drive the sync to that exact (blockNumber, hash).
  • WorldStateSynchronizer.syncImmediate reads the committed ARCHIVE leaf for that block; if it is missing or mismatched it forces a blockStream.sync() and re-checks, only throwing on a genuine reorg. This barriers on the archive-tree commit actually landing, closing the race window, and turns real reorgs into a clear block_hash_mismatch instead of a confusing snapshot error.
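The check, force-sync, re-check sequence in the second bullet might look roughly like the sketch below. It is a synchronous approximation under stated assumptions: ToyDeps, readCommittedArchiveLeaf, forceBlockStreamSync and syncImmediateSketch are hypothetical names, the real method is presumably asynchronous, and the error text only mimics the block_hash_mismatch label mentioned above.

```typescript
type Hash = string;

// Hypothetical minimal dependencies; the real synchronizer has a richer API.
interface ToyDeps {
  readCommittedArchiveLeaf(blockNumber: number): Hash | undefined;
  forceBlockStreamSync(): void; // stand-in for blockStream.sync()
}

// Return only once the committed ARCHIVE leaf for blockNumber matches
// blockHash; throw if it still does not match after one forced sync.
function syncImmediateSketch(deps: ToyDeps, blockNumber: number, blockHash: Hash): void {
  if (deps.readCommittedArchiveLeaf(blockNumber) === blockHash) return;

  // Leaf missing or mismatched: force a sync and check again.
  deps.forceBlockStreamSync();
  const actual = deps.readCommittedArchiveLeaf(blockNumber);
  if (actual !== blockHash) {
    throw new Error(
      `block_hash_mismatch at block ${blockNumber}: expected ${blockHash}, got ${actual ?? 'none'}`,
    );
  }
}

// Race case: the leaf only becomes visible once the forced sync runs.
const leaves: Record<number, Hash> = {};
let forcedSyncs = 0;
const deps: ToyDeps = {
  readCommittedArchiveLeaf: n => leaves[n],
  forceBlockStreamSync: () => {
    forcedSyncs++;
    leaves[5] = '0xabc';
  },
};
syncImmediateSketch(deps, 5, '0xabc'); // returns after one forced sync

// Reorg case: a different hash is committed at that height, so the re-check throws.
let reorgMessage = '';
try {
  syncImmediateSketch(deps, 5, '0xdef');
} catch (err) {
  reorgMessage = (err as Error).message;
}
```

The point of the design is that the caller waits on the archive leaf itself rather than on the reported height, so a late archive commit resolves on the forced sync and only a genuinely different hash produces an error.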

Block-number / tag queries are unchanged: they still sync to latest height with no hash.

The existing snapshot double-check is kept as a backstop.

Tests

Added unit tests in server.test.ts asserting that a hash-anchored query forwards the resolved (blockNumber, hash) to syncImmediate, and that block-number queries still sync to latest height with no hash. Full yarn-project install/bootstrap (which builds the barretenberg/noir portal packages) was not available in this session, so the suite was not executed here — worth a CI run before merge.
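As an illustration of what those assertions check, here is a minimal sketch with a hand-rolled spy in place of a mocking library. It is not the code in server.test.ts: selectSync, the Query shape, the toy archiver and the error text are all hypothetical, and the assumption that block-number queries pass the archiver's latest height to the same sync call is mine.

```typescript
type Hash = string;
// Hypothetical query shape; the real API types may differ.
type Query = { hash: Hash } | { blockNumber: number };

interface SyncCall {
  target: number;
  hash: Hash | undefined;
}

interface ToyArchiver {
  latest: number;
  numberForHash(hash: Hash): number | undefined;
}

// Hypothetical stand-in for the sync selection inside getWorldState.
function selectSync(
  query: Query,
  archiver: ToyArchiver,
  syncImmediate: (target: number, hash?: Hash) => void,
): void {
  if ('hash' in query) {
    // Resolve the hash up front so an unknown hash fails fast.
    const blockNumber = archiver.numberForHash(query.hash);
    if (blockNumber === undefined) {
      throw new Error(`Block hash ${query.hash} not found`); // placeholder message
    }
    syncImmediate(blockNumber, query.hash);
  } else {
    // Block-number queries: sync to latest height, no hash.
    syncImmediate(archiver.latest);
  }
}

// Spy that records every (target, hash) pair it receives.
const calls: SyncCall[] = [];
const spy = (target: number, hash?: Hash): void => {
  calls.push({ target, hash });
};
const archiver: ToyArchiver = {
  latest: 10,
  numberForHash: h => (h === '0xabc' ? 3 : undefined),
};

selectSync({ hash: '0xabc' }, archiver, spy);  // expect (3, '0xabc')
selectSync({ blockNumber: 7 }, archiver, spy); // expect (10, undefined)

let unknownHashFailed = false;
try {
  selectSync({ hash: '0xdead' }, archiver, spy);
} catch {
  unknownHashFailed = true; // the spy must not have been called
}
```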


Created by claudebox · group: slackbot

@AztecBot AztecBot added the claudebox label (Owned by claudebox. it can push to this PR.) May 29, 2026
@PhilWindle PhilWindle marked this pull request as ready for review May 29, 2026 10:15
@PhilWindle PhilWindle enabled auto-merge May 29, 2026 10:15
@PhilWindle PhilWindle changed the base branch from next to merge-train/spartan May 29, 2026 11:26
@PhilWindle PhilWindle merged commit d348795 into merge-train/spartan May 29, 2026
24 of 25 checks passed
@PhilWindle PhilWindle deleted the cb/world-state-hash-query-reorg-aware branch May 29, 2026 11:51
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
test(e2e): unskip pipelining related e2e tests (AztecProtocol#23642)
fix(archiver): prune blocks without proposed checkpoint by end of build
slot (AztecProtocol#23606)
test: migrate benchmarks to pipelining setup (AztecProtocol#23647)
fix(p2p): fall back to archiver in BLOCK_TXS response validation
(AztecProtocol#23624)
docs(slashing): align operator and slasher docs with AZIP-7 (AztecProtocol#23494)
fix(p2p): do not penalize peers that signal a missing block with Fr.ZERO
(AztecProtocol#23672)
chore: adjust metrics deployment (AztecProtocol#23676)
fix(cheat-codes): warpL2TimeAtLeastBy advances relative to leading clock
(AztecProtocol#23675)
chore: tighten node pool sizes (AztecProtocol#23678)
chore: remove archival nodes (AztecProtocol#23630)
chore: merge blob sink duties into RPC node (AztecProtocol#23631)
fix: sync avm-transpiler Cargo.lock with noir submodule (AztecProtocol#23683)
fix(spartan): set validator lag env vars in tps-scenario (AztecProtocol#23684)
fix: make world-state hash queries reorg-aware to close getWorldState
race (AztecProtocol#23677)
fix: pin noir submodule to next's version on merge-train/spartan
(AztecProtocol#23690)
fix: ensure image ref is used by bench runner (AztecProtocol#23682)
fix(ci): retry aztec-nr nargo dependency clone on transient network
flake (AztecProtocol#23653)
chore: run one-off jobs on network nodes (AztecProtocol#23701)
fix: simulate proposals inside target slot (AztecProtocol#23692)
chore: smaller eth-devnet (AztecProtocol#23704)
chore: enable testnet autoscaling (AztecProtocol#23705)
feat(api)!: redesign node log retrieval API around tag-based queries
(AztecProtocol#23625)
fix(sequencer): set own proposed checkpoint locally instead of via p2p
loopback (AztecProtocol#23659)
END_COMMIT_OVERRIDE

Labels

claudebox Owned by claudebox. it can push to this PR.


3 participants