refactor: centralize block-zero handling in archiver #22870
Merged
spalladino merged 3 commits on May 4, 2026
This was referenced Apr 30, 2026
PhilWindle pushed a commit that referenced this pull request on May 1, 2026
> **Note:** This PR is stacked together with #22818, #22809, and #22870 into combined PR #22891 (targeting `merge-train/spartan`) for easier merging. The combined PR has each of these as a separate commit, so reviewers can either review here or on the combined PR.

## Motivation

`KVArchiverDataStore` was a thick pass-through wrapper that re-exported the same methods as the substores it owned (`BlockStore`, `LogStore`, `MessageStore`, `ContractClassStore`, `ContractInstanceStore`), forcing every API change to be plumbed through three layers. Removing it brings the archiver one step closer to a clean, query-object-based data source API and makes substore boundaries explicit at every call site.

## Approach

Replace the wrapper with a plain `ArchiverDataStores` bundle that exposes the substores directly, and move cross-store helpers to free functions on the bundle. Substores absorb the array/iterator helpers that previously lived on the wrapper. Function-name caching becomes its own small class, and the `ContractDataSource` adapter is now a named class instead of an inline object literal so that re-prover tools and tests can reach for it without depending on `data_stores.ts` internals.

## Changes

- **archiver**: Delete `KVArchiverDataStore`; add `ArchiverDataStores` bundle (`BlockStore`, `LogStore`, `MessageStore`, `ContractClassStore`, `ContractInstanceStore`, db, function-name cache) and `createArchiverDataStores`. Cross-store helpers (`getArchiverSynchPoint`, `backupArchiverDataStores`) move to free functions.
- **archiver (renames)**: `ArchiverDataStores` fields use plural form: `blocks`, `logs`, `messages`, `contractClasses`, `contractInstances`. Substore class names and individual store classes are unchanged.
- **archiver (function names)**: Extract `FunctionNamesCache` class (replaces `Map<string, string>` plus the `registerContractFunctionSignatures`/`getDebugFunctionName` free functions).
- **archiver (contract data source)**: Extract `ArchiverContractDataSourceAdapter` class implementing `ContractDataSource`; `createContractDataSource` is now a thin factory.
- **archiver (substores)**: `BlockStore.getBlocks`/`getCheckpointedBlocks`/`getBlockHeaders` return arrays (iterator variants renamed `iterate*`); contract substores expose batch `add*`/`delete*` helpers.
- **archiver (tests)**: Split the 4286-line `kv_archiver_store.test.ts` into per-substore test files (`block_store.test.ts`, `log_store.test.ts`, `message_store.test.ts`, `contract_class_store.test.ts`, `contract_instance_store.test.ts`).
- **node-lib, prover-node, txe, validator-client, end-to-end**: Update call sites to reach for the relevant substore on `ArchiverDataStores` directly.
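The bundle-plus-free-function shape described in this commit message could look roughly like the sketch below. It is illustrative only: the field names `blocks` and `messages` and the factory name `createArchiverDataStores` come from the commit message, but the in-memory substores, their methods, and the `describeSyncState` helper are hypothetical stand-ins and do not reflect the real LMDB-backed classes or the actual `getArchiverSynchPoint` signature.

```typescript
// Hypothetical, heavily simplified substores. The real ones are backed by a KV database.
class BlockStoreSketch {
  private readonly hashesByNumber = new Map<number, string>();

  addBlock(blockNumber: number, hash: string): void {
    this.hashesByNumber.set(blockNumber, hash);
  }

  getLatestBlockNumber(): number | undefined {
    const numbers = Array.from(this.hashesByNumber.keys());
    return numbers.length === 0 ? undefined : Math.max(...numbers);
  }
}

class MessageStoreSketch {
  private synchedL1Block: number | undefined = undefined;

  setSynchedL1Block(l1Block: number): void {
    this.synchedL1Block = l1Block;
  }

  getSynchedL1Block(): number | undefined {
    return this.synchedL1Block;
  }
}

// A plain bundle: callers reach for the substore they need, with no pass-through methods.
interface ArchiverDataStoresSketch {
  blocks: BlockStoreSketch;
  messages: MessageStoreSketch;
}

function createArchiverDataStores(): ArchiverDataStoresSketch {
  return { blocks: new BlockStoreSketch(), messages: new MessageStoreSketch() };
}

// A cross-store helper written as a free function over the bundle (hypothetical name and shape).
function describeSyncState(stores: ArchiverDataStoresSketch): {
  latestBlock: number | undefined;
  messagesSynchedTo: number | undefined;
} {
  return {
    latestBlock: stores.blocks.getLatestBlockNumber(),
    messagesSynchedTo: stores.messages.getSynchedL1Block(),
  };
}
```

With this shape, a call site that only needs blocks depends on `stores.blocks` alone, which is the "explicit substore boundary" the commit message is after.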
PhilWindle reviewed on May 1, 2026
 * the first proposal because the sequencer would advertise `Fr.ZERO` instead of the
 * real root that L1 stored at `archives[0]`.
 */
private getGenesisBlock(): Promise<L2Block> {
Collaborator
Do these need to return promises and have the memoisation?
PhilWindle reviewed on May 1, 2026
const blobClient = createBlobClient(archiverConfig, { logger: createLogger('archiver:blob-client:client') });
const archiver = await createArchiver(archiverConfig, { telemetry, blobClient }, { blockUntilSync: true });

// Spin up a NativeWorldStateService just long enough to derive the initial block header.
Collaborator
Wonder if this file is even needed anymore.
Contributor
Author
Not really. I was unsure on whether to remove it as part of this PR, but since we're at it, I'll kill it.
PhilWindle approved these changes on May 1, 2026
PhilWindle
left a comment
Collaborator
Just a small nit/question.
ef5c3b5 to f930fd3
81f8b46 to 8338da1
The archiver now accepts initialHeader at construction (sourced from nativeWs.getInitialHeader() after world-state is built first) and returns a synthetic L2Block.empty(initialHeader) for single-block queries that resolve to genesis (number 0, hash matching initialHeader.hash(), archive matching initialHeader.lastArchive.root, or tag resolving to 0). Range queries (getBlocks, getBlocksData) explicitly do NOT prepend. L2TipsCache, world-state synchronizer's getL2Tips, stdlib's L2TipsStoreBase, P2PClient/L2TipsKVStore, PXE, and aztec-node sentinel all switched to the dynamic initialHeader.hash() instead of the static GENESIS_BLOCK_HEADER_HASH. Genesis special-casing removed from aztec-node, prover-node, p2p_client, sequencer (all-zeros escape hatch), and stdlib areBlockHashesEqualAt (reorg detection at genesis is now real). MockL2BlockSource synthesizes block 0 with a configurable initialHeader to mirror archiver behaviour; NoopL1Archiver accepts initialHeader; world-state and validator-client integration tests thread db.getInitialHeader() through.
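To illustrate why seeding tip stores with the dynamic `initialHeader.hash()` matters, here is a minimal sketch. It assumes a tips store that takes the genesis hash as a constructor argument and falls back to a static constant when none is given. The class `TipsStoreSketch`, the helper `tipsAgree`, and the placeholder hash values are all hypothetical and are not the real `L2TipsStoreBase` API.

```typescript
// Placeholder standing in for the static protocol constant; the value is made up.
const STATIC_GENESIS_HASH_PLACEHOLDER = '0xstatic-genesis';

interface TipSketch {
  number: number;
  hash: string;
}

class TipsStoreSketch {
  private latest: TipSketch;

  // Before any block is added, the latest tip is block 0 with the supplied genesis hash.
  constructor(initialBlockHash: string = STATIC_GENESIS_HASH_PLACEHOLDER) {
    this.latest = { number: 0, hash: initialBlockHash };
  }

  addBlock(blockNumber: number, hash: string): void {
    this.latest = { number: blockNumber, hash };
  }

  getLatestTip(): TipSketch {
    return this.latest;
  }
}

// Two components agree on the chain tip only if both the number and the hash match.
function tipsAgree(a: TipSketch, b: TipSketch): boolean {
  return a.number === b.number && a.hash === b.hash;
}
```

In this sketch, a store built with the default and a store built with a hash derived from a non-default genesis state both report block 0 but disagree on its hash, which is the kind of divergence that threading one dynamic hash through every component is meant to remove.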
8338da1 to 82b0706
4b17843 into spl/internal-archiver-api-review
7 of 9 checks passed
spalladino added a commit that referenced this pull request on May 4, 2026
## Motivation

The Aztec rollup has an implicit "block zero" whose state is captured in
an initial block header computed by `NativeWorldStateService`. Today the
archiver returns `undefined` for any query that resolves to genesis,
forcing every consumer (aztec-node, prover-node, p2p, sequencer, PXE,
sentinel) to reach into
`worldStateSynchronizer.getCommitted().getInitialHeader()` and
synthesize a fake block 0 themselves. Worse, components disagreed on the
genesis hash whenever `genesisTimestamp` or prefilled state diverged
from the default, because some used
`worldState.getInitialHeader().hash()` (dynamic) and others used the
protocol constant `GENESIS_BLOCK_HEADER_HASH` (static).

This refactor centralizes block-zero handling in the archiver and
threads the dynamic initial header through every component that
previously hard-coded the constant, eliminating the divergence and
removing the special-case branches in callers.

## Approach

Construct world-state first, capture `nativeWs.getInitialHeader()`, and
pass it into the archiver at construction. The archiver returns a
synthetic `L2Block.empty(initialHeader)` for single-block queries that
resolve to genesis — by number, hash, archive, or tag. Range queries
explicitly do **not** prepend. `L2TipsCache`, world-state synchronizer's
`getL2Tips`, stdlib's `L2TipsStoreBase`, P2PClient/L2TipsKVStore, PXE,
and sentinel all switched to the dynamic `initialHeader.hash()`. Genesis
special-casing was deleted from aztec-node, prover-node, p2p_client,
sequencer (the all-zeros escape hatch), and stdlib's
`areBlockHashesEqualAt` — reorg detection at genesis is now real instead
of an `if blockNumber === 0 return true` short-circuit.
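
The lookup rule in the paragraph above (single-block queries can resolve to a synthetic genesis block, range queries never prepend it) can be sketched as follows. This is a simplified, synchronous model under stated assumptions: `BlockSketch`, `BlockQuerySketch`, and `GenesisAwareSourceSketch` are hypothetical names, hashes and archive roots are plain strings, and tag-based lookup is omitted.

```typescript
// Simplified stand-in for a block: only the fields needed for lookup are modelled.
interface BlockSketch {
  number: number;
  hash: string;
  archiveRoot: string;
}

type BlockQuerySketch = { number: number } | { hash: string } | { archive: string };

// Returns true when the block satisfies the query discriminant.
function matches(block: BlockSketch, query: BlockQuerySketch): boolean {
  if ('number' in query) return block.number === query.number;
  if ('hash' in query) return block.hash === query.hash;
  return block.archiveRoot === query.archive;
}

class GenesisAwareSourceSketch {
  private readonly genesis: BlockSketch;
  private readonly stored = new Map<number, BlockSketch>();

  // The synthetic genesis block is supplied at construction and is never written to the store.
  constructor(genesis: BlockSketch) {
    this.genesis = genesis;
  }

  addBlock(block: BlockSketch): void {
    this.stored.set(block.number, block);
  }

  // Single-block queries: answer with the synthetic genesis block when the query resolves to it.
  getBlock(query: BlockQuerySketch): BlockSketch | undefined {
    if (matches(this.genesis, query)) return this.genesis;
    for (const block of Array.from(this.stored.values())) {
      if (matches(block, query)) return block;
    }
    return undefined;
  }

  // Range queries: only stored blocks are returned, so block 0 is never prepended.
  getBlocks(from: number, limit: number): BlockSketch[] {
    const result: BlockSketch[] = [];
    for (let n = from; n < from + limit; n++) {
      const block = this.stored.get(n);
      if (block) result.push(block);
    }
    return result;
  }
}
```

Callers then ask for `{ number: 0 }` like any other block instead of branching on genesis themselves.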

## Changes

- **archiver**: synthetic genesis block in `data_source_base` (with
`archive` set to `new AppendOnlyTreeSnapshot(genesisArchiveRoot, 1)`),
`initialHeader` plumbing through `factory` / constructor, `L2TipsCache`
uses dynamic genesis hash, `MockL2BlockSource` synthesizes block 0 +
exposes `getInitialHeader`/`setInitialHeader`/`setGenesisArchiveRoot`,
`NoopL1Archiver` accepts `initialHeader`.
- **world-state**: `createWorldState` exported as a public factory, new
`createWorldStateSynchronizerOverNative` that wraps a pre-built native
instance, `getL2Tips` reports `initialHeader.hash()` and
`BlockNumber.ZERO` for genesis tips.
- **aztec-node**: `server.ts` reorders wiring so world-state is built
first; `getBlock`, `getBlockHeader`, `resolveBlockNumber`,
`getPrivateLogsByTags`, `getPublicLogsByTagsFromContract` no longer
special-case genesis. `buildGenesisBlockResponse` deleted. Sentinel
re-creates its `L2TipsMemoryStore` in `init()` with the archiver's
block-0 hash.
- **stdlib**: `L2TipsStoreBase` accepts `initialBlockHash` (default
`GENESIS_BLOCK_HEADER_HASH` for back-compat). `areBlockHashesEqualAt` no
longer short-circuits at block 0. `L2BlockStream`'s reorg-search loop
refuses to walk past block 0 — emits a clear "genesis hash mismatch"
error instead of cascading into "block hash not found for -1".
`L2TipsKVStore`/`L2TipsMemoryStore` thread the param through.
- **p2p**: `P2PClient` / `createP2PClient` accept and forward
`initialBlockHash`; aztec-node passes the archiver's genesis hash. Test
helpers and benches updated.
- **pxe**: fetches `node.getBlock(0)` at startup and seeds
`L2TipsKVStore` with that hash.
- **sequencer-client**: deleted the all-zeros escape hatch in
`getStatus`.
- **prover-node**: `gatherPreviousBlockHeader` calls
`l2BlockSource.getBlockData({number:0})` uniformly.
- **tests**: new `archiver/src/modules/data_source_base.test.ts`
covering genesis-query semantics; world-state and validator-client
integration tests thread `db.getInitialHeader()` and
`genesisArchiveRoot` to the archiver/mock; new `l2_block_stream.test.ts`
case asserting the genesis-hash-mismatch error path.
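
The stdlib item above says the reorg-search loop now stops at block 0 with a clear error rather than walking to block -1. A minimal sketch of that idea, assuming block hashes are available as simple in-memory maps: the function `findLastCommonBlock` and its signature are hypothetical and are not the real `L2BlockStream` code, which is asynchronous and works against a block source.

```typescript
// Walks backwards from `from` until the local and source hashes agree.
// If even block 0 disagrees, there is no common ancestor, so fail loudly instead of
// continuing to block -1.
function findLastCommonBlock(
  localHashes: Map<number, string>,
  sourceHashes: Map<number, string>,
  from: number,
): number {
  for (let n = from; n >= 0; n--) {
    const localHash = localHashes.get(n);
    const sourceHash = sourceHashes.get(n);
    if (localHash !== undefined && localHash === sourceHash) {
      return n;
    }
    if (n === 0) {
      throw new Error(`Genesis hash mismatch: local ${localHash} vs source ${sourceHash}`);
    }
  }
  throw new Error(`Invalid starting block ${from}`);
}
```

Because genesis is compared like any other block, two components configured with different initial state fail fast with a message that names the actual problem.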
PhilWindle pushed a commit that referenced this pull request on May 5, 2026
⚠️ **This PR includes #22870. Reviewers should review only the first commit.**⚠️

## Motivation

Consolidates the block-related lookup surface on `L2BlockSource` from ~17 narrow methods returning ~9 different shapes down to 4 methods returning 2 shapes (`L2Block` and `BlockData`). Replaces the per-shape getters with discriminated query objects that carry both the lookup discriminant and a single `onlyCheckpointed` filter, removing the parallel `Checkpointed*` API and the throwaway wrapper types.

Additionally, this refactor centralizes block-zero handling in the archiver and threads the dynamic initial header through every component that previously hard-coded the constant, eliminating the divergence and removing the special-case branches in callers.

## Approach

`L2BlockSource` exposes 4 methods that take query objects:

```ts
getBlock(query: BlockQuery): Promise<L2Block | undefined>
getBlocks(query: BlocksQuery): Promise<L2Block[]>
getBlockData(query: BlockQuery): Promise<BlockData | undefined>
getBlocksData(query: BlocksQuery): Promise<BlockData[]>

type BlockQuery = ({number} | {hash} | {archive}) & { onlyCheckpointed?: boolean }
type BlocksQuery = ({from, limit} | {epoch}) & { onlyCheckpointed?: boolean }
```

On-disk format is unchanged: the archiver already stored block metadata, tx bodies, and per-checkpoint L1/attestation data in separate LMDB maps; `CheckpointedL2Block` was only an in-memory join produced at read time.

**Includes changes from #22870**

## API surface change

### Methods removed from `L2BlockSource`

`getL2Block`, `getL2BlockByHash`, `getL2BlockByArchive`, `getCheckpointedBlock`, `getCheckpointedBlockByHash`, `getCheckpointedBlockByArchive`, `getCheckpointedBlocks`, `getCheckpointedBlocksForEpoch`, `getCheckpointedBlockHeadersForEpoch`, `getBlock(number)`, `getBlocks(from, limit)`, `getBlockData(number)`, `getBlockDataByArchive`, `getBlockDataWithCheckpointContext`, `getBlockHeader`, `getBlockHeaderByHash`, `getBlockHeaderByArchive`.

### Types deleted

`CheckpointedL2Block`, `BlockDataWithCheckpointContext`: both removed entirely (file + schema + re-exports). Callers that previously read `.l1` / `.attestations` off these now do `getBlockData(...)` followed by `getCheckpointData(blockData.checkpointNumber)` and read those fields off `CheckpointData`.

### Types added

`BlockQuery`, `BlocksQuery` (and matching Zod schemas) on `L2BlockSource`. No new domain types: `L2Block`, `BlockData`, `BlockHeader` are unchanged.

### AztecNode public RPC

Method names preserved (`getBlock`, `getBlockHeader`, `getCheckpointedBlocks`, etc.; bodies delegate internally to the new `L2BlockSource` methods). One wire-level change: `AztecNode.getCheckpointedBlocks` element type goes `CheckpointedL2Block[]` → `BlockResponse[]`, forced by the type deletion. Older RPC clients that parse the old shape will need to update.

## Changes

- **stdlib**: `BlockQuery` / `BlocksQuery` types + Zod schemas next to `L2BlockSource`. `CheckpointedL2Block` file deleted; `BlockDataWithCheckpointContext` removed from `block_data.ts`. `ArchiverApiSchema` and `MockArchiver` shrunk; new `it()` blocks cover each query discriminant. `L2BlockStream` migrated.
- **archiver**: `BlockStore` consolidates to four query-object reads plus iterators. `data_source_base.ts` adds `resolveBlocksQuery` that translates `{ epoch }` → `{ from, limit }` (returns `null` for empty epochs so callers short-circuit to `[]`). Mocks honor `onlyCheckpointed`.
- **aztec-node**: `server.ts` keeps the public RPC method names but delegates to the new query methods. `getCheckpointedBlocks` adds a per-call `Map<CheckpointNumber, CheckpointData>` cache to avoid an N+1.
- **consumer migrations**: `world-state`, `txe`, `p2p` block-txs handler, `validator-client` (`validator.ts`, `proposal_handler.ts`), `pxe` block-stream source (honors `onlyCheckpointed` via `node.getL2Tips`), `prover-node`, `sequencer-client`, `telemetry-client`, `aztec/testing`, `L2BlockStream` in stdlib.
- **tests**: per-package mocks updated for the new shapes; new test covers `getBlocks({ epoch })` empty-epoch returning `[]`.
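
The `{ epoch }` → `{ from, limit }` translation mentioned in the archiver item could be sketched like this. The query type mirrors the `BlocksQuery` shape quoted in the commit message, with concrete field types assumed to be numbers; the function name `resolveBlocksQuerySketch`, the `epochRanges` lookup table, and the `ResolvedRangeSketch` result type are hypothetical and only illustrate the null-for-empty-epoch convention.

```typescript
type BlocksQuerySketch = ({ from: number; limit: number } | { epoch: number }) & {
  onlyCheckpointed?: boolean;
};

interface ResolvedRangeSketch {
  from: number;
  limit: number;
  onlyCheckpointed: boolean;
}

// Maps an epoch number to the inclusive block range it contains (assumed lookup table).
type EpochRanges = Map<number, { first: number; last: number }>;

// Normalizes either query form to a concrete range, or null when the epoch has no blocks.
function resolveBlocksQuerySketch(
  query: BlocksQuerySketch,
  epochRanges: EpochRanges,
): ResolvedRangeSketch | null {
  const onlyCheckpointed = query.onlyCheckpointed ?? false;
  if ('epoch' in query) {
    const range = epochRanges.get(query.epoch);
    if (!range) {
      return null; // Empty epoch: the caller can short-circuit to [].
    }
    return { from: range.first, limit: range.last - range.first + 1, onlyCheckpointed };
  }
  return { from: query.from, limit: query.limit, onlyCheckpointed };
}
```

Returning `null` rather than a zero-length range keeps the "nothing to read" case explicit, so a caller can return `[]` without touching the store.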