
feat(p2p): tx validation cache#23585

Merged
fcarreiro merged 1 commit into merge-train/spartan from fc/tx-validator-cache-2
May 28, 2026


@fcarreiro fcarreiro commented May 27, 2026


Overview

Adds a tx validation cache to the p2p layer so that repeated validation of the same transaction by the same validator reuses the prior result instead of redoing the work (notably the expensive proof verification).

Downside: using the cache adds up to 7 ms of overhead per validation whenever the tx object has to be hashed. More than 90% of that time is spent in .toBuffer().

Cached validation is added for on-demand tx collection, but NOT for gossip and RPC ingress.

Changes

Cache core (p2p/src/msg_validators/tx_validator/)

  • TxValidationCache — bounded, LRU-evicting cache keyed by (validatorSymbol, txHash). Stores the in-flight promise before awaiting, so concurrent validations of the same tx coalesce into a single call. get/set/delete take the cache key directly; key(validatorSymbol, tx) builds it.
  • CachedTxValidator — wraps any TxValidator to route validateTx through the cache using the validator's identifier symbol. DataTxValidator and TxProofValidator gained stable identifiers.
  • factory.ts — threads an optional TxValidationCache through the gossip (first/second stage), block-proposal, on-demand, and RPC validator builders, wrapping the state-independent validators (DataTxValidator, TxProofValidator, and the minimum-integrity aggregate) in CachedTxValidator.
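The coalescing behaviour described above (store the in-flight promise before awaiting, key by validator symbol and tx hash) can be sketched roughly as follows. This is not the PR's code: `TxLike`, `ValidationResult`, `ValidatorLike` and `CoalescingValidator` are hypothetical stand-ins for the real types, LRU bounding is left out, and dropping rejected promises is an assumption of the sketch.

```typescript
// Simplified, hypothetical shapes; the real Tx, TxValidationResult and TxValidator types are richer.
type TxLike = { txHash: string };
type ValidationResult = { result: 'valid' } | { result: 'invalid'; reason: string };

interface ValidatorLike {
  readonly identifier: symbol;
  validateTx(tx: TxLike): Promise<ValidationResult>;
}

class CoalescingValidator implements ValidatorLike {
  readonly identifier: symbol;
  private readonly inner: ValidatorLike;
  // Unbounded on purpose: LRU eviction is left out to keep the sketch short.
  private readonly entries: Map<string, Promise<ValidationResult>>;

  constructor(inner: ValidatorLike, entries: Map<string, Promise<ValidationResult>>) {
    this.inner = inner;
    this.identifier = inner.identifier;
    this.entries = entries;
  }

  // Builds the cache key from the validator's symbol and the tx hash.
  // Assumes every validator symbol has a distinct description.
  private key(tx: TxLike): string {
    return `${this.identifier.toString()}:${tx.txHash}`;
  }

  validateTx(tx: TxLike): Promise<ValidationResult> {
    const key = this.key(tx);
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      return cached;
    }
    // Store the in-flight promise before anyone awaits it, so concurrent callers share it.
    const pending = this.inner.validateTx(tx);
    this.entries.set(key, pending);
    // Assumption of this sketch: a rejected validation is not kept, so a later call can retry.
    pending.catch(() => {
      this.entries.delete(key);
    });
    return pending;
  }
}
```

Because the key includes the validator's symbol, two wrappers around different validators could share one map without their results colliding.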

LRU map extracted to foundation (foundation/src/collection/lru_map.ts)

  • The hand-rolled doubly-linked-list LRU bookkeeping was factored out of TxValidationCache into a generic LruMap<K, V>, mirroring the existing LruSet. TxValidationCache now composes an LruMap<string, Promise<TxValidationResult>>. Added LruMap unit tests.
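For readers unfamiliar with the data structure, here is a minimal LRU map sketch. It deliberately swaps the doubly-linked list mentioned above for the insertion-order guarantee of the built-in `Map`, so it shows the eviction semantics rather than the actual `LruMap` implementation; the class name `SketchLruMap` and its method set are assumptions.

```typescript
// A minimal LRU map that relies on Map's insertion-order iteration:
// the first key in iteration order is always the least recently used one.
class SketchLruMap<K, V> {
  private readonly items = new Map<K, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    if (maxSize < 1) {
      throw new Error('maxSize must be at least 1');
    }
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    if (!this.items.has(key)) {
      return undefined;
    }
    const value = this.items.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.items.delete(key);
    this.items.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    // Delete first so an update also moves the key to the most-recent position.
    this.items.delete(key);
    this.items.set(key, value);
    if (this.items.size > this.maxSize) {
      const oldest = this.items.keys().next();
      if (!oldest.done) {
        this.items.delete(oldest.value);
      }
    }
  }

  delete(key: K): boolean {
    return this.items.delete(key);
  }

  get size(): number {
    return this.items.size;
  }
}
```

With a capacity of 2, setting `a` and `b`, reading `a`, then setting `c` evicts `b`, because the read refreshed `a`.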

Wiring

  • New P2P_TX_VALIDATION_CACHE_SIZE env var / txValidationCacheSize config (cache disabled when 0).
  • createP2PClient constructs the cache and passes it to LibP2PService (gossip + block-proposal paths) and to the batch-tx-requester's on-demand validator config.
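The "disabled when 0" rule could be wired along these lines. Only the variable name `P2P_TX_VALIDATION_CACHE_SIZE` comes from the PR; the helper names, the `EnvLike` type, the error handling, and the idea of passing in a default are assumptions made for illustration.

```typescript
// The environment is passed in explicitly so the sketch does not depend on a runtime global.
type EnvLike = Record<string, string | undefined>;

// Reads the cache size, falling back to a caller-supplied default when the variable is unset.
function parseTxValidationCacheSize(env: EnvLike, defaultSize: number): number {
  const raw = env['P2P_TX_VALIDATION_CACHE_SIZE'];
  if (raw === undefined || raw.trim() === '') {
    return defaultSize;
  }
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`Invalid P2P_TX_VALIDATION_CACHE_SIZE: ${raw}`);
  }
  return parsed;
}

// Returns undefined when the size is 0, so callers can skip wrapping validators entirely.
function maybeCreateCache<T>(size: number, create: (size: number) => T): T | undefined {
  return size === 0 ? undefined : create(size);
}
```

Returning `undefined` rather than an empty cache means the disabled configuration would pay no hashing overhead at all, which matters given the per-validation cost noted in the overview.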

Benchmarks

  • Added a sha256-based TX hash benchmark.
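A rough idea of what such a measurement looks like, using Node's built-in `crypto` module. This is not the benchmark added in the PR: `sha256Hex` and `benchSha256` are hypothetical names, the payload is synthetic, and the sketch times only the hash over already-serialized bytes, so it excludes the `.toBuffer()` cost that the overview identifies as dominant.

```typescript
import { createHash } from 'node:crypto';

// Hashes a byte array with SHA-256 and returns the digest as a hex string.
function sha256Hex(data: Uint8Array): string {
  return createHash('sha256').update(data).digest('hex');
}

// Hashes a synthetic payload repeatedly and reports the average time per hash in milliseconds.
function benchSha256(payloadBytes: number, iterations: number): { avgMs: number; lastDigest: string } {
  const payload = new Uint8Array(payloadBytes).fill(7);
  let lastDigest = '';
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    lastDigest = sha256Hex(payload);
  }
  const elapsedMs = performance.now() - start;
  return { avgMs: elapsedMs / iterations, lastDigest };
}
```

Calling `benchSha256(100 * 1024, 1000)` would approximate the roughly 100 KB transaction size mentioned in the review discussion; actual numbers depend on the machine.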

Closes https://linear.app/aztec-labs/issue/A-934/dont-repeatedly-verify-retrieved-transactions .

@fcarreiro fcarreiro marked this pull request as draft May 27, 2026 13:15
@fcarreiro fcarreiro force-pushed the fc/tx-validator-cache-2 branch from 332aee5 to 8202e45 Compare May 27, 2026 13:41
@fcarreiro fcarreiro marked this pull request as ready for review May 27, 2026 13:42
@fcarreiro fcarreiro force-pushed the fc/tx-validator-cache-2 branch 2 times, most recently from df1ff6d to 8cf320a Compare May 27, 2026 14:15
@fcarreiro fcarreiro requested a review from PhilWindle May 27, 2026 16:20
  return {
    proofValidator: {
-     validator: new TxProofValidator(proofVerifier, bindings),
+     validator: CachedTxValidator.new(new TxProofValidator(proofVerifier, bindings), cache),

Contributor Author


I can remove this one, otherwise there would be overhead for gossip as well

Contributor Author


Dropped these.

private readonly entries: LruMap<string, Promise<TxValidationResult>>;
// We try to remember hashes for known object references to avoid recomputing them.
// eslint-disable-next-line aztec-custom/no-non-primitive-in-collections
private readonly txHashesCache: LruMap<Tx, string>;

Collaborator


This will of course keep up to maxSize complete transactions in memory. At ~100 KB per tx and the default maxSize of 5,000, that's 500 MB. I'm wondering whether this is a good trade-off. We probably don't validate the same object very frequently, if at all.

Contributor Author


Changed it to a WeakMap which has exactly the semantics we want and doesn't hold on to references. It's not an LRU though, so it will hold values for any TX that has been validated AND is still around. This is actually nice because the eviction will probably match mempool eviction or similar.
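The WeakMap approach can be illustrated with a small memoizer keyed by object identity. The class name `ObjectHashMemo` and the injected `computeHash` callback are hypothetical; the point is only that a `WeakMap` does not keep its keys alive, so a remembered hash disappears once nothing else references the tx object.

```typescript
// Remembers a computed hash per object reference without preventing garbage collection of the object.
class ObjectHashMemo<T extends object> {
  private readonly known = new WeakMap<T, string>();
  private readonly computeHash: (obj: T) => string;

  constructor(computeHash: (obj: T) => string) {
    this.computeHash = computeHash;
  }

  hashOf(obj: T): string {
    let hash = this.known.get(obj);
    if (hash === undefined) {
      // Only the first lookup for a given reference pays the hashing cost.
      hash = this.computeHash(obj);
      this.known.set(obj, hash);
    }
    return hash;
  }
}
```

Lookups are by reference, not by content: a second object with identical fields is hashed again, which matches the "known object references" wording in the code comment under review.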

@fcarreiro fcarreiro force-pushed the fc/tx-validator-cache-2 branch from 8cf320a to 331ca47 Compare May 28, 2026 10:10
@fcarreiro fcarreiro requested a review from PhilWindle May 28, 2026 10:12
@AztecBot

Collaborator

Flakey Tests

🤖 says: This CI run detected 1 test that failed but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/6e078d4be689fa6d): yarn-project/kv-store/scripts/run_test.sh src/sqlite-opfs/internal/ordered-binary-browser.test.ts (2s) (code: 0)

@fcarreiro fcarreiro merged commit c22eb86 into merge-train/spartan May 28, 2026
17 checks passed
@fcarreiro fcarreiro deleted the fc/tx-validator-cache-2 branch May 28, 2026 10:52
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
fix(archiver): skip descendants of invalid-attestations checkpoints
(AztecProtocol#23502)
chore: scale network validators (AztecProtocol#23579)
fix(ci): nightly 10 TPS bench GCP auth and checkout (AztecProtocol#23586)
chore: set eth node resource profile (AztecProtocol#23583)
fix: wait for checkpoint before sentinel assertions (AztecProtocol#23573)
fix: slash attestations for invalid checkpoint proposals (AztecProtocol#23506)
test: fix web3signer pipelining
`e2e_multi_validator_node_key_store.test.ts` (AztecProtocol#23568)
fix: cap CI devbox hostname (AztecProtocol#23591)
test: stabilize invalid checkpoint descendant e2e (AztecProtocol#23582)
test(e2e): stabilize invalidation slots in `proposer invalidates
multiple checkpoints` (AztecProtocol#23590)
test(e2e): stabilize invalid proposal slashing target slot in
`attested_invalid_proposal` (AztecProtocol#23589)
chore(foundation): faster toBufferBE via zero fast-path (AztecProtocol#23592)
fix: honour BB_BINARY_PATH (AztecProtocol#23570)
chore: bump reth and lighthouse (AztecProtocol#23588)
chore: add web3signer and postgres node selectors (AztecProtocol#23598)
fix: do not symlink .codex folders (AztecProtocol#23593)
chore: fix claude and codex symlinking tests (AztecProtocol#23599)
test(e2e): narrow down sentinel check in `multiple_validators_sentinel`
(AztecProtocol#23604)
test(e2e): fix `proposer invalidates multiple checkpoints` timeout
(AztecProtocol#23608)
fix: record zero-amount slashing offenses (AztecProtocol#23556)
fix: log slashing offense names (AztecProtocol#23565)
feat(p2p): tx validation cache (AztecProtocol#23585)
chore: add KEDA deployment module (AztecProtocol#23553)
chore: add KEDA prover agent autoscaling (AztecProtocol#23554)
chore: update destroy_bootnode.sh (AztecProtocol#23626)
chore: skip failing chonk_pinned_inputs.test in CI (AztecProtocol#23643)
chore(ci): tolerate public authwit P2P receipt flake (AztecProtocol#23648)
END_COMMIT_OVERRIDE

3 participants