feat(p2p): tx validation cache #23585
  return {
    proofValidator: {
-     validator: new TxProofValidator(proofVerifier, bindings),
+     validator: CachedTxValidator.new(new TxProofValidator(proofVerifier, bindings), cache),
I can remove this one, otherwise there would be overhead for gossip as well
  private readonly entries: LruMap<string, Promise<TxValidationResult>>;
  // We try to remember hashes for known object references to avoid recomputing them.
  // eslint-disable-next-line aztec-custom/no-non-primitive-in-collections
  private readonly txHashesCache: LruMap<Tx, string>;
This will of course keep up to maxSize complete transactions in memory. At ~100KB per tx and the default maxSize of 5,000, that's 500MB. I'm wondering whether this is a good trade-off. We probably don't validate the same object very frequently, if at all.
Changed it to a WeakMap which has exactly the semantics we want and doesn't hold on to references. It's not an LRU though, so it will hold values for any TX that has been validated AND is still around. This is actually nice because the eviction will probably match mempool eviction or similar.
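For readers following the thread, here is a minimal sketch of the WeakMap approach described above. It is not the code from this PR: `TxLike`, `HashMemo`, and `toyHexDigest` are hypothetical names, and the hex encoding merely stands in for the real tx hash computation.

```typescript
// Hypothetical stand-in for a transaction; the real Tx type is not shown in this thread.
interface TxLike {
  toBuffer(): Uint8Array;
}

// Remembers a derived string per object reference. A WeakMap does not keep its keys
// alive, so an entry disappears once the tx object itself is garbage collected.
class HashMemo<T extends object> {
  private readonly memo = new WeakMap<T, string>();
  private readonly computeHash: (value: T) => string;

  constructor(computeHash: (value: T) => string) {
    this.computeHash = computeHash;
  }

  get(value: T): string {
    let hash = this.memo.get(value);
    if (hash === undefined) {
      // First time this exact object reference is seen: do the expensive work once.
      hash = this.computeHash(value);
      this.memo.set(value, hash);
    }
    return hash;
  }
}

// Placeholder digest (plain hex encoding), NOT a cryptographic hash.
function toyHexDigest(tx: TxLike): string {
  return Array.from(tx.toBuffer(), byte => byte.toString(16).padStart(2, '0')).join('');
}

const hashMemo = new HashMemo<TxLike>(toyHexDigest);
const exampleTx: TxLike = { toBuffer: () => new Uint8Array([1, 2, 255]) };
hashMemo.get(exampleTx); // serializes and hashes
hashMemo.get(exampleTx); // served from the memo, no second toBuffer() call
```

Because the memo is keyed by object identity, a second object with identical contents is hashed again; only repeated validation of the same reference skips the serialization.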
Flakey Tests 🤖 says: This CI run detected 1 test that failed, but was tolerated due to a .test_patterns.yml entry.
BEGIN_COMMIT_OVERRIDE
fix(archiver): skip descendants of invalid-attestations checkpoints (AztecProtocol#23502)
chore: scale network validators (AztecProtocol#23579)
fix(ci): nightly 10 TPS bench GCP auth and checkout (AztecProtocol#23586)
chore: set eth node resource profile (AztecProtocol#23583)
fix: wait for checkpoint before sentinel assertions (AztecProtocol#23573)
fix: slash attestations for invalid checkpoint proposals (AztecProtocol#23506)
test: fix web3signer pipelining `e2e_multi_validator_node_key_store.test.ts` (AztecProtocol#23568)
fix: cap CI devbox hostname (AztecProtocol#23591)
test: stabilize invalid checkpoint descendant e2e (AztecProtocol#23582)
test(e2e): stabilize invalidation slots in `proposer invalidates multiple checkpoints` (AztecProtocol#23590)
test(e2e): stabilize invalid proposal slashing target slot in `attested_invalid_proposal` (AztecProtocol#23589)
chore(foundation): faster toBufferBE via zero fast-path (AztecProtocol#23592)
fix: honour BB_BINARY_PATH (AztecProtocol#23570)
chore: bump reth and lighthouse (AztecProtocol#23588)
chore: add web3signer and postgres node selectors (AztecProtocol#23598)
fix: do not symlink .codex folders (AztecProtocol#23593)
chore: fix claude and codex symlinking tests (AztecProtocol#23599)
test(e2e): narrow down sentinel check in `multiple_validators_sentinel` (AztecProtocol#23604)
test(e2e): fix `proposer invalidates multiple checkpoints` timeout (AztecProtocol#23608)
fix: record zero-amount slashing offenses (AztecProtocol#23556)
fix: log slashing offense names (AztecProtocol#23565)
feat(p2p): tx validation cache (AztecProtocol#23585)
chore: add KEDA deployment module (AztecProtocol#23553)
chore: add KEDA prover agent autoscaling (AztecProtocol#23554)
chore: update destroy_bootnode.sh (AztecProtocol#23626)
chore: skip failing chonk_pinned_inputs.test in CI (AztecProtocol#23643)
chore(ci): tolerate public authwit P2P receipt flake (AztecProtocol#23648)
END_COMMIT_OVERRIDE
Overview
Adds a tx validation cache to the p2p layer so that repeated validation of the same transaction by the same validator reuses the prior result instead of redoing the work (notably the expensive proof verification).
Downside: Using this cache adds up to 7ms of overhead to each validation whenever the tx object needs to be hashed. That overhead is almost entirely (over 90%) `.toBuffer()` time.
Cached validation is added for on-demand tx collection, but NOT for gossip and RPC ingress.
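The core idea can be sketched in a few lines. This is an illustration under assumptions, not the PR's implementation: `withCache`, `ValidationResult`, and `keyOf` are hypothetical names, the cache here is a plain unbounded `Map` rather than a bounded LRU, and dropping rejected promises is an assumed policy that the PR text does not confirm.

```typescript
// Hypothetical result shape; the real TxValidationResult may differ.
type ValidationResult = { result: 'valid' } | { result: 'invalid'; reason: string };
type Validate<T> = (tx: T) => Promise<ValidationResult>;

// Wraps a validator so that repeated or concurrent calls for the same key share one run.
function withCache<T>(
  validate: Validate<T>,
  keyOf: (tx: T) => string,
  cache: Map<string, Promise<ValidationResult>> = new Map(),
): Validate<T> {
  return (tx: T) => {
    const key = keyOf(tx);
    const cached = cache.get(key);
    if (cached !== undefined) {
      return cached;
    }
    // Store the promise before it settles, so callers arriving while validation
    // is still in flight receive the same promise instead of starting a new run.
    const pending = validate(tx);
    cache.set(key, pending);
    // Assumption: a rejected validation (an error, not an "invalid" verdict) is not
    // worth remembering, so the entry is dropped and a later call retries.
    pending.catch(() => cache.delete(key));
    return pending;
  };
}

// Usage: both calls resolve from a single underlying validation.
const validateCached = withCache<{ hash: string }>(
  async _tx => ({ result: 'valid' }),
  tx => tx.hash,
);
void Promise.all([validateCached({ hash: '0xabc' }), validateCached({ hash: '0xabc' })]);
```

Caching the promise rather than the resolved value is what makes concurrent requests coalesce; caching only settled results would still let two simultaneous requests verify the same proof twice.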
Changes
Cache core (`p2p/src/msg_validators/tx_validator/`)
- `TxValidationCache`: bounded, LRU-evicting cache keyed by `(validatorSymbol, txHash)`. Stores the in-flight promise before awaiting, so concurrent validations of the same tx coalesce into a single call. `get`/`set`/`delete` take the cache key directly; `key(validatorSymbol, tx)` builds it.
- `CachedTxValidator`: wraps any `TxValidator` to route `validateTx` through the cache using the validator's `identifier` symbol. `DataTxValidator` and `TxProofValidator` gained stable `identifier`s.
- `factory.ts`: threads an optional `TxValidationCache` through the gossip (first/second stage), block-proposal, on-demand, and RPC validator builders, wrapping the state-independent validators (`DataTxValidator`, `TxProofValidator`, and the minimum-integrity aggregate) in `CachedTxValidator`.
LRU map extracted to foundation (`foundation/src/collection/lru_map.ts`)
- Extracted the LRU logic from `TxValidationCache` into a generic `LruMap<K, V>`, mirroring the existing `LruSet`. `TxValidationCache` now composes an `LruMap<string, Promise<TxValidationResult>>`. Added `LruMap` unit tests.
Wiring
- New `P2P_TX_VALIDATION_CACHE_SIZE` env var / `txValidationCacheSize` config (cache disabled when `0`).
- `createP2PClient` constructs the cache and passes it to `LibP2PService` (gossip + block-proposal paths) and to the batch-tx-requester's on-demand validator config.
Benchmarks
Closes https://linear.app/aztec-labs/issue/A-934/dont-repeatedly-verify-retrieved-transactions .
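For context on the `LruMap<K, V>` mentioned under Changes, a bounded LRU map can be sketched on top of the insertion order that a JavaScript `Map` already maintains. This is an assumption-laden illustration: `SimpleLruMap` is a hypothetical name, and the actual `LruMap` in `foundation` may have a different API and eviction details.

```typescript
// Minimal LRU map: the first key in iteration order is the least recently used.
class SimpleLruMap<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    if (!Number.isInteger(maxSize) || maxSize < 1) {
      throw new Error('maxSize must be a positive integer');
    }
    this.maxSize = maxSize;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Delete and re-insert so the key moves to the most-recently-used position.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry (the oldest in insertion order).
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  delete(key: K): boolean {
    return this.entries.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage: with capacity 2, reading "a" protects it, so adding "c" evicts "b".
const lruExample = new SimpleLruMap<string, number>(2);
lruExample.set('a', 1);
lruExample.set('b', 2);
lruExample.get('a');
lruExample.set('c', 3);
```

Storing promises as the values of such a map would give the bounded, promise-coalescing behavior the description attributes to `TxValidationCache`.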