test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan by AztecBot · Pull Request #23083 · AztecProtocol/aztec-packages

AztecBot · 2026-05-08T09:07:55Z

Motivation

The "verifies transactions at 10 TPS" sub-test of yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts is now flaking repeatedly on the "bench all" step of merge-train/spartan. It has failed on at least two different merge-train commits, hours apart, and neither failure is related to the triggering commit's diff:

| Run | Triggering merge-train commit | CI log |
| --- | --- | --- |
| 25546251580 | #22934 (refactor(node-rpc)! removing deprecated AztecNode methods) | http://ci.aztec-labs.com/1778227975844707 |
| 25552992890 | #22405 (feat(p2p): detect and track announce IP changes at runtime) | http://ci.aztec-labs.com/1778237470322975 |

Both runs hit the same assertion:

● transaction benchmarks › verifies transactions at 10 TPS

  expect(received).toBe(expected) // Object.is equality
  Expected: true
  Received: false
    at bench/tx_stats_bench.test.ts:268

Sub-test failing log on the latest run: http://ci.aztec-labs.com/ca459ca73d02002c (bench all parent: http://ci.aztec-labs.com/90616bad7bf7ebaa).

The other three sub-tests in the suite (compression; single private verify x20 serial; single public verify x20 serial) pass cleanly against the same proven txs in both runs. The failure is confined to the stress sub-test, which fires 600 IVC verifications at 10 per second across 8 concurrent IVC verifiers (BB_NUM_IVC_VERIFIERS=8, BB_IVC_CONCURRENCY=1). Under that load, at least one verification returns valid: false.
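To make the load figures concrete, here is a minimal sketch of the arithmetic implied by the stress sub-test. It assumes verifications queue whenever all verifiers are busy, which the description does not confirm; `stressBudget` and its field names are hypothetical and are not taken from the test file.

```typescript
// Hypothetical helper: derives timing budgets from the stress sub-test parameters.
interface StressShape {
  totalTxs: number;  // 600 IVC verifications
  tps: number;       // 10 verifications dispatched per second
  verifiers: number; // BB_NUM_IVC_VERIFIERS=8
}

function stressBudget({ totalTxs, tps, verifiers }: StressShape) {
  return {
    // How long the dispatch phase lasts.
    durationSeconds: totalTxs / tps,
    // Gap between consecutive dispatches.
    dispatchIntervalMs: 1000 / tps,
    // With N verifiers working in parallel, the backlog stays bounded only if
    // the average verification takes at most N / tps seconds.
    maxSustainableLatencyMs: (verifiers * 1000) / tps,
  };
}

const budget = stressBudget({ totalTxs: 600, tps: 10, verifiers: 8 });
console.log(budget);
// { durationSeconds: 60, dispatchIntervalMs: 100, maxSustainableLatencyMs: 800 }
```

Under these assumptions, any slowdown that pushes the average verification past roughly 800 ms causes work to pile up for the rest of the 60-second phase.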

Cause

Neither triggering commit touches the IVC verifier path:

- #22934 (refactor(node-rpc)!: remove deprecated AztecNode methods and L2BlockSource tip helpers) is a pure node-rpc surface refactor.
- #22405 (feat(p2p): detect and track announce IP changes at runtime) is p2p / discv5 ENR plumbing.

That both failures share this signature across unrelated diffs is strong evidence that the flake is independent of the merge-train commit and stems from the bench infrastructure itself.

The likely culprit is the recent bb-prover migration to the bb.js NativeUnixSocket backend (#21564), which spawns a fresh bb subprocess per Chonk verification via withVerifierInstance. With 8 parallel verifications on the CPU-isolated bench host, each verifier requests 16 threads, so 8 × 16 = 128 threads contend for 56 isolated cores, and transient verifier failures appear. The bench-output log shows continuous "bb.js - Received signal 15, shutting down gracefully..." traffic during the 10 TPS phase: verifier instances are being torn down rapidly, and at least one verification appears to slip through with a stale or incomplete response. Because the serial sub-tests (numIterations = 20, sequential) pass cleanly in both runs, this looks like a stress-only interaction rather than a correctness regression.
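The thread figures above can be restated as an oversubscription ratio. This is only a sketch of the arithmetic; `oversubscription` is a hypothetical helper, and it assumes every verifier really does run all 16 requested threads at once.

```typescript
// Hypothetical helper: compares requested threads against available cores.
function oversubscription(verifiers: number, threadsPerVerifier: number, cores: number) {
  const requestedThreads = verifiers * threadsPerVerifier;
  return { requestedThreads, ratio: requestedThreads / cores };
}

// 8 verifiers, 16 threads each, 56 isolated cores (figures from the description).
const load = oversubscription(8, 16, 56);
console.log(load.requestedThreads);  // 128
console.log(load.ratio.toFixed(2));  // "2.29"
```

A ratio above 1 means runnable threads outnumber cores, so at peak each core would be time-sliced between two or more verifier threads.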

Approach

Add tx_stats_bench to .test_patterns.yml with an error_regex anchored to the test file's stack-trace line (tx_stats_bench.test.ts:<line>:<col>), and assign *charlie as owner (author of the bb.js migration). With this entry, ci3/run_test_cmd retries the test once on failure and treats a pass on retry as a flake instead of a hard fail. This unblocks the merge train for unrelated commits while Charlie investigates the underlying concurrency interaction with the bb.js backend.
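The retry-once behaviour described above can be sketched as follows. This is not the ci3/run_test_cmd implementation (which is not shown here); `classifyWithOneRetry`, `failsOnce`, and the `Outcome` type are hypothetical names used only to illustrate the decision logic.

```typescript
type Outcome = "pass" | "flake" | "fail";

// Runs the test; on failure, retries once only if the test is marked flake-retryable.
function classifyWithOneRetry(runTest: () => boolean, flakeRetryable: boolean): Outcome {
  if (runTest()) return "pass";
  if (!flakeRetryable) return "fail";
  // A pass on the single retry is reported as a flake, not a hard failure.
  return runTest() ? "flake" : "fail";
}

// Hypothetical test double that fails on its first attempt and passes afterwards.
function failsOnce(): () => boolean {
  let attempts = 0;
  return () => ++attempts > 1;
}

console.log(classifyWithOneRetry(failsOnce(), true));  // "flake"
console.log(classifyWithOneRetry(failsOnce(), false)); // "fail"
console.log(classifyWithOneRetry(() => false, true));  // "fail"
```

The third case matters for the merge train: a test that fails twice in a row is still a hard fail, so a persistent regression is not masked.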

The error_regex is intentionally narrow (file + line + column from the stack trace) so other ways tx_stats_bench could fail (timeout, OOM, infra) are still surfaced as hard fails.
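As an illustration of why a narrow pattern keeps other failure modes visible, here is a sketch of such a regex in use. The exact expression in .test_patterns.yml is not reproduced here: the pattern below is an assumption built from the file name and line 268 in the failure log, and the column number and sample log lines are made up for the example.

```typescript
// Assumed pattern: file name, line 268 from the failure log, any column.
const errorRegex = /tx_stats_bench\.test\.ts:268:\d+/;

const isKnownFlake = (log: string): boolean => errorRegex.test(log);

// Hypothetical log excerpts; only the first resembles the assertion failure.
const assertionFailure = "at bench/tx_stats_bench.test.ts:268:31";
const timeoutFailure = "tx_stats_bench.test.ts: test timed out";
const otherLine = "at bench/tx_stats_bench.test.ts:120:5";

console.log(isKnownFlake(assertionFailure)); // true
console.log(isKnownFlake(timeoutFailure));   // false
console.log(isKnownFlake(otherLine));        // false
```

One trade-off of anchoring to a line number is that the pattern goes stale if the assertion moves within the file, so the entry needs updating alongside edits to the test.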

Changes

.test_patterns.yml: add a tx_stats_bench entry with an error_regex anchored to the test file's stack-trace line and *charlie as owner.

ClaudeBox logs:

- https://claudebox.work/s/6e7853d3a073145f?run=1 (initial diagnosis on the #22934 failure)
- https://claudebox.work/s/c12a360275f05ad3?run=1 (this update, on the #22405 recurrence)


Cherry-pick of #23092 onto merge-train/fairies. Same flake (`verifies transactions at 10 TPS` in `tx_stats_bench.test.ts:268`) is now blocking PRs based on this train; see #23092 for the full analysis (bb.js NativeUnixSocket backend under 8x parallel IVC verifications intermittently returning `valid:false`). Skipping at sub-test granularity keeps the other three serial sub-tests emitting their compression and single-tx verification metrics; only the IVC-verifier-under-concurrency metrics are dropped. The `.test_patterns.yml` flake-retry entry from #23083 stays in place.

BEGIN_COMMIT_OVERRIDE
fix(test): warp L1 forward when proposer scan hits EpochNotStable (AztecProtocol#22967)
test(e2e): fail epochs tests on proposer-rollup-check-failed (AztecProtocol#22965)
fix: grafana switch to aztec_status="proposed" (AztecProtocol#22978)
chore: update benchmark scraper (AztecProtocol#22984)
test(e2e): migrate simple epoch tests to pipelining (AztecProtocol#22973)
chore: remove top-level yarn.lock (AztecProtocol#22987)
refactor(archiver)!: unify L2BlockSource checkpoint lookups via query objects (AztecProtocol#22933)
fix(sequencer): bounded sweep instead of event scan for governance proposal check (AztecProtocol#22989)
fix(docs): allow webapp-tutorial yarn install to populate empty lockfile in CI (AztecProtocol#23000)
test(e2e): enable pipelining in l1-reorgs and mbps redistribution tests (AztecProtocol#23009)
fix(archiver): restore pending block height metric under pipelining (AztecProtocol#22994)
chore(p2p): remove skipped validation result option (AztecProtocol#23034)
refactor(p2p)!: remove slow tx collection flow (AztecProtocol#22878)
chore(spartan): add next-net-clone environment config (AztecProtocol#22995)
chore(sequencer): add context to proposer-rollup-check-failed logs (AztecProtocol#23071)
test(e2e): wait for archiver sync before asserting pipelining (AztecProtocol#22997)
refactor(node-rpc)!: remove deprecated AztecNode methods and L2BlockSource tip helpers (AztecProtocol#22934)
feat(p2p): detect and track announce IP changes at runtime (AztecProtocol#22405)
test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan (AztecProtocol#23083)
fix(sequencer): bind vote-only multicalls to target slot under pipelining (AztecProtocol#23090)
feat(sequencer): build optimistically across pruning epoch boundary (AztecProtocol#23056)
fix(sequencer): use chainTipsOverride.pending for log context (AztecProtocol#23098)
test(e2e): relax post-boundary slot assertion in epochs_proof_at_boundary (AztecProtocol#23108)
fix(bb-prover): pool long-lived bb verifier processes instead of spawning per-call (AztecProtocol#23093)
fix(sequencer): anchor fee asset price modifier to predicted parent (AztecProtocol#23113)
chore: error log when L1 head timestamp drifts (AztecProtocol#22947)
fix(sequencer): override full parent checkpoint cell in pipelined simulation (AztecProtocol#23073)
test(e2e): enable pipelining on missed l1 slot test (AztecProtocol#23068)
fix: more robust metrics reporting in IRM monitor (AztecProtocol#23038)
fix: preserve LMDB slashing protection (AztecProtocol#23145)
test(e2e): enable pipelining on p2p tests (AztecProtocol#23070)
fix(archiver): move L2 tips cache refresh out of write transactions (AztecProtocol#23110)
test(e2e): fix data_withholding_slash flake by freezing L1 across restart (AztecProtocol#23162)
fix(validator): include proposed checkpoint out-hashes when validating checkpoint proposals (AztecProtocol#23119)
refactor(config): drop nested config option, flatten l1Contracts (AztecProtocol#23143)
test(e2e): bump bash TIMEOUT for e2e_p2p/add_rollup to match jest 20m (AztecProtocol#23177)
fix(p2p): chunk archive of mined txs on block finalization (A-969) (AztecProtocol#23085)
fix(p2p): stream tx pool hydration to bound startup memory (A-968) (AztecProtocol#23086)
chore: remove orphan --archiver flag usages from start invocations (AztecProtocol#23186)
feat(ci): daily merge-train/spartan stale-PR notifier (AztecProtocol#23189)
fix: preserve contract artifact permissions (AztecProtocol#23174)
fix(ci3): accept slashes in /list/<path:key> for merge-train history (AztecProtocol#23160)
feat(ci): route merge-train/spartan flake notifications to #team-alpha-ci (AztecProtocol#23219)
fix(cheat-codes): wait for post-warp L2 block in warpL2TimeAtLeastTo (AztecProtocol#23213)
feat: slash attesters signing over bad checkpoints (AztecProtocol#23180)
refactor(prover-client): split orchestrator into sub-tree + top-tree pair (AztecProtocol#22996)
fix(srs): retry transient CRS HTTP downloads with exponential backoff (AztecProtocol#23244)
refactor(p2p): remove old reqresp mode (AztecProtocol#23158)
docs(sequencer-client): rewrite top-level and timing READMEs (AztecProtocol#23149)
fix(aztec-node): include upcoming checkpoint's L1 to L2 messages in simulatePublicCalls (AztecProtocol#23163)
END_COMMIT_OVERRIDE

test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan

c734212

AztecBot added the ci-draft and claudebox labels May 8, 2026

spalladino approved these changes May 8, 2026

spalladino marked this pull request as ready for review May 8, 2026 12:08

spalladino merged commit 7513e78 into merge-train/spartan May 8, 2026
37 of 43 checks passed

spalladino deleted the claudebox/fix-spartan-22934-rpc-removal branch May 8, 2026 12:08

This was referenced May 8, 2026

feat: merge-train/spartan #22980

Merged

test: skip tx_stats_bench 10 TPS sub-test on merge-train/spartan #23092

Closed

vezenovm mentioned this pull request May 8, 2026

test: skip tx_stats_bench 10 TPS sub-test on merge-train/fairies #23101

Closed

AztecBot mentioned this pull request May 12, 2026

chore(merge-train/spartan): merge next to resolve conflicts on #22980 #23188

Merged
