
test: mark tx_stats_bench 10 TPS as flake-retryable on merge-train/spartan #23083

Merged
spalladino merged 1 commit into
merge-train/spartan from
claudebox/fix-spartan-22934-rpc-removal
May 8, 2026

@AztecBot AztecBot commented May 8, 2026

Collaborator

Motivation

The `verifies transactions at 10 TPS` sub-test of `yarn-project/end-to-end/src/bench/tx_stats_bench.test.ts` is now flaking repeatedly on the `bench all` step of merge-train/spartan. It has failed on at least two different merge-train commits hours apart, with no relation to either commit's diff:

| Run | Triggering merge-train commit | CI log |
|---|---|---|
| 25546251580 | #22934 (refactor(node-rpc)! removing deprecated AztecNode methods) | http://ci.aztec-labs.com/1778227975844707 |
| 25552992890 | #22405 (feat(p2p): detect and track announce IP changes at runtime) | http://ci.aztec-labs.com/1778237470322975 |

Both runs hit the same assertion:

● transaction benchmarks › verifies transactions at 10 TPS

  expect(received).toBe(expected) // Object.is equality
  Expected: true
  Received: false
    at bench/tx_stats_bench.test.ts:268

Sub-test failing log on the latest run: http://ci.aztec-labs.com/ca459ca73d02002c (bench all parent: http://ci.aztec-labs.com/90616bad7bf7ebaa).

The other three sub-tests in the suite (compression; single private verify x20 serial; single public verify x20 serial) pass cleanly against the same proven txs in both runs. The failure is in the stress sub-test that fires 600 IVC verifications at 10/s with 8 concurrent IVC verifiers (BB_NUM_IVC_VERIFIERS=8, BB_IVC_CONCURRENCY=1). At least one verification returns valid: false under load.
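
For orientation, the shape of the stress phase can be modelled in a few lines. This is an illustrative sketch, not code from the test: `VerifyResult`, `startOffsetsMs`, and `allValid` are hypothetical names, and the failing index is arbitrary. It shows that 600 verifications at 10/s means one start every 100 ms, and that a single `valid: false` is enough to flip the final assertion.

```typescript
// Hypothetical shape of one verification outcome; not the real bench types.
interface VerifyResult {
  txIndex: number;
  valid: boolean;
}

// Start offsets (in ms) for `total` verifications fired at a fixed rate of `tps` per second.
function startOffsetsMs(total: number, tps: number): number[] {
  const intervalMs = 1000 / tps;
  return Array.from({ length: total }, (_, i) => i * intervalMs);
}

// The sub-test's assertion reduces to "every verification came back valid".
function allValid(results: VerifyResult[]): boolean {
  return results.every(r => r.valid);
}

const offsets = startOffsetsMs(600, 10);
const lastStartMs = offsets[offsets.length - 1]; // 599 * 100 ms

// Simulate 600 results where exactly one (arbitrary index) is invalid.
const results: VerifyResult[] = offsets.map((_, i) => ({ txIndex: i, valid: i !== 417 }));
const passed = allValid(results); // one invalid result fails the whole sub-test
```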

Cause

Neither triggering commit touches the IVC verifier path.

That both failures share this signature across unrelated diffs is strong evidence that the flake is independent of the merge-train commit and stems from the bench infrastructure itself.

The likely culprit is the recent bb-prover migration to the bb.js NativeUnixSocket backend (#21564), which spawns a fresh bb subprocess per Chonk verification via `withVerifierInstance`. Under 8x parallel verifications on the CPU-isolated bench host (each verifier requesting 16 threads, so 8 × 16 = 128 threads on 56 isolated cores), transient verifier failures appear. The bench-output log shows continuous `bb.js - Received signal 15, shutting down gracefully...` traffic during the 10 TPS phase: verifier instances are being torn down rapidly, and at least one verification slips through with a stale or incomplete response. Because the serial sub-tests (`numIterations = 20`, sequential) pass cleanly in both runs, this is a stress-only interaction, not a correctness regression.
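
The oversubscription figure quoted above can be checked with a small calculation. This is only a back-of-the-envelope sketch; `oversubscription` is a hypothetical helper, and the inputs are the numbers stated in this description (8 verifiers, 16 threads each, 56 isolated cores).

```typescript
// Hypothetical helper: how many threads are requested per available core.
function oversubscription(verifiers: number, threadsPerVerifier: number, cores: number) {
  const requestedThreads = verifiers * threadsPerVerifier;
  return { requestedThreads, threadsPerCore: requestedThreads / cores };
}

// 8 concurrent verifiers x 16 threads each on 56 isolated cores.
const load = oversubscription(8, 16, 56);
// load.requestedThreads is 128, i.e. a bit under 2.3 requested threads per core.
```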

Approach

Add tx_stats_bench to .test_patterns.yml with an error_regex anchored to the test file's stack-trace line (tx_stats_bench.test.ts:<line>:<col>), and assign *charlie as owner (author of the bb.js migration). With this entry, ci3/run_test_cmd retries the test once on failure and treats a single retry-pass as a flake instead of a hard fail, unblocking the merge train for unrelated commits while Charlie investigates the underlying concurrency interaction with the bb.js backend.
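
The retry behaviour described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual `ci3/run_test_cmd` logic: `RunResult`, `runWithFlakeRetry`, and `scriptedRun` are hypothetical names, and the pattern check is passed in as a plain predicate.

```typescript
type Outcome = "pass" | "flake" | "fail";

// Hypothetical model of one test run; not the real ci3/run_test_cmd interface.
interface RunResult {
  passed: boolean;
  output: string;
}

function runWithFlakeRetry(run: () => RunResult, isKnownFlake: (output: string) => boolean): Outcome {
  const first = run();
  if (first.passed) return "pass";
  // Failures that do not match a registered pattern are never retried.
  if (!isKnownFlake(first.output)) return "fail";
  // Exactly one retry: a pass is recorded as a flake, a second failure is a hard fail.
  return run().passed ? "flake" : "fail";
}

// Test double that replays a fixed sequence of results.
function scriptedRun(results: RunResult[]): () => RunResult {
  let i = 0;
  return () => results[i++] ?? { passed: false, output: "script exhausted" };
}

const matchesBench = (output: string) => output.includes("tx_stats_bench.test.ts");

// Known assertion failure, then a clean retry: reported as a flake.
const outcome = runWithFlakeRetry(
  scriptedRun([
    { passed: false, output: "at bench/tx_stats_bench.test.ts:268" },
    { passed: true, output: "" },
  ]),
  matchesBench,
);
```

In this model a failure that does not match the pattern returns `"fail"` without a second run, which is the property that keeps unrelated breakage visible.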

The error_regex is intentionally narrow (file + line + column from the stack trace) so other ways tx_stats_bench could fail (timeout, OOM, infra) are still surfaced as hard fails.
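
To illustrate what "narrow" means here, the sketch below uses an assumed pattern of the same shape (file, line, column). The exact `error_regex` committed to `.test_patterns.yml` is not reproduced in this description, and the sample log lines, including the column number, are made up for illustration.

```typescript
// Assumed pattern shape: file name followed by :<line>:<col>. Not the committed regex.
const errorRegex = /tx_stats_bench\.test\.ts:\d+:\d+/;

const samples = {
  // Column 30 is invented; the log excerpt above shows only the line number.
  assertion: "    at bench/tx_stats_bench.test.ts:268:30",
  // Hypothetical timeout and OOM output, for contrast.
  timeout: 'thrown: "Exceeded timeout of 600000 ms for a test."',
  oom: "Killed",
};

const matches = {
  assertion: errorRegex.test(samples.assertion), // matches: eligible for retry
  timeout: errorRegex.test(samples.timeout), // no match: stays a hard fail
  oom: errorRegex.test(samples.oom), // no match: stays a hard fail
};
```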

Changes

  • .test_patterns.yml: add a tx_stats_bench entry with an error_regex anchored to the test file's stack-trace line and *charlie as owner.

ClaudeBox logs:

@AztecBot AztecBot added ci-draft Run CI on draft PRs. claudebox Owned by claudebox. it can push to this PR. labels May 8, 2026
@spalladino spalladino marked this pull request as ready for review May 8, 2026 12:08
@spalladino spalladino merged commit 7513e78 into merge-train/spartan May 8, 2026
37 of 43 checks passed
@spalladino spalladino deleted the claudebox/fix-spartan-22934-rpc-removal branch May 8, 2026 12:08
ledwards2225 pushed a commit that referenced this pull request May 11, 2026
Cherry-pick of #23092 onto merge-train/fairies. Same flake (`verifies
transactions at 10 TPS` in `tx_stats_bench.test.ts:268`) is now blocking
PRs based on this train; see #23092 for the full analysis (bb.js
NativeUnixSocket backend under 8x parallel IVC verifications
intermittently returning `valid:false`).

Skipping at sub-test granularity keeps the other three serial sub-tests
emitting their compression and single-tx verification metrics; only the
IVC-verifier-under-concurrency metrics are dropped. The
`.test_patterns.yml` flake-retry entry from #23083 stays in place.
rangozd pushed a commit to rangozd/aztec-packages that referenced this pull request May 16, 2026
BEGIN_COMMIT_OVERRIDE
fix(test): warp L1 forward when proposer scan hits EpochNotStable
(AztecProtocol#22967)
test(e2e): fail epochs tests on proposer-rollup-check-failed (AztecProtocol#22965)
fix: grafana switch to aztec_status="proposed" (AztecProtocol#22978)
chore: update benchmark scraper (AztecProtocol#22984)
test(e2e): migrate simple epoch tests to pipelining (AztecProtocol#22973)
chore: remove top-level yarn.lock (AztecProtocol#22987)
refactor(archiver)!: unify L2BlockSource checkpoint lookups via query
objects (AztecProtocol#22933)
fix(sequencer): bounded sweep instead of event scan for governance
proposal check (AztecProtocol#22989)
fix(docs): allow webapp-tutorial yarn install to populate empty lockfile
in CI (AztecProtocol#23000)
test(e2e): enable pipelining in l1-reorgs and mbps redistribution tests
(AztecProtocol#23009)
fix(archiver): restore pending block height metric under pipelining
(AztecProtocol#22994)
chore(p2p): remove skipped validation result option (AztecProtocol#23034)
refactor(p2p)!: remove slow tx collection flow (AztecProtocol#22878)
chore(spartan): add next-net-clone environment config (AztecProtocol#22995)
chore(sequencer): add context to proposer-rollup-check-failed logs
(AztecProtocol#23071)
test(e2e): wait for archiver sync before asserting pipelining (AztecProtocol#22997)
refactor(node-rpc)!: remove deprecated AztecNode methods and
L2BlockSource tip helpers (AztecProtocol#22934)
feat(p2p): detect and track announce IP changes at runtime (AztecProtocol#22405)
test: mark tx_stats_bench 10 TPS as flake-retryable on
merge-train/spartan (AztecProtocol#23083)
fix(sequencer): bind vote-only multicalls to target slot under
pipelining (AztecProtocol#23090)
feat(sequencer): build optimistically across pruning epoch boundary
(AztecProtocol#23056)
fix(sequencer): use chainTipsOverride.pending for log context (AztecProtocol#23098)
test(e2e): relax post-boundary slot assertion in
epochs_proof_at_boundary (AztecProtocol#23108)
fix(bb-prover): pool long-lived bb verifier processes instead of
spawning per-call (AztecProtocol#23093)
fix(sequencer): anchor fee asset price modifier to predicted parent
(AztecProtocol#23113)
chore: error log when L1 head timestamp drifts (AztecProtocol#22947)
fix(sequencer): override full parent checkpoint cell in pipelined
simulation (AztecProtocol#23073)
test(e2e): enable pipelining on missed l1 slot test (AztecProtocol#23068)
fix: more robust metrics reporting in IRM monitor (AztecProtocol#23038)
fix: preserve LMDB slashing protection (AztecProtocol#23145)
test(e2e): enable pipelining on p2p tests (AztecProtocol#23070)
fix(archiver): move L2 tips cache refresh out of write transactions
(AztecProtocol#23110)
test(e2e): fix data_withholding_slash flake by freezing L1 across
restart (AztecProtocol#23162)
fix(validator): include proposed checkpoint out-hashes when validating
checkpoint proposals (AztecProtocol#23119)
refactor(config): drop nested config option, flatten l1Contracts
(AztecProtocol#23143)
test(e2e): bump bash TIMEOUT for e2e_p2p/add_rollup to match jest 20m
(AztecProtocol#23177)
fix(p2p): chunk archive of mined txs on block finalization (A-969)
(AztecProtocol#23085)
fix(p2p): stream tx pool hydration to bound startup memory (A-968)
(AztecProtocol#23086)
chore: remove orphan --archiver flag usages from start invocations
(AztecProtocol#23186)
feat(ci): daily merge-train/spartan stale-PR notifier (AztecProtocol#23189)
fix: preserve contract artifact permissions (AztecProtocol#23174)
fix(ci3): accept slashes in /list/<path:key> for merge-train
history (AztecProtocol#23160)
feat(ci): route merge-train/spartan flake notifications to
#team-alpha-ci (AztecProtocol#23219)
fix(cheat-codes): wait for post-warp L2 block in warpL2TimeAtLeastTo
(AztecProtocol#23213)
feat: slash attesters signing over bad checkpoints (AztecProtocol#23180)
refactor(prover-client): split orchestrator into sub-tree + top-tree
pair (AztecProtocol#22996)
fix(srs): retry transient CRS HTTP downloads with exponential backoff
(AztecProtocol#23244)
refactor(p2p): remove old reqresp mode (AztecProtocol#23158)
docs(sequencer-client): rewrite top-level and timing READMEs (AztecProtocol#23149)
fix(aztec-node): include upcoming checkpoint's L1 to L2 messages in
simulatePublicCalls (AztecProtocol#23163)
END_COMMIT_OVERRIDE