test(e2e): fix race in broadcasted_invalid_block_proposal_slash under pipelining #23302

Merged
spalladino merged 1 commit into merge-train/spartan from claudebox/fix-broadcasted-invalid-block-slash-race
May 15, 2026
@AztecBot
Collaborator

Summary

Fixes the e2e_p2p_broadcasted_invalid_block_proposal_slash failure that has been blocking the merge-train/spartan train (run https://github.com/AztecProtocol/aztec-packages/actions/runs/25896899879, test log http://ci.aztec-labs.com/2bf4e2cd2d9e7944).

The test creates the malicious proposer first (auto-starting its sequencer) and only later creates the honest nodes and waits for P2P mesh. Under enableProposerPipelining: true (turned on for this test by #23070), the malicious proposer is selected for the very next slot, builds + broadcasts the invalid proposal one slot ahead, and lands the broadcast before the honest validators have joined the mesh. They then reject it at the gossipsub checkpoint_proposal_validator with Penalizing peer for invalid slot number (since their target slot has already moved past), so the state_mismatch slashing path never runs. The malicious sequencer then gets stuck on the failed publish (Awaiting pending L1 payload submission) and never proposes again before the test times out on awaitOffenseDetected.
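
The timing failure described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual checkpoint_proposal_validator logic: the 36-second slot duration and the names `slotAt` and `checkProposalSlot` are hypothetical, and the real validator may accept a different window of slots.

```typescript
// Toy model: every name and the slot duration are assumptions for illustration.
const SLOT_DURATION_SECONDS = 36;

/** Slot index for a wall-clock timestamp, counted from genesis. */
function slotAt(genesisSeconds: number, nowSeconds: number): number {
  return Math.floor((nowSeconds - genesisSeconds) / SLOT_DURATION_SECONDS);
}

type SlotVerdict = 'accept' | 'reject_invalid_slot';

/** In this model a receiver only accepts a proposal built for the slot it is targeting. */
function checkProposalSlot(proposalSlot: number, receiverTargetSlot: number): SlotVerdict {
  return proposalSlot === receiverTargetSlot ? 'accept' : 'reject_invalid_slot';
}

// The proposer builds for slot 5 one slot ahead. A receiver that is already in the
// mesh and targeting slot 5 accepts the message for further checks. A receiver that
// only sees it once its own target has moved to slot 6 rejects it on the slot number
// alone, so the proposal's contents are never inspected.
const online = checkProposalSlot(5, 5);
const lateJoiner = checkProposalSlot(5, slotAt(0, 6 * SLOT_DURATION_SECONDS));
```

In this model the late receiver never reaches any content check, which matches the symptom in the summary: the slot-number rejection fires and the state_mismatch slashing path does not.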

This is the same race that #23070 fixed in duplicate_proposal_slash.test.ts; the same pattern is applied here:

  • Create both the invalid proposer and the honest nodes with dontStartSequencer: true.
  • After P2P mesh connectivity + committee formation, use advanceToEpochBeforeProposer to land one epoch before an epoch where the invalid proposer is scheduled.
  • Start all sequencers, then advanceToEpoch(targetEpoch, { offset: -AZTEC_SLOT_DURATION }) so the malicious slot fires while every node is online and at the same wall-clock slot.
  • After awaitOffenseDetected on one node, poll getSlashOffenses across all nodes for BROADCASTED_INVALID_BLOCK_PROPOSAL — under pipelining a given receiver may have already advanced past the build slot when the proposal arrives, so we need to catch whichever node was still in the build slot.
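
The last step, polling every node rather than a single one, might look roughly like the sketch below. It is not the code in this PR: the `Offense` and `OffenseSource` shapes, the helper names, and the timeout defaults are hypothetical, and the real `getSlashOffenses` may take arguments or return a richer type.

```typescript
// Illustrative stand-ins only; these are not the real Aztec types.
type OffenseType = 'BROADCASTED_INVALID_BLOCK_PROPOSAL' | 'OTHER';

interface Offense {
  validator: string;
  offenseType: OffenseType;
}

// Hypothetical minimal node surface used by the poller.
interface OffenseSource {
  getSlashOffenses(): Promise<Offense[]>;
}

/** Flattens per-node offense lists, keeping entries that match the validator and type. */
function matchOffenses(perNode: Offense[][], validator: string, offenseType: OffenseType): Offense[] {
  const matches: Offense[] = [];
  for (const offenses of perNode) {
    for (const offense of offenses) {
      if (offense.validator === validator && offense.offenseType === offenseType) {
        matches.push(offense);
      }
    }
  }
  return matches;
}

/** Polls all nodes until at least one reports the offense, or throws after the timeout. */
async function pollOffenseAcrossNodes(
  nodes: OffenseSource[],
  validator: string,
  offenseType: OffenseType,
  timeoutMs = 60000,
  intervalMs = 1000,
): Promise<Offense[]> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Query every node: only the ones still in the build slot will have recorded it.
    const perNode = await Promise.all(nodes.map(node => node.getSlashOffenses()));
    const matches = matchOffenses(perNode, validator, offenseType);
    if (matches.length > 0) {
      return matches;
    }
    if (Date.now() >= deadline) {
      throw new Error(`No ${offenseType} offense for ${validator} on any node within ${timeoutMs}ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

Asserting on the union of all nodes' offenses keeps the test independent of which receiver happened to be in the build slot when the proposal arrived.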

The on-chain slash assertion (rollup.listenToSlash) is preserved unchanged.

Full failure analysis: https://gist.github.com/AztecBot/39b69c1117f419145938ccd2c198f8e9

Test plan

  • CI: e2e_p2p_broadcasted_invalid_block_proposal_slash passes on merge-train/spartan.
  • Local ./bootstrap.sh ci / fast / build are not runnable in this container (no Docker socket and $HOME not writable for the container UID — yarn install fails on corepack mkdir, parallel-bootstrap can't create ~/.parallel). Fix is a direct port of a pattern already shipping green on next via the sibling duplicate_proposal_slash.test.ts.

ClaudeBox log: https://claudebox.work/s/06a4929a1971beaf?run=1

@AztecBot added labels ci-draft (Run CI on draft PRs) and claudebox (Owned by claudebox; it can push to this PR) May 15, 2026
@spalladino marked this pull request as ready for review May 15, 2026 02:56
@spalladino enabled auto-merge (squash) May 15, 2026 02:56
@AztecBot
Collaborator Author

Flaky Tests

🤖 says: This CI run detected 1 test that failed, but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/c3c25d160acf945b): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_mbps.parallel.test.ts "builds multiple blocks per slot with L2 to L1 messages" (395s) (code: 0) group:e2e-p2p-epoch-flakes

@spalladino merged commit 3ba6dbe into merge-train/spartan May 15, 2026
38 of 47 checks passed
@spalladino deleted the claudebox/fix-broadcasted-invalid-block-slash-race branch May 15, 2026 03:26
This was referenced May 15, 2026
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
refactor(p2p): merge FastTxCollection into TxCollection with sequential
pipeline (AztecProtocol#23245)
refactor(publisher): bundle-level simulate; drop per-action enqueue sims
(AztecProtocol#23165)
refactor(stdlib): remove deprecated RevertCode/TxExecutionResult aliases
(AztecProtocol#23249)
test(e2e): fix race in 'proposer invalidates multiple checkpoints'
(AztecProtocol#23259)
fix: clean up old jobs regardless of pending status (AztecProtocol#23260)
refactor(p2p): remove unused sendBatchRequest (AztecProtocol#23273)
chore(p2p): remove proposal_tx_collector leftovers (AztecProtocol#23276)
feat: slash truncated checkpoint proposals (AztecProtocol#23250)
refactor: remove unused map in attestation pool (AztecProtocol#23284)
chore(p2p): assert last block in checkpoint proposal is correct (AztecProtocol#23274)
refactor(l1-tx-utils): use DateProvider for fail-fast timeout check
(AztecProtocol#23257)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23277)
test(e2e): fix race in broadcasted_invalid_block_proposal_slash under
pipelining (AztecProtocol#23302)
fix(archiver): atomic getter for L2 tips (AztecProtocol#23295)
fix(sequencer): use targetSlot in tryVoteWhenEscapeHatchOpen under
pipelining (AztecProtocol#23296)
fix(world-state): make fork close idempotent for pruned forks (AztecProtocol#23298)
test(e2e): migrate passing tests to proposer pipelining (AztecProtocol#23275)
chore: update dashboard (AztecProtocol#23312)
chore: Revert "feat(sandbox): support proposer pipelining in local
network" (AztecProtocol#23313)
test: slash on bad attestation (AztecProtocol#23184)
feat(slasher): per-slot data-withholding watcher (A-523, A-525) (AztecProtocol#23116)
test(e2e): enable pipelining on e2e debug trace (AztecProtocol#23301)
test(e2e): enable pipelining on l1-to-l2 test (AztecProtocol#23300)
test(e2e): switch fee_settings to organic fee bumps under pipelining
(AztecProtocol#23303)
fix(ci): retry sqlite3mc-wasm download on transient DNS/TLS failures
(AztecProtocol#23333)
test(e2e): wait for real oracle rotation in fee_settings inflate helper
(AztecProtocol#23334)
test(e2e): anchor e2e_amm PXE to checkpointed tip under pipelining
(AztecProtocol#23336)
fix(spartan-bench): tolerate older node images in SlasherConfig schema
(AztecProtocol#23351)
fix: interrupt prover jobs in stop (AztecProtocol#23358)
test(e2e): enable pipelining on bot, fees, and avm simulator tests
(AztecProtocol#23329)
feat(sentinel): end-of-epoch evaluation with re-execution outcomes
(AztecProtocol#23286)
feat: slash for invalid checkpoint proposals (AztecProtocol#23270)
fix: fork closure in epoch proving jobs (AztecProtocol#23390)
fix(slasher): anchor watcher scans at archiver synced L2 slot (AztecProtocol#23394)
fix: avoid npm uplink for aztec-up local publishes (AztecProtocol#23396)
test(e2e): ignore benign 'Insufficient valid txs' block-build-failed in
epochs tests (AztecProtocol#23424)
chore: refactor weekly proving test wait (AztecProtocol#23395)
refactor: add fifo set (AztecProtocol#23271)
feat(sandbox): support proposer pipelining in local network (AztecProtocol#23327)
fix(p2p): validate BLOCK_TXS in BatchTxRequester (AztecProtocol#23371)
chore(p2p): simplify IBatchRequestTxValidator (AztecProtocol#23373)
feat(sequencer): AutomineSequencer for single-sequencer e2e tests
(AztecProtocol#23354)
fix(prover): wait for previous epoch to be proven (AztecProtocol#23458)
chore: collocate provers (AztecProtocol#23439)
chore: rm staging-ignition (AztecProtocol#23440)
chore: rm unused networks (AztecProtocol#23441)
test(e2e): migrate block_building, multi_validator_node,
publisher_funding, invalid_checkpoint_proposal to pipelining (AztecProtocol#23414)
fix(archiver): reconcile local blocks with L1 checkpoints by block
number (AztecProtocol#23461)
feat: Updated slash conditions on block proposals (AztecProtocol#23466)
test(e2e): migrate HA full test to pipelining (AztecProtocol#23463)
chore: update resource profiles (AztecProtocol#23442)
chore: update debug log levels (AztecProtocol#23456)
test: fix flaky sentinel_status_slash by asserting the fault on the
checkpoint slot (AztecProtocol#23483)
feat(slasher): slash checkpoint equivocation between P2P and L1 (A-980)
(AztecProtocol#23436)
refactor(slasher): rename ATTESTED_DESCENDANT_OF_INVALID ->
PROPOSED_DESCENDANT_OF_CHECKPOINT_WITH_INVALID_ATTESTATIONS (AztecProtocol#23468)
fix: reject block proposals in poisoned slots (AztecProtocol#23411)
fix: retry nargo dep + solc downloads to survive transient DNS drops
(AztecProtocol#23490)
fix: enrich json-rpc tracing (AztecProtocol#23412)
feat: add trace export controls (AztecProtocol#23413)
test(e2e): assert no equivocation offenses in HA full test (AztecProtocol#23496)
test: cover invalid checkpoint proposal slashing (AztecProtocol#23503)
test(e2e): migrate more e2e suites to proposer pipelining (AztecProtocol#23482)
test: flag e2e_slashing_attested_invalid_proposal as flake under
pipelining (AztecProtocol#23501)
test: flag e2e_p2p_duplicate_proposal_slash as flake under pipelining
(AztecProtocol#23515)
test(e2e): require cross-observer agreement on sentinel fault slot
(AztecProtocol#23513)
test: flag e2e_ha_full afterAll hook timeout as flake under pipelining
(AztecProtocol#23524)
fix(e2e): propagate l1ContractsArgs into node config so archiver matches
L1 (AztecProtocol#23514)
test: flag e2e_multi_validator_node_key_store P2P tx-dropped failure as
flake (AztecProtocol#23528)
test(cheat-codes): retry warpL2TimeAtLeastTo in-current-slot test on L1
race (AztecProtocol#23533)
test(e2e_ha_full): parallel HA peer node teardown with per-node deadline
(AztecProtocol#23539)
test: flag e2e_ha_full as flake under HA pipelining (AztecProtocol#23541)
test(ci): skip e2e_ha_full entirely on merge-train/spartan (AztecProtocol#23542)
test(ci): skip e2e_multi_validator_node_key_store entirely on
merge-train/spartan (AztecProtocol#23544)
END_COMMIT_OVERRIDE