test(e2e): wait warmup slots in slashing tests #23719

Merged: PhilWindle merged 1 commit into merge-train/spartan from spl/try-deflake-slashing on Jun 1, 2026
@spalladino

Attempt at deflaking slashing tests by keeping some warmup slots. Codex analysis below.

Cause

The duplicate proposal slash test was timing-fragile after proposer pipelining. In the failed CI run, the malicious shared-key validators were selected for the first slot of the target epoch immediately after the test started sequencers and warped time forward.

Because pipelining builds a proposal one slot ahead, selecting the first slot meant the malicious nodes had to start building at the exact warp boundary. They started late in the build sub-slot, serialized the duplicate proposals after the receiver nodes had already advanced to the next slot, and the receivers rejected the gossip as late (invalid slot number) before duplicate-proposal detection could run.

Failed run: 36837b7edc543e70

Analysis

The failed run never produced a DUPLICATE_PROPOSAL offense. It only produced a DUPLICATE_ATTESTATION offense, then timed out waiting for the expected proposal offense.

Timeline from the failed run:

  • malicious validators were selected for slot 10
  • both started building late in sub-slot 3
  • both broadcast checkpoint proposals for slot 10
  • receivers had already advanced to slot 11 and rejected slot 10 proposals as stale
  • the generic offense wait was satisfied by a duplicate attestation
  • the final DUPLICATE_PROPOSAL assertion timed out

A passing run selected a later proposer slot instead. That gave the sequencers a full warmup slot after the warp, both malicious nodes built early enough, standalone duplicate proposals were broadcast in time, and DUPLICATE_PROPOSAL was detected.
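
The ordering problem in the timeline can be illustrated with a toy receiver model. This is a sketch under stated assumptions, not the actual p2p validation code: `ToyReceiver`, `ToyProposal`, and the outcome strings are hypothetical names, and slot 11 for the passing case is an illustrative value, since the passing run's slot number is not given above. The only behavior it models is that the slot check runs before duplicate detection, so a stale proposal never reaches the duplicate check.

```typescript
// Toy model only: names and rules here are hypothetical and do not mirror
// the real gossip validation pipeline.
type ToyProposal = { slot: number; proposer: string; payloadId: string };
type ToyOutcome = 'accepted' | 'rejected-late' | 'duplicate-proposal';

class ToyReceiver {
  // First payload seen for each "slot:proposer" key.
  private readonly seen = new Map<string, string>();
  private readonly currentSlot: number;

  constructor(currentSlot: number) {
    this.currentSlot = currentSlot;
  }

  receive(p: ToyProposal): ToyOutcome {
    // The slot check runs first: a stale proposal is dropped here and
    // never reaches duplicate detection below.
    if (p.slot < this.currentSlot) {
      return 'rejected-late';
    }
    const key = `${p.slot}:${p.proposer}`;
    const first = this.seen.get(key);
    if (first !== undefined && first !== p.payloadId) {
      return 'duplicate-proposal';
    }
    this.seen.set(key, p.payloadId);
    return 'accepted';
  }
}

// Failed run: receivers are already in slot 11 when both slot-10 proposals arrive.
const lateReceiver = new ToyReceiver(11);
const failedRunOutcomes: ToyOutcome[] = [
  lateReceiver.receive({ slot: 10, proposer: 'shared-key', payloadId: 'a' }),
  lateReceiver.receive({ slot: 10, proposer: 'shared-key', payloadId: 'b' }),
];

// Passing run (illustrative slot number): both proposals arrive while the
// receiver is still in the proposal's slot, so the second one is flagged.
const onTimeReceiver = new ToyReceiver(11);
const passingRunOutcomes: ToyOutcome[] = [
  onTimeReceiver.receive({ slot: 11, proposer: 'shared-key', payloadId: 'a' }),
  onTimeReceiver.receive({ slot: 11, proposer: 'shared-key', payloadId: 'b' }),
];

console.log(failedRunOutcomes, passingRunOutcomes);
```

In this model the failed run yields two late rejections and no duplicate detection, while the passing run accepts the first proposal and flags the second as a duplicate.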

Fix Rationale

Update advanceToEpochBeforeProposer to skip the first warmupSlots slots of the target epoch and return the concrete targetSlot. The duplicate proposal and duplicate attestation tests still warp to one slot before the target epoch, but now the selected malicious proposer slot is at least one slot into that epoch.

That gives freshly-started sequencers one full slot of wall-clock warmup before the pipelined build for the malicious slot begins, avoiding the startup/warp boundary race where duplicate proposals are emitted too late and rejected before slash detection.
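
The selection rule described above can be sketched as follows. This is a minimal illustration, not the real `advanceToEpochBeforeProposer`: the helper names `pickTargetSlot` and `warmupBeforeBuild`, their signatures, the epoch duration of 5, and the proposer schedule are all assumptions made for the example. It assumes the warp lands one slot before the target epoch and that the pipelined build for slot S starts during slot S - 1, as described in the Cause section.

```typescript
// Hypothetical helper: returns the first slot in the target epoch, at or after
// epochStartSlot + warmupSlots, for which the predicate holds. Returns
// undefined if no slot in the epoch qualifies (the real helper may behave
// differently in that case).
function pickTargetSlot(
  epochStartSlot: number,
  epochDuration: number,
  warmupSlots: number,
  isMaliciousProposer: (slot: number) => boolean,
): number | undefined {
  const firstCandidate = epochStartSlot + warmupSlots;
  const epochEndExclusive = epochStartSlot + epochDuration;
  for (let slot = firstCandidate; slot < epochEndExclusive; slot++) {
    if (isMaliciousProposer(slot)) {
      return slot;
    }
  }
  return undefined;
}

// Full slots that elapse between the warp and the start of the pipelined
// build, assuming the build for targetSlot begins during targetSlot - 1.
function warmupBeforeBuild(warpSlot: number, targetSlot: number): number {
  return targetSlot - 1 - warpSlot;
}

// Assumed schedule: the malicious validators propose at slots 10 and 12,
// the target epoch starts at slot 10, and the test warps to slot 9.
const assumedMaliciousSlots = new Set([10, 12]);
const isMalicious = (slot: number) => assumedMaliciousSlots.has(slot);

const withoutWarmup = pickTargetSlot(10, 5, 0, isMalicious); // slot 10, no warmup
const withWarmup = pickTargetSlot(10, 5, 1, isMalicious); // slot 12, skips slot 10

console.log(withoutWarmup, withWarmup);
```

Under these assumptions, selecting slot 10 leaves zero full slots between the warp and the start of the build, which is the boundary race from the failed run. Skipping the first slot of the epoch guarantees at least one.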

@spalladino added the ci-no-fail-fast label (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) on May 29, 2026
@PhilWindle merged commit 093ceb9 into merge-train/spartan on Jun 1, 2026
23 of 29 checks passed
@PhilWindle deleted the spl/try-deflake-slashing branch on June 1, 2026 14:58
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
chore: deploy next-net and reuse contracts (AztecProtocol#23761)
chore: turn on autoscaling (AztecProtocol#23706)
chore: rename staging-public to staging (AztecProtocol#23767)
chore(p2p): use sync hash for tx validation hashing (AztecProtocol#23768)
test(e2e): wait warmup slots in slashing tests (AztecProtocol#23719)
feat(api)!: make getTxReceipt the single tx-lookup API (AztecProtocol#23660)
fix: cap cloned n_tps fees within sponsored FPC balance (AztecProtocol#23770)
fix: protect HA validator Postgres from cluster scale-down (AztecProtocol#23772)
refactor: remove non-pipelining sequencer code path (AztecProtocol#23665)
feat(archiver): add getL2ToL1MembershipWitness node RPC (AztecProtocol#23646)
fix(p2p)!: revamp BLOCK_TXS validations (AztecProtocol#23778)
chore: name the bots (AztecProtocol#23795)
fix(e2e): ensure BBSync init (AztecProtocol#23793)
fix(p2p)!: fix BLOCK_TXS response under proposer equivocation (AztecProtocol#23786)
fix: reconnect L1 port-forward after epoch-boundary sleep in n_tps_prove
(AztecProtocol#23800)
chore: add empty vscode settings for yarn-project (AztecProtocol#23808)
fix(sequencer): only warn about missing proposed checkpoint once overdue
(AztecProtocol#23807)
fix: refresh n_tps fee quotes during sustained benchmark (AztecProtocol#23797)
fix(sequencer): enforce build-frame deadlines and align
attestation/publish windows (AztecProtocol#23776)
END_COMMIT_OVERRIDE
spalladino added a commit that referenced this pull request Jun 4, 2026
## Motivation

`v5-next` was cut from `next` at cbc99df (Jun 1), so PRs merged to
`merge-train/spartan` after the cut never flowed into it. This backports
all of them (authored by @spalladino and @fcarreiro) to keep v5-next
current with the spartan train.

## Approach

Each PR is cherry-picked from its squashed merge commit on
`merge-train/spartan`, in merge order, preserving the original commit
message and PR number — one commit per backported PR. All 11 applied
cleanly with no conflicts; patches are identical to the originals
(verified via `git patch-id`), and `bootstrap.sh build yarn-project`
passes on the result. Labeled `ci-no-squash` to preserve the per-PR
commits.

## Backported PRs

- #23768 — chore(p2p): use sync hash for tx validation hashing
- #23719 — test(e2e): wait warmup slots in slashing tests
- #23660 — feat(api)!: make getTxReceipt the single tx-lookup API
- #23665 — refactor: remove non-pipelining sequencer code path
- #23646 — feat(archiver): add getL2ToL1MembershipWitness node RPC
- #23778 — fix(p2p)!: revamp BLOCK_TXS validations
- #23786 — fix(p2p)!: fix BLOCK_TXS response under proposer equivocation
- #23808 — chore: add empty vscode settings for yarn-project
- #23807 — fix(sequencer): only warn about missing proposed checkpoint
once overdue
- #23776 — fix(sequencer): enforce build-frame deadlines and align
attestation/publish windows
- #23818 — chore(p2p): BlockTxsRequest comment

Note #23660, #23778, and #23786 are breaking changes (node RPC +
tx-effect db format, and p2p wire format respectively), as they were on
`next`.