test(e2e): pick bad slots upfront and warp to them in proposer invalidates multiple checkpoints #24017

Merged
spalladino merged 1 commit into merge-train/spartan-v5 from cb/fix-invalidate-multiple-checkpoints-warp on Jun 11, 2026
@AztecBot

Collaborator

Fixes a flake in the "proposer invalidates multiple checkpoints" test (e2e_epochs/epochs_invalidate_block.parallel.test.ts) reported on v5-next (failed run). Replaces #24016, which was based on merge-train/spartan; this PR targets the v5 line where the flake fired and restructures the test instead of just resizing the timeout.

Root cause of the flake

The failure was TimeoutError: Operation timed out after 256000ms, raised by the bare 8-slot timeoutPromise that waits for the two bad checkpoints. The bad-slot search from #23608 rejects any candidate pair whose proposer also owns an earlier un-snapshotted pipelined slot, and the rejection window grows with each attempt. In the failed run the current slot was 21 and the search rejected (24,25)…(29,30) before accepting slots 30/31, which are 9–10 slots out. The fixed 256s wait expired at 22:48:55, before slot 30 even began (~22:49:00), while the chain kept mining healthy checkpoints at slots 22–28 underneath, so the run was unwinnable from the moment the slots were selected. The race's .then(() => [CheckpointNumber(0), …]) fallback was also dead code, since timeoutPromise rejects rather than resolves.
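Both observations above can be checked in a few lines. The sketch below is illustrative only: the 32 s slot length is inferred from 256000 ms over 8 slots, `startsBeforeTimeout`, `rejectingTimeout`, and `raceWithFallback` are hypothetical names, and `rejectingTimeout` is a stand-in for a timeout that rejects, not the real timeoutPromise implementation.

```typescript
// Assumption: the 8-slot timeout of 256000 ms implies a 32 s L2 slot.
const TIMEOUT_MS = 256_000;
const TIMEOUT_SLOTS = 8;
const SLOT_MS = TIMEOUT_MS / TIMEOUT_SLOTS; // 32000

/**
 * Simplified model (hypothetical helper): a wait that starts somewhere inside
 * `currentSlot` can only see `targetSlot` begin if fewer than `timeoutSlots`
 * whole slots lie between the end of `currentSlot` and the start of `targetSlot`.
 */
function startsBeforeTimeout(currentSlot: number, targetSlot: number, timeoutSlots: number): boolean {
  const wholeSlotsBetween = targetSlot - (currentSlot + 1);
  return wholeSlotsBetween < timeoutSlots;
}

/** Stand-in for a timeout that rejects (not the real helper). */
function rejectingTimeout(ms: number): Promise<never> {
  return new Promise((_, reject) =>
    setTimeout(() => reject(new Error(`Operation timed out after ${ms}ms`)), ms),
  );
}

/** Races a never-settling operation against a rejecting timeout with a `.then` fallback. */
async function raceWithFallback(): Promise<string> {
  const neverSettles = new Promise<string>(() => {});
  try {
    // The `.then` callback only runs on fulfilment, so it never fires here.
    return await Promise.race([neverSettles, rejectingTimeout(10).then(() => "fallback")]);
  } catch (err) {
    return `rejected: ${(err as Error).message}`;
  }
}

console.log(`slot length: ${SLOT_MS} ms`);
console.log(startsBeforeTimeout(21, 30, TIMEOUT_SLOTS)); // false: slot 30 begins after the wait expires
console.log(startsBeforeTimeout(21, 29, TIMEOUT_SLOTS)); // true
raceWithFallback().then(result => console.log(result)); // rejected: Operation timed out after 10ms
```

Under this model, a wait started in slot 21 cannot reach slot 30 at all, which matches the timestamps in the failed run.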

Fix: search first, then warp

Instead of starting the sequencers and waiting in real time for whatever slots the search lands on:

  • With sequencers stopped, search for a warpSlot such that the proposers of the three lead-in slots warpSlot+1..warpSlot+3 are not the proposers of the bad slots warpSlot+4/warpSlot+5. A far-away candidate now costs a warp instead of a real-time wait, and EpochNotStable during the search is handled by warping forward one epoch (same pattern as the archiver skips a descendant test in this file).
  • Warp to one L1 block before warpSlot, so sequencers get a full L2 slot to boot before the first pipelined build window we rely on (end of warpSlot, targeting warpSlot+1).
  • Start the sequencers and wait for the first good checkpoint (lands at warpSlot, or up to warpSlot+2 on a slow start).
  • Apply the malicious config to the bad-slot proposers. The three good lead-in slots guarantee no pipelined job before badSlot1 can snapshot it, since jobs snapshot config during the last L1 slot of the previous L2 slot.
  • Fail fast with a clear assertion if config application was somehow late enough to reach badSlot1's build window, rather than timing out opaquely.
  • The 8-slot wait for the bad checkpoints is now correctly sized by construction (badSlot2 is at most ~6 slots from the wait start), and gets a descriptive timeout message.
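The search in the first step can be sketched roughly as follows. This is not the code from the PR: `findWarpSlot`, `ProposerLookup`, and `SlotPlan` are hypothetical names, the proposer lookup is a plain function over a fake schedule rather than a chain query, and the EpochNotStable handling (warping forward one epoch) is left out.

```typescript
type Address = string;

/** Hypothetical lookup; the real test would ask the chain for a slot's proposer. */
type ProposerLookup = (slot: number) => Address;

interface SlotPlan {
  warpSlot: number;
  leadInSlots: number[];
  badSlots: [number, number];
}

/**
 * Returns the first warpSlot at or after `firstCandidate` whose three lead-in
 * slots (warpSlot+1..warpSlot+3) have no proposer in common with the two bad
 * slots (warpSlot+4 and warpSlot+5).
 */
function findWarpSlot(getProposer: ProposerLookup, firstCandidate: number, maxCandidates = 64): SlotPlan {
  for (let warpSlot = firstCandidate; warpSlot < firstCandidate + maxCandidates; warpSlot++) {
    const leadInSlots = [warpSlot + 1, warpSlot + 2, warpSlot + 3];
    const badSlots: [number, number] = [warpSlot + 4, warpSlot + 5];
    const badProposers = new Set(badSlots.map(getProposer));
    const clash = leadInSlots.some(slot => badProposers.has(getProposer(slot)));
    if (!clash) {
      return { warpSlot, leadInSlots, badSlots };
    }
  }
  throw new Error(`No suitable warp slot within ${maxCandidates} candidates of slot ${firstCandidate}`);
}

// Fake proposer schedule indexed by slot, for illustration only.
const schedule: Address[] = ["A", "B", "C", "A", "D", "C", "E", "F", "G", "H"];
const getProposer: ProposerLookup = slot => schedule[slot % schedule.length];

const plan = findWarpSlot(getProposer, 0);
console.log(plan.warpSlot, plan.badSlots); // 2 [ 6, 7 ]
```

With this schedule, candidates 0 and 1 are rejected because proposer C owns both a lead-in slot and a bad slot. Each rejection only moves the warp target, so it costs no real-time waiting.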

In the worst case, the wait phase is bounded at ~6 slots regardless of how many candidates the search rejects, whereas previously each rejected candidate pushed the bad checkpoints one slot further past the fixed timeout.
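A rough sketch of why the bound holds, together with a timeout wrapper that carries a descriptive message. The names `slotsUntilBadSlot2Ends` and `withTimeout` are hypothetical, the 32 s slot length is inferred from 256000 ms over 8 slots, and the model assumes the wait starts no earlier than warpSlot with badSlot2 at warpSlot+5.

```typescript
// Assumption: 32 s slots (256000 ms / 8 slots).
const SLOT_MS = 32_000;
const WAIT_SLOTS = 8;

/** Slots from the start of `waitStartSlot` to the end of badSlot2 (= warpSlot + 5). */
function slotsUntilBadSlot2Ends(warpSlot: number, waitStartSlot: number): number {
  const badSlot2 = warpSlot + 5;
  return badSlot2 - waitStartSlot + 1;
}

/** Rejects with `message` if `promise` has not settled within `ms` milliseconds. */
function withTimeout<T>(promise: Promise<T>, ms: number, message: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${message} (timed out after ${ms}ms)`)), ms);
    promise.then(
      value => {
        clearTimeout(timer);
        resolve(value);
      },
      err => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// The number of rejected candidates does not appear anywhere in the bound.
console.log(slotsUntilBadSlot2Ends(100, 100)); // 6: first good checkpoint lands at warpSlot
console.log(slotsUntilBadSlot2Ends(100, 102)); // 4: slow start, wait begins at warpSlot+2
console.log(`wait budget: ${WAIT_SLOTS * SLOT_MS} ms`); // wait budget: 256000 ms

// Short timeout used here only so the example finishes quickly.
withTimeout(new Promise<never>(() => {}), 20, "Timed out waiting for the two bad checkpoints").catch(err =>
  console.log((err as Error).message),
);
```

The 8-slot budget therefore leaves about two slots of slack over the worst case in this model.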


Created by claudebox · group: slackbot

@AztecBot added labels on Jun 11, 2026: ci-draft (Run CI on draft PRs), ci-no-fail-fast (sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure), claudebox (owned by claudebox; it can push to this PR)
@spalladino marked this pull request as ready for review June 11, 2026 14:36
@spalladino enabled auto-merge (squash) June 11, 2026 14:37
@spalladino merged commit 227a74e into merge-train/spartan-v5 Jun 11, 2026
44 of 52 checks passed
@spalladino deleted the cb/fix-invalidate-multiple-checkpoints-warp branch June 11, 2026 14:37