refactor(e2e): enable pipelining in e2e_epochs tests #22544

Merged
spalladino merged 7 commits into merge-train/spartan from palla/tests-to-pipelining on Apr 22, 2026
spalladino (Contributor) commented Apr 15, 2026

  • Enables enableProposerPipelining: true in several e2e_epochs test suites
  • Adds inboxLag: 2 for validator-based tests (required with pipelining)
  • Increases timeouts for solo-sequencer tests (~2x slots per checkpoint with pipelining)
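
The three changes above can be sketched as a small helper. This is only an illustration: the `EpochsTestOptions` shape, the `withPipelining` helper, and the `checkpointTimeoutSeconds` field are hypothetical and not taken from the repository; only `enableProposerPipelining` and `inboxLag` are named in this PR.

```typescript
// Hypothetical option shape: only `enableProposerPipelining` and `inboxLag`
// are named in this PR; `checkpointTimeoutSeconds` is an illustrative stand-in.
interface EpochsTestOptions {
  enableProposerPipelining?: boolean;
  inboxLag?: number;
  checkpointTimeoutSeconds?: number;
}

// Hypothetical helper that applies the three changes listed above.
function withPipelining(base: EpochsTestOptions, usesValidators: boolean): EpochsTestOptions {
  const opts: EpochsTestOptions = { ...base, enableProposerPipelining: true };
  if (usesValidators) {
    // Validator-based suites also set inboxLag to 2.
    opts.inboxLag = 2;
  } else if (base.checkpointTimeoutSeconds !== undefined) {
    // Solo-sequencer suites allow roughly twice as long per checkpoint.
    opts.checkpointTimeoutSeconds = base.checkpointTimeoutSeconds * 2;
  }
  return opts;
}

const validatorOpts = withPipelining({ checkpointTimeoutSeconds: 60 }, true);
const soloOpts = withPipelining({ checkpointTimeoutSeconds: 60 }, false);
```

The 60-second base timeout is an arbitrary example value, chosen only to make the doubling visible.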

spalladino added the ci-draft label (Run CI on draft PRs) Apr 15, 2026
spalladino changed the base branch from next to palla/skip-initial-sequencer-in-e2e April 15, 2026 13:27
spalladino force-pushed the palla/tests-to-pipelining branch from 0922fb0 to 8692be8 April 15, 2026 13:39
spalladino marked this pull request as ready for review April 20, 2026 14:46
spalladino added the ci-no-fail-fast label (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) and removed the ci-draft label (Run CI on draft PRs) Apr 20, 2026
Base automatically changed from palla/skip-initial-sequencer-in-e2e to merge-train/spartan April 21, 2026 12:05
spalladino force-pushed the palla/tests-to-pipelining branch from c14cc38 to 858af40 April 21, 2026 12:12
spalladino enabled auto-merge (squash) April 21, 2026 12:12
spalladino force-pushed the palla/tests-to-pipelining branch from 858af40 to 2f9926e April 21, 2026 18:39
spalladino disabled auto-merge April 21, 2026 21:21
spalladino force-pushed the palla/tests-to-pipelining branch from 2f9926e to 4d8f18c April 21, 2026 21:21
spalladino changed the base branch from merge-train/spartan to spl/fix-genesis-world-state-again April 21, 2026 21:22
spalladino (Contributor, Author) commented:

Do not merge until the base is merged against the merge-train. I've only rebased on top of it to check if it fixed CI.

Base automatically changed from spl/fix-genesis-world-state-again to merge-train/spartan April 22, 2026 08:43
spalladino and others added 7 commits April 22, 2026 11:16
Enable `enableProposerPipelining: true` in 13 of 19 e2e_epochs test suites.

Tests with pipelining enabled:
- epochs_simple_block_building, epochs_empty_blocks_proof, epochs_multiple
- epochs_long_proving_time, epochs_multi_proof, epochs_proof_public_cross_chain
- epochs_upload_failed_proof, epochs_ha_sync, epochs_mbps.parallel
- epochs_sync_after_reorg, epochs_partial_proof, epochs_manual_rollback
- epochs_invalidate_block.parallel

Key changes:
- Validator-based tests also need `inboxLag: 2` with pipelining
- Solo-sequencer tests need ~2x timeouts for waitUntilCheckpointNumber
  (sequencer must wait for L1 tx to land before simulating the next)

Tests where pipelining is NOT enabled (incompatible):
- epochs_proof_fails, epochs_missed_l1_slot, epochs_l1_reorgs:
  deliberately manipulate L1 tx timing/mining
- epochs_first_slot, epochs_high_tps_block_building:
  tight timing (3 L1 blocks per L2 slot)
- epochs_mbps_redistribution:
  pipelining changes block building timing affecting redistribution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ipelining

With pipelining enabled, if the proposer builds blocks spanning 3+ checkpoints
in a single slot, non-proposer nodes reject checkpoint N+2 blocks with
CheckpointNumberNotSequentialError (checkpoint N+1 hasn't been confirmed on L1
yet). This caused the L2-to-L1 messages sub-test to fail.

Fix by increasing maxTxsPerBlock so fewer blocks are built per phase, keeping
each proposer within 2 checkpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
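
To illustrate why raising maxTxsPerBlock reduces the number of blocks built per phase, here is a minimal arithmetic sketch. The `blocksNeeded` helper and the transaction counts are hypothetical; real block counts in the test also depend on slot timing, which this sketch ignores.

```typescript
// Hypothetical helper: the number of blocks required to include `txCount`
// transactions when each block holds at most `maxTxsPerBlock` of them.
function blocksNeeded(txCount: number, maxTxsPerBlock: number): number {
  if (txCount < 0 || Math.floor(txCount) !== txCount) {
    throw new RangeError("txCount must be a non-negative integer");
  }
  if (maxTxsPerBlock < 1 || Math.floor(maxTxsPerBlock) !== maxTxsPerBlock) {
    throw new RangeError("maxTxsPerBlock must be a positive integer");
  }
  return Math.ceil(txCount / maxTxsPerBlock);
}

// Example values only: the same 12 txs need fewer blocks with a higher cap.
const blocksWithLowCap = blocksNeeded(12, 2);  // 6 blocks
const blocksWithHighCap = blocksNeeded(12, 6); // 2 blocks
```

Fewer blocks per phase means less chance that a single proposer's blocks spill into a third checkpoint within one slot.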
With MBPS (multiple blocks per slot), a single pipelining proposer can build
blocks spanning 3+ checkpoints in one slot. Non-proposer nodes reject checkpoint
N+2 with CheckpointNumberNotSequentialError because N+1 hasn't been confirmed
on L1 yet. This is a platform limitation, not a test issue.

The dedicated epochs_mbps.pipeline.parallel test already covers MBPS+pipelining
with wider timing (72s slots, 12s L1 blocks) that avoids this race.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This test asserts zero sequencer errors (failEvents). Pipelining produces
transient Rollup__InvalidArchive / ProposedCheckpointNotSequentialError when
checkpoint N's L1 tx is slow to land while the pipelined proposer builds
checkpoint N+1. These are recoverable but logged as errors, breaking the
strict zero-error assertion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
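
A strict zero-error check of the kind described above might look like the following sketch. The `FailEvent` shape and the `assertNoFailEvents` helper are hypothetical and are not the test's actual code; the error names are the ones quoted in the commit message.

```typescript
// Hypothetical event shape; the real failEvents payload is not shown in this PR.
interface FailEvent {
  reason: string;
}

// Hypothetical strict assertion: any recorded sequencer error fails the test,
// whether or not the sequencer later recovered from it.
function assertNoFailEvents(failEvents: FailEvent[]): void {
  if (failEvents.length > 0) {
    const reasons = failEvents.map(e => e.reason).join(", ");
    throw new Error(`Expected zero sequencer errors, got: ${reasons}`);
  }
}

// A single transient error is enough to break the assertion.
const transientEvents: FailEvent[] = [{ reason: "ProposedCheckpointNotSequentialError" }];
```

Because the check counts events rather than final state, a recoverable error logged while checkpoint N's L1 tx is still pending fails the test just like a permanent one.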
- Delete PIPELINING_MIGRATION.md (tracked in Linear A-912 sub-issues instead)
- Revert invalidation assertion to expect earliest checkpoint invalidated
- Remove inline comments explaining why pipelining was not enabled

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spalladino force-pushed the palla/tests-to-pipelining branch from 4d8f18c to 471f1b8 April 22, 2026 14:16
spalladino enabled auto-merge (squash) April 22, 2026 14:16
spalladino merged commit 581417a into merge-train/spartan Apr 22, 2026
12 checks passed
spalladino deleted the palla/tests-to-pipelining branch April 22, 2026 14:36
chrismarino pushed a commit to chrismarino/aztec-packages that referenced this pull request May 5, 2026
BEGIN_COMMIT_OVERRIDE
fix(kv-store): ensure LMDB cursor is closed on iteration abort (AztecProtocol#22509)
fix(telemetry-client): use appropriate histogram buckets for L1 gas
prices (AztecProtocol#22512)
fix(telemetry-client): log warning when BatchSpanProcessor drops spans
(AztecProtocol#22511)
fix(stdlib): wrap HA signer databaseUrl in SecretValue (AztecProtocol#22510)
fix(prover-client): don't mark in-progress epoch N jobs as stale when
epoch N+1 starts (AztecProtocol#22508)
chore: (A-730) graceful shutdown for services in node startup failure
path (AztecProtocol#22112)
fix(prover-client): reject stale job promises and count timeouts toward
retry limit (AztecProtocol#21842)
feat(archiver): validate historical L1 log availability at startup
(AztecProtocol#22644)
fix(archiver): do not query MessageSent events by blockhash (AztecProtocol#22641)
refactor(e2e): skip initial sequencer in p2p and epochs tests (AztecProtocol#22535)
fix: handle missing L1 finalized block on devnets (AztecProtocol#22663)
fix(world-state): treat historical block 0 queries as historical, not
latest (AztecProtocol#22679)
fix(sequencer): re-check parent checkpoint validity before pipelined L1
submission (AztecProtocol#22586)
fix(world-state): make block 0 a first-class historical block (AztecProtocol#22711)
chore: show all running versions (AztecProtocol#22376)
chore: fix prettier inside worktrees (AztecProtocol#22557)
feat: use optimized verifier for rollup (AztecProtocol#21840)
fix(kv-store): skip pool creation on ephemeral deleteDb to unstick
browser tests (AztecProtocol#22693)
chore: rm claude lockfile (AztecProtocol#22718)
fix(e2e): wait for first checkpoint in fee_asset_price_oracle_gossip
test (AztecProtocol#22719)
chore(prover-node): track estimated L1 fee when proof publishing is
disabled (AztecProtocol#22691)
fix(ci): rerun squashed PR check on base branch change (AztecProtocol#22713)
feat(archiver): decouple calldata from blob fetching in L1 synchronizer
(AztecProtocol#22716)
refactor(e2e): enable pipelining in e2e_epochs tests (AztecProtocol#22544)
feat(p2p): reject and evict txs with insufficient max fee per gas
(AztecProtocol#22118)
refactor(world-state): always index block 0 regardless of initial tree
size (AztecProtocol#22724)
fix(e2e): fix redistribution test (AztecProtocol#22729)
END_COMMIT_OVERRIDE
Labels

ci-no-fail-fast Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure
