feat: merge-train/spartan-v5#23910
Merged
Merged
Conversation
…les (#23809) ## Initial words from the human behind this changeset So it's not all AI! This PR splits the timetable into two. One of them is related to consensus, and defines when proposals and attestations are to be accepted. The other is related to block building, only used by the proposer. Aside from the split, we also redefine some of the times, such that: - Block building starts one Ethereum slot before the start of the build slot. - The proposer finishes block building such that, assuming all validators respond in time, it has its checkpoint ready to submit to L1 exactly one Ethereum slot before the start of the target slot, to maximize chances for inclusion. - Should validators _not_ respond in time, the sequencer will still wait for attestations until one Ethereum slot before the end of its target slot, ie the last possible moment for submission. The network considers attestations valid up until this point. - On the other hand, checkpoint proposals are accepted up until a few seconds before the block-building start of the next proposer. Those "few seconds" are one block duration, so the next proposer has time to reexecute and catch up. Everything below this line is now AI generated. It's not that bad to read though. ## What & why Restructures the sequencer timing model ("timetable") from a single state-gated model into **two explicit, slot-keyed timetables**, so the proposer's *scheduling* and the *consensus deadlines* validators and the next proposer must agree on are cleanly separated and queried directly — instead of being inferred from sequencer state transitions. This PR focuses on the new structure; the change count is large mostly because the old `CheckpointTimingModel` and its state-gated plumbing are replaced wholesale.  ## The split - **`ConsensusTimetable`** — consensus acceptance bounds only, derived from protocol constants `{genesis_time, aztec_slot_duration, ethereum_slot_duration, block_duration}`. Returns the receive-window bounds and the single attestation deadline. Used by validator-client, p2p, the publisher's L1-tx timeout, and next-proposer handoff reasoning. - **`ProposerTimetable`** — composes the consensus one and adds the proposer's ideal/happy-path schedule and the block sub-slot timetable, with the operational budgets `{min_block_duration, p2p_propagation_time, checkpoint_proposal_prepare_time}`. Used only by the sequencer / checkpoint-proposal job. ## Key design points - **Single `attestation_deadline`** (`target_slot_start + aztec_slot_duration − 2 · ethereum_slot_duration`) replaces the old publish / attestation-receive / proposal-validation deadlines and removes `l1_publish_prepare_time`. It is the one hard, **consensus-driven** (used for inactivity/slashing) cutoff for re-execution, validation, signing, the proposer's attestation-collection cutoff, the p2p attestation acceptance bound, and the L1-tx timeout. - **Consensus receive gate** `checkpoint_proposal_receive_deadline = target_slot_start − ethereum_slot_duration − block_duration` depends only on protocol constants and gates the next-proposer handoff. Enforced at **p2p ingress, checkpoint-type only**, against arrival time. - **Explicit deadline queries replace state-gated timing.** `setState` is now pure — `getMaxAllowedTime`, `assertTimeLeft`, and `SequencerTooSlowError` are gone. The proposer queries `getStartDeadline` / `selectNextSubslot` directly; states remain purely for lifecycle/observability. - **p2p acceptance windows** are now explicit `[receive_start − clock_disparity, deadline + clock_disparity]` ranges: checkpoint proposals get a tight, **non-overlapping** window (a proposal maps unambiguously to one target slot); attestations get a deliberately **liberal** window (they are self-tagged by `(slot, checkpoint)` in the signature). - The proposer sizes block production around the **ideal L1-publish path only** — `last_block_build_time` and `checkpoint_proposal_send_time` are single values, no ideal/deadline pairs. ## Example (production values) With `aztec_slot_duration=72`, `ethereum_slot_duration=12`, `block_duration=6`, `min_block_duration=2`, `p2p_propagation_time=2`, `checkpoint_proposal_prepare_time=1`: `max_blocks_per_checkpoint = 10`; `last_block_build_time` at +61s; `attestation_deadline` at +132s. The full model, worked numbers, and a proportional timeline diagram are in `sequencer-client/src/sequencer/README.md`. ## Notes - `enforce=false` and single-block mode are **retained for now** — they remain the e2e default and the sandbox/local-network default (on the production `Sequencer`, not automine). A follow-up will remove them by (1) adapting tests so timetable enforcement can be on everywhere and (2) migrating local-network to the automine sequencer (which bypasses the timetable entirely). - **Behavior change**: intermediate block-proposal re-execution is now bounded by the single `attestation_deadline` rather than the next wall-clock slot boundary. --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Collaborator
Author
|
🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass. |
) ## Motivation `prepareCheckpointProposal` ran the expensive `checkSync` — a ~6-subsystem query (world-state, archiver tips, p2p, l1→l2 tips, pending-chain validation, proposed-checkpoint data) plus block-data and coherence guards — on every tick of every node, *before* the cheap `checkCanPropose`. Since most nodes are not the proposer for most slots, the full sync work was wasted on a busy committee. ## Approach Swap the order so the cheap proposer check runs first and non-proposers bail before doing the full sync work, while preserving the "invalidate even if not proposer" liveness backstop. The single coupling — `considerInvalidatingCheckpoint` previously consumed the whole sync result — is removed by having it take the archiver's pending-chain validation status directly (a single cheap read), which is authoritative on its own. The expensive `checkSync` and the build-start timing gate (`setState(PROPOSER_CHECK)`) stay on the proposer path, so proposer build/timing semantics are unchanged. ## Changes - **sequencer-client**: Reorder `prepareCheckpointProposal` to compute `checkCanPropose` once up front; escape-hatch voting and the non-proposer invalidation backstop are evaluated before `checkSync`, which now runs only for proposers. `considerInvalidatingCheckpoint` takes a `ValidateCheckpointResult` instead of the full sync result, wrapped in try/catch on the non-proposer path, and gains a per-`(checkpoint, slot)` dedup guard (set only after a request is successfully simulated) to avoid re-simulating/re-submitting within a slot while still retrying on transient failures. - **sequencer-client (tests)**: Add coverage for the intentional behavior changes — a non-proposer with an invalid pending chain invalidates even when the sync check would fail; an escape-hatch-open proposer votes regardless of sync state; per-slot invalidation dedup; and retry within a slot after a transient simulation failure. ### Intentional behavior changes 1. **Invalidation no longer requires a fully-successful `checkSync`.** A non-proposer with an invalid pending chain now invalidates from `getPendingChainValidationStatus()` alone. `simulateInvalidateCheckpoint` (L1-backed) remains the validity gate. Strengthens the liveness backstop. 2. **Invalidation is no longer gated by the build-start timing throw**, since non-proposers no longer reach `setState(PROPOSER_CHECK)`. The gate is scoped to building, and invalidation is not building. 3. **Escape-hatch suppression of invalidation is preserved** by ordering (escape-hatch `return` stays ahead of the non-proposer invalidation branch). 4. **Escape-hatch voting on failed-sync slots becomes immediate**, since it is now evaluated before `checkSync`. Minor observable change: escape-hatch voting and non-proposer invalidation no longer transition through the `PROPOSER_CHECK` state, so that state's duration metric no longer includes `checkCanPropose` or those paths. Fixes A-1147
…23916) ## Why The `merge-train/spartan-v5` PR (#23910) was dequeued from the merge queue: the merge-queue-heavy run failed on **one** of its 10 grind runs, on the cli-wallet flow test `cli-wallet/test/flows/shield_and_transfer.sh`. The fatal error was a fee-juice claim failing after a chain reorg: ``` WARN pxe:block_synchronizer Pruning data after block 7 due to reorg ERROR wallet Error: No L1 to L2 message found for message hash 0x008e9f1b... Simulation error: No L1 to L2 message found ... at FeeJuice.claim_and_end_setup ``` ## Root cause The cli-wallet `send` / `deploy` / `create-account` / `deploy-account` commands wait for tx status **`proposed`** by default. In the serial, single-block local-network sandbox a *proposed* block can be orphaned and pruned before its checkpoint is published — exactly the reorg above. When that happens, a tx the caller already moved on from is dropped, and the bridged fee-juice L1→L2 message that the next `deploy-account --payment fee_juice,claim` depends on disappears, so the claim fails with `No L1 to L2 message found`. It is intermittent (1/10 grinds here), which is why it only bit in the heavy merge-queue run. ## Fix Scoped entirely to the flow-test harness: the `aztec-wallet` wrapper in `shared/setup.sh` now passes `--wait-for-status checkpointed` for the tx-producing commands (`send`, `deploy`, `deploy-account`, `create-account`), so each tx is durably included before the next is sent. Commands that pass `--no-wait` (e.g. `bridge-fee-juice`) or already set `--wait-for-status` are left untouched. **The cli-wallet itself is unchanged — its default is still `proposed`.** This addresses the earlier review note to not change the wallet default and keep the change within the tests.
Collaborator
Author
Flakey Tests🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
BEGIN_COMMIT_OVERRIDE
refactor(sequencer): split timing into consensus and proposer timetables (#23809)
chore(sequencer): gate checkSync behind the cheap proposer check (#23810)
test(cli-wallet): wait for checkpointed in serial sandbox flow tests (#23916)
END_COMMIT_OVERRIDE