
refactor(sequencer): split timing into consensus and proposer timetables#23809

Merged
spalladino merged 27 commits into merge-train/spartan-v5 from spl/timetable-consensus-proposer-split
Jun 5, 2026

Conversation

@spalladino

@spalladino spalladino commented Jun 2, 2026

Contributor

Initial words from the human behind this changeset

So it's not all AI!

This PR splits the timetable into two. One of them is related to consensus, and defines when proposals and attestations are to be accepted. The other is related to block building, only used by the proposer.

Aside from the split, we also redefine some of the times, such that:

  • Block building starts one Ethereum slot before the start of the build slot.
  • The proposer finishes block building such that, assuming all validators respond in time, it has its checkpoint ready to submit to L1 exactly one Ethereum slot before the start of the target slot, to maximize chances for inclusion.
  • Should validators not respond in time, the sequencer will still wait for attestations until one Ethereum slot before the end of its target slot, i.e., the last possible moment for submission. The network considers attestations valid up until this point.
  • On the other hand, checkpoint proposals are accepted up until a few seconds before the block-building start of the next proposer. Those "few seconds" are one block duration, so the next proposer has time to reexecute and catch up.

Everything below this line is now AI generated. It's not that bad to read though.

What & why

Restructures the sequencer timing model ("timetable") from a single state-gated model into two explicit, slot-keyed timetables, so the proposer's scheduling and the consensus deadlines validators and the next proposer must agree on are cleanly separated and queried directly — instead of being inferred from sequencer state transitions.

This PR focuses on the new structure; the change count is large mostly because the old CheckpointTimingModel and its state-gated plumbing are replaced wholesale.

Timetable example

The split

  • ConsensusTimetable — consensus acceptance bounds only, derived from protocol constants {genesis_time, aztec_slot_duration, ethereum_slot_duration, block_duration}. Returns the receive-window bounds and the single attestation deadline. Used by validator-client, p2p, the publisher's L1-tx timeout, and next-proposer handoff reasoning.
  • ProposerTimetable — composes the consensus one and adds the proposer's ideal/happy-path schedule and the block sub-slot timetable, with the operational budgets {min_block_duration, p2p_propagation_time, checkpoint_proposal_prepare_time}. Used only by the sequencer / checkpoint-proposal job.

Key design points

  • Single attestation_deadline (target_slot_start + aztec_slot_duration − 2 · ethereum_slot_duration) replaces the old publish / attestation-receive / proposal-validation deadlines and removes l1_publish_prepare_time. It is the one hard, consensus-driven (used for inactivity/slashing) cutoff for re-execution, validation, signing, the proposer's attestation-collection cutoff, the p2p attestation acceptance bound, and the L1-tx timeout.
  • Consensus receive gate checkpoint_proposal_receive_deadline = target_slot_start − ethereum_slot_duration − block_duration depends only on protocol constants and gates the next-proposer handoff. Enforced at p2p ingress, checkpoint-type only, against arrival time.
  • Explicit deadline queries replace state-gated timing. setState is now pure — getMaxAllowedTime, assertTimeLeft, and SequencerTooSlowError are gone. The proposer queries getStartDeadline / selectNextSubslot directly; states remain purely for lifecycle/observability.
  • p2p acceptance windows are now explicit [receive_start − clock_disparity, deadline + clock_disparity] ranges: checkpoint proposals get a tight, non-overlapping window (a proposal maps unambiguously to one target slot); attestations get a deliberately liberal window (they are self-tagged by (slot, checkpoint) in the signature).
  • The proposer sizes block production around the ideal L1-publish path only: last_block_build_time and checkpoint_proposal_send_time are single values, with no ideal/deadline pairs.
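The explicit p2p acceptance window described above can be sketched as a closed-interval check against arrival time. This is a minimal illustration, not the repo's validator code: the `ReceiveWindow` shape and the `isWithinReceiveWindow` name are hypothetical, and all values are assumed to be in seconds.

```typescript
// Hypothetical shape: the bounds a timetable would return for one target slot.
interface ReceiveWindow {
  receiveStart: number; // earliest accepted arrival, in seconds
  deadline: number; // latest accepted arrival, in seconds
}

// Accepts an arrival time inside [receiveStart - clockDisparity, deadline + clockDisparity].
// Both bounds are treated as inclusive, following the bracket notation in the description.
function isWithinReceiveWindow(arrivalTime: number, window: ReceiveWindow, clockDisparity: number): boolean {
  const earliest = window.receiveStart - clockDisparity;
  const latest = window.deadline + clockDisparity;
  return arrivalTime >= earliest && arrivalTime <= latest;
}

// Illustrative values only.
const exampleWindow: ReceiveWindow = { receiveStart: 100, deadline: 166 };
const onLowerEdge = isWithinReceiveWindow(99.5, exampleWindow, 0.5);
const pastUpperEdge = isWithinReceiveWindow(166.6, exampleWindow, 0.5);
```

The same check would serve both message types; only the bounds differ (tight for checkpoint proposals, liberal for attestations).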

Example (production values)

With aztec_slot_duration=72, ethereum_slot_duration=12, block_duration=6, min_block_duration=2, p2p_propagation_time=2, checkpoint_proposal_prepare_time=1:

max_blocks_per_checkpoint = 10; last_block_build_time at +61s; attestation_deadline at +132s. The full model, worked numbers, and a proportional timeline diagram are in sequencer-client/src/sequencer/README.md.
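The figures above can be reproduced with a short sketch of the arithmetic. It rests on several assumptions that are readings of this PR rather than the actual ProposerTimetable code: offsets are measured from build_frame_start, taken to be one Ethereum slot before the build slot; the target slot is the build slot plus one; checkpoint_proposal_init_time is 1s (the budget reintroduced in a later commit); and last_block_build_time is the end of the last sub-slot on the grid. The function and field names are hypothetical, and min_block_duration does not enter these particular formulas.

```typescript
// All values in seconds. Offsets are relative to build_frame_start (assumption).
interface TimetableParams {
  aztecSlotDuration: number; // S
  ethereumSlotDuration: number; // E
  blockDuration: number; // D
  p2pPropagationTime: number; // P
  checkpointProposalPrepareTime: number;
  checkpointProposalInitTime: number; // assumed 1s, from a later commit in this PR
}

function deriveTimetableOffsets(p: TimetableParams) {
  const { aztecSlotDuration: S, ethereumSlotDuration: E, blockDuration: D } = p;
  // The build frame opens E before the build slot; the target slot starts S after the build slot.
  const targetSlotStart = E + S;
  const maxBlocksPerCheckpoint = Math.floor(
    (S - p.checkpointProposalInitTime - D - 2 * p.p2pPropagationTime - p.checkpointProposalPrepareTime) / D,
  );
  return {
    targetSlotStart,
    checkpointProposalReceiveDeadline: targetSlotStart - E - D,
    attestationDeadline: targetSlotStart + S - 2 * E,
    lastEthereumBlockInTargetSlot: targetSlotStart + S - E,
    maxBlocksPerCheckpoint,
    lastBlockBuildTime: p.checkpointProposalInitTime + maxBlocksPerCheckpoint * D,
  };
}

const production = deriveTimetableOffsets({
  aztecSlotDuration: 72,
  ethereumSlotDuration: 12,
  blockDuration: 6,
  p2pPropagationTime: 2,
  checkpointProposalPrepareTime: 1,
  checkpointProposalInitTime: 1,
});
// Prints: 10 61 132
console.log(production.maxBlocksPerCheckpoint, production.lastBlockBuildTime, production.attestationDeadline);
```

Under the same assumptions, the checkpoint proposal receive deadline lands at +66s and the last Ethereum block of the target slot at +144s.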

Notes

  • enforce=false and single-block mode are retained for now — they remain the e2e default and the sandbox/local-network default (on the production Sequencer, not automine). A follow-up will remove them by (1) adapting tests so timetable enforcement can be on everywhere and (2) migrating local-network to the automine sequencer (which bypasses the timetable entirely).
  • Behavior change: intermediate block-proposal re-execution is now bounded by the single attestation_deadline rather than the next wall-clock slot boundary.

@spalladino spalladino added the ci-no-fail-fast Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure label Jun 2, 2026
@spalladino spalladino force-pushed the spl/timetable-consensus-proposer-split branch from 0c39bab to c56410f Compare June 3, 2026 13:32
@spalladino spalladino force-pushed the spl/timetable-consensus-proposer-split branch from 7fa8297 to 066fe9c Compare June 4, 2026 18:21
@spalladino spalladino requested a review from nventuro as a code owner June 4, 2026 18:21
@spalladino spalladino changed the base branch from merge-train/spartan to merge-train/spartan-v5 June 4, 2026 18:21
@spalladino spalladino removed the request for review from nventuro June 4, 2026 19:52

@PhilWindle PhilWindle left a comment

Collaborator

Generally agree with the changes. They weren't as extensive as I had feared. The only concern I think I have is around the attestation_deadline = last_ethereum_block_in_target_slot - ethereum_slot_duration.

Does this make it too easy for validators to grief proposers by withholding attestations until deep into the target slot, potentially increasing proposer costs or causing missed checkpoints?

* proposer pipelining. Also the slot of the parent checkpoint this job builds on top of.
*/
private getBuildSlot(): SlotNumber {
return SlotNumber(this.targetSlot - 1);

Collaborator

Shall we use PROPOSER_PIPELINING_SLOT_OFFSET?

slot: this.targetSlot,
});
await this.waitUntilTimeInSlot(nextSubslotStart);
await this.waitUntilTimeInSlot(nextSubslotStart - this.getSlotStartBuildTimestamp());

Collaborator

Extremely minor. waitUntilTimeInSlot() is only called here and does sleepUntil(arg + this.getSlotStartBuildTimestamp()) :-)

if (this.blockDuration === undefined) {
// Single-block enforced mode: execution and re-execution run sequentially, so split the time
// remaining until the attestation deadline in half.
const maxAllowed = this.getAttestationDeadline(slot);

Collaborator

Perhaps it's not an issue if only used for tests etc. But I think this means single block only mode won't take p2p propagation into consideration?

@spalladino spalladino Jun 5, 2026

Contributor Author

Single-block mode is hopefully going away in #23821, to be replaced by the automine sequencer that doesn't rely on a timetable, so I didn't bother too much with it.

spalladino and others added 21 commits June 5, 2026 09:05
…ublish windows

Under proposer pipelining the checkpoint proposal job passed `targetSlot` to
`setState` for build-frame states, which made `Sequencer.setState` measure the
state deadlines a full Aztec slot too late (frame B), so they never fired.
Measure build-frame deadlines against `slotNow` (the build slot).

Also align the timing windows around L1 geometry:
- Base the checkpoint attestation/publish deadlines and the p2p attestation
  acceptance window on `ethereumSlotDuration` (12s before the last L1 block of
  the target slot) rather than the configurable `l1PublishingTime`. This unifies
  the deadline with the publisher's send lead, which already targets one Ethereum
  slot before the target slot start.
- Validators keep validating/attesting checkpoint proposals until the L1 publish
  deadline instead of the target-slot start.

Adds regression coverage for the frame bug (build-frame states must be measured
against the build slot) and for the realigned attestation/publish windows.
…rTimetable

Replace the state-gated CheckpointTimingModel with two explicit, slot-keyed timetables: ConsensusTimetable (protocol-constant deadlines + p2p acceptance bounds) and ProposerTimetable (proposer ideal schedule + block sub-slots). Collapse the publish/attestation/validation deadlines into a single consensus-driven attestation_deadline and drop l1_publish_prepare_time. Make setState pure and query deadlines explicitly (remove getMaxAllowedTime/assertTimeLeft/SequencerTooSlowError). Enforce explicit p2p receive windows. Update the spec README and example timeline SVG.
…te under pipelining

Four CI fixes for the consensus/proposer timetable split:

- p2p libp2p_service test: the mocked L1 constants omitted l1GenesisTime and
  ethereumSlotDuration, which the new PipeliningWindow/ConsensusTimetable path
  needs to derive slot timestamps. Add them and pin getEpochAndSlotNow so test
  proposals fall inside slot 100's receive window.

- checkpoint_proposal_job: the propose tx timeout was set to attestation_deadline
  (target_slot_start + S - 2E), which bounds when validators must sign, not when
  the proposer must send. Attestations are collected up to (and, when not
  enforcing, past) that point, so the propose tx was enqueued already expired and
  timed out before mining. Use last_ethereum_block_in_target_slot
  (target_slot_start + S - E), the latest L1 block the propose can still land in.

- sequencer: the sync-failure vote fallback gated on now > getStartDeadline(targetSlot),
  but targetSlot is derived from the next L1 slot (a one-ethereumSlotDuration look-ahead),
  so now never reaches that deadline for the current target slot when E > 2*minBlockDuration,
  and governance/slashing votes never fired. Apply the same look-ahead to the comparison.

- epochs_missed_l1_slot e2e: build-frame state events now carry the build slot
  (slotNow), not the target slot, so wait for INITIALIZING_CHECKPOINT at the build
  slot.

The timetable refactor anchored block sub-slots directly at build_frame_start
(block_build_deadline(k) = build_frame_start + (k+1)*D), assuming the proposer
begins building at the moment the build frame opens. In practice the proposer
only enters the build loop after sync, the proposer check, and checkpoint
initialization, so it starts some time into the frame. When that prologue plus
min_block_duration exceeds D (notably when min_block_duration is clamped to D in
tight fast profiles), the first sub-slot's deadline is already too close to be
startable, so it is skipped. This under-packs checkpoints (fewer than
minBlocksForCheckpoint) and starves the first block of a checkpoint of build
time, breaking enforced-timetable multi-block-per-checkpoint e2e tests.

Reintroduce a checkpoint_proposal_init_time budget (default 1s, matching the
pre-refactor checkpointInitializationTime) and anchor the sub-slot grid at
build_frame_start + init. max_blocks_per_checkpoint subtracts the same init, so
the derived counts for the production and local profiles are unchanged. This is
a proposer operational budget only; no consensus-acceptance deadline changes.

Update the timing spec and timetable tests accordingly, and add a regression
test for the tight minD == D fast profile with a realistic late proposer start.
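The effect of the init budget on the sub-slot grid can be illustrated with a small sketch. It assumes a sub-slot is startable when `now + min_block_duration` does not exceed its deadline, which is a reading of the commit message rather than the actual selectNextSubslot logic; the function name and parameters are hypothetical.

```typescript
// Returns the index of the first sub-slot the proposer can still start, or undefined if none.
// Deadlines follow block_build_deadline(k) = build_frame_start + init + (k + 1) * D.
function firstStartableSubslot(
  now: number,
  buildFrameStart: number,
  initTime: number,
  blockDuration: number,
  minBlockDuration: number,
  maxBlocks: number,
): number | undefined {
  for (let k = 0; k < maxBlocks; k++) {
    const deadline = buildFrameStart + initTime + (k + 1) * blockDuration;
    // Assumed startability rule: there must be room for at least min_block_duration.
    if (now + minBlockDuration <= deadline) {
      return k;
    }
  }
  return undefined;
}

// Tight fast profile: D = minD = 2, proposer enters the build loop 0.5s into the frame.
const withoutInit = firstStartableSubslot(0.5, 0, 0, 2, 2, 4); // first sub-slot is skipped
const withInit = firstStartableSubslot(0.5, 0, 1, 2, 2, 4); // first sub-slot is usable
```

With no init budget the first deadline sits at +2s, so a proposer starting at +0.5s cannot fit a 2s block and falls through to sub-slot 1. Shifting the grid by 1s moves that deadline to +3s and keeps sub-slot 0 usable.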
…t durations

The timetable refactor dropped the pre-refactor fast-profile override that
forced shorter operational budgets when ethereum_slot_duration < 8. Without it,
fast local/e2e networks inherit the conservative production budgets
(p2p_propagation_time=2s, checkpoint_proposal_prepare_time=1s, min_block_duration=2s)
because the e2e configs only set slot durations and block duration, not the
budgets. Those large budgets shrink the per-checkpoint build window:
max_blocks_per_checkpoint = floor((S - init - D - 2P - prepCp)/D) drops below
the count the network needs, under-packing checkpoints under enforceTimeTable.

Concretely, with production budgets: epochs_l1_reorgs (S=36,E=4,D=8) and
l2_to_l1 (S=12,E=4,D=2) derive only 2 blocks per checkpoint instead of the
pre-refactor 3 and 4, so the multi-block checkpoints either fail to pack or
land too late to be observed before the test reads the archiver.

Reintroduce the fast profile in the stdlib timetable: when
ethereum_slot_duration < FAST_PROFILE_ETHEREUM_SLOT_DURATION (8s), clamp
p2p_propagation_time, checkpoint_proposal_prepare_time, and min_block_duration
down to the documented fast values (0.5/0.5/1). The clamp only lowers budgets,
so an explicitly smaller config is kept, and production (E>=8, e.g. high_tps at
E=12) is unaffected. checkpoint_proposal_init_time stays at its prologue value
so the first-sub-slot fix from the previous commit is preserved. Consensus
acceptance deadlines (ConsensusTimetable, p2p receive windows) are derived only
from {S,E,D} and are untouched.

calculateMaxBlocksPerSlot resolves budgets through the same profile so p2p
gossipsub scoring derives the same count as the proposer. Add unit assertions
pinning the derived capacity for each enforced multi-block e2e config to its
minBlocksForCheckpoint and expected build-window count.
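The clamp and its effect on the derived block count can be sketched as follows. The names `resolveBudgetsSketch` and `maxBlocksSketch` are hypothetical stand-ins, not the stdlib functions; the fast values (0.5/0.5/1), the 8s threshold, and the formula come from the commit message, and checkpoint_proposal_init_time is assumed to be 1s.

```typescript
interface Budgets {
  p2pPropagationTime: number;
  checkpointProposalPrepareTime: number;
  minBlockDuration: number;
}

const FAST_PROFILE_ETHEREUM_SLOT_DURATION = 8;
const FAST_BUDGETS: Budgets = { p2pPropagationTime: 0.5, checkpointProposalPrepareTime: 0.5, minBlockDuration: 1 };
const PRODUCTION_BUDGETS: Budgets = { p2pPropagationTime: 2, checkpointProposalPrepareTime: 1, minBlockDuration: 2 };

// Below the threshold, each budget is lowered to the fast value; smaller configured values are kept.
function resolveBudgetsSketch(ethereumSlotDuration: number, configured: Budgets): Budgets {
  if (ethereumSlotDuration >= FAST_PROFILE_ETHEREUM_SLOT_DURATION) {
    return configured;
  }
  return {
    p2pPropagationTime: Math.min(configured.p2pPropagationTime, FAST_BUDGETS.p2pPropagationTime),
    checkpointProposalPrepareTime: Math.min(
      configured.checkpointProposalPrepareTime,
      FAST_BUDGETS.checkpointProposalPrepareTime,
    ),
    minBlockDuration: Math.min(configured.minBlockDuration, FAST_BUDGETS.minBlockDuration),
  };
}

// max_blocks_per_checkpoint = floor((S - init - D - 2P - prepCp) / D)
function maxBlocksSketch(S: number, D: number, initTime: number, b: Budgets): number {
  return Math.floor((S - initTime - D - 2 * b.p2pPropagationTime - b.checkpointProposalPrepareTime) / D);
}

// epochs_l1_reorgs profile (S=36, E=4, D=8), before and after the clamp.
const reorgsUnclamped = maxBlocksSketch(36, 8, 1, PRODUCTION_BUDGETS);
const reorgsClamped = maxBlocksSketch(36, 8, 1, resolveBudgetsSketch(4, PRODUCTION_BUDGETS));
```

With the production budgets this formula gives 2 blocks for both e2e profiles named above, matching the under-packing the commit describes. The clamped counts are whatever the formula yields under the assumed init time and are not asserted to match the repo's pinned values.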
Build the consensus/proposer timetables once per component and inject the
object into consumers instead of passing option bundles and constructing
timetables at scattered sites. Add constant getters (genesisTime,
getL1Constants, resolved budgets) so consumers read config off the timetable.

Delete PipeliningWindow; validators now check the disparity-widened receive
window directly via shared helpers in clock_tolerance. Block and checkpoint
proposals share the tight checkpoint proposal receive window as their arrival
gate, and the current/next-slot short-circuits in the attestation and proposal
validators are removed. Update the sequencer README to document the shared
block-proposal arrival gate.

Fold calculateMaxBlocksPerSlot into ProposerTimetable.getMaxBlocksPerCheckpoint
and have p2p gossipsub scoring construct a ProposerTimetable so block-count
budget resolution goes through the same path as the sequencer.
…tables

ConsensusTimetable and ProposerTimetable now require every protocol constant
and operational budget at the constructor. blockDuration stays number|undefined
for legacy single-block mode but is a required property (no ?), so every call
site passes it explicitly. Defaults live in the config layer (sequencer config
mappings, the SequencerTimetable adapter, and the p2p scoring adapter), not in
the timetable or at call sites.

Threads the real ethereumSlotDuration into calculateBlocksPerSlot and makes it
required, removing the slotDurationMs/1000 fallback that forced the production
budget profile.

Wires blockDurationMs into the validator-client config surface (via the shared
sequencer config mapping) and passes the explicit blockDuration into its
ConsensusTimetable. This tightens the block/checkpoint proposal receive deadline
the validator-client validates against from target_slot_start - E to - E - D
when blockDuration is set, which is the intended spec behavior; the attestation
deadline the validator-client itself consumes is unaffected (no D term).

retryUntil now accepts its timeout as a number of seconds, an explicit
{ timeout } in seconds, or an absolute { deadline } with an optional
DateProvider. The deadline form derives the remaining budget from the
provider's clock at call time and treats an already-past deadline as an
immediate timeout, matching the existing non-positive-timeout semantics.

Use the deadline form in the validator ProposalHandler re-execution and
parent-block paths instead of manually converting an absolute deadline to
fractional seconds.
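How the three timeout forms could resolve to a remaining budget is sketched below. This is not the retryUntil implementation: the `TimeoutSpec` type, the `DateProviderLike` interface, and the choice of `Date` for the absolute deadline are all assumptions made for illustration.

```typescript
// Hypothetical clock abstraction returning milliseconds since the epoch.
interface DateProviderLike {
  now(): number;
}

// The three accepted forms: bare seconds, explicit seconds, or an absolute deadline.
type TimeoutSpec = number | { timeout: number } | { deadline: Date; dateProvider?: DateProviderLike };

// Resolves any form to a number of seconds remaining at call time.
function remainingSeconds(spec: TimeoutSpec): number {
  if (typeof spec === 'number') {
    return spec;
  }
  if ('timeout' in spec) {
    return spec.timeout;
  }
  const nowMs = spec.dateProvider ? spec.dateProvider.now() : Date.now();
  return (spec.deadline.getTime() - nowMs) / 1000;
}

// A fixed clock makes the deadline form deterministic, as a mocked DateProvider would in tests.
const fixedClock: DateProviderLike = { now: () => 10_000 };
const futureBudget = remainingSeconds({ deadline: new Date(12_500), dateProvider: fixedClock });
const pastBudget = remainingSeconds({ deadline: new Date(9_000), dateProvider: fixedClock });
```

A deadline already in the past resolves to a non-positive budget, which the caller can then handle as an immediate timeout, as the commit message describes.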
Split timetable/index.ts into one file per concern: budgets.ts (the
DEFAULT_*/FAST_PROFILE_* constants and resolveTimingBudgets),
consensus_timetable.ts (ConsensusTimetable + SlotTimingConstants), and
proposer_timetable.ts (ProposerTimetable + SubslotSelection). index.ts is now
re-exports only. Split the test file to match. Rename
ProposerTimetable.getStartDeadline to getBuildStartDeadline. Retitle the moved
spec README to 'Block Building Timetable Spec' and repoint the sequencer-client
README references to its new stdlib location.
…deadline plumbing

- Drop the SequencerTimetable adapter; the Sequencer now constructs the
  ProposerTimetable from its config + l1Constants (filling unset budgets with the
  shared DEFAULT_* values) and keeps the max-blocks-per-checkpoint validation.
- CheckpointProposalJob reports the target slot (not the build slot) on setState
  via a private setState helper, and anchors build-frame timing on
  ProposerTimetable.getBuildFrameStart(targetSlot).
- Compute the setState seconds-into-slot metric and stateDurationMs from the date
  provider (getBuildFrameStart + the delta between consecutive setState calls) so
  they track simulated time under a mocked clock.
- Inline the sync-failure vote gate as a plain now <= start_deadline check (drop
  the next-L1-slot look-ahead) and rename getStartDeadline usages to
  getBuildStartDeadline.
- Mark the slot as attempted on the build-start-deadline abort path so a deadline
  abort does not rebuild the same checkpoint within the slot.
- Replace the deprecated MIN_EXECUTION_TIME with DEFAULT_MIN_BLOCK_DURATION (and
  config minBlockDuration in the sequencer) and remove the vestigial
  l1PublishingTime config field, mapping, env var, schema, and p2p plumbing.
… 0 in timetable

Codex review fixes: PROPOSER_CHECK state now reports the target slot like
the rest of the proposal lifecycle; getBuildFrameStart no longer throws on
slot 0 (p2p validators evaluate windows for peer-supplied slots); refresh
stale sequencer-client README sections describing the deleted timetable
wrapper, state-gated deadlines, and l1PublishingTime.
…rop slotNow

- Rename the state-changed payload's `secondsIntoSlot` to `secondsIntoBuildFrame`
  (seconds since the build-frame start of the target slot, which may be negative)
  and `slot` to `targetSlot`, with JSDoc documenting the build-frame anchoring and
  the undefined-for-lifecycle-states semantics. Rename the sequencer helper
  `getSecondsIntoSlot` to `getSecondsIntoBuildFrame`.
- Drop the redundant `slotNow` constructor arg/field from CheckpointProposalJob: it
  was always `targetSlot - 1` (the wall-clock build slot under pipelining). All uses
  now derive it from a private `getBuildSlot()` helper.
- Verify the private `setState(state)` helper is the sole caller of `setStateFn`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…validators

- Add a configurable maximum gossip clock-disparity tolerance (`maxGossipClockDisparityMs`,
  env `P2P_MAX_GOSSIP_CLOCK_DISPARITY_MS`, default 500ms) instead of the hardcoded
  MAXIMUM_GOSSIP_CLOCK_DISPARITY_MS constant. The value is threaded into the proposal and
  attestation validators, which now inline the receive-window check directly from the
  injected ConsensusTimetable getters. Deletes clock_tolerance.ts and its helpers; the
  window-boundary cases are rehomed to the validator tests.
- Remove `calculateBlocksPerSlot`: libp2p_service now builds a single ProposerTimetable
  (from p2p config budgets) shared by the validators and the gossipsub
  TopicScoreParamsFactory, which calls getMaxBlocksPerCheckpoint() directly. Drops the
  DEFAULT_*-filling adapter from topic_score_params.
- Simplify proposal_handler's block-sync wait back to a single retryUntil call using the
  deadline overload (a past deadline now times out natively), dropping the manual
  remaining<=0 fail-fast guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Remove hard line breaks within prose paragraphs and list items in the
sequencer-client and stdlib timetable documentation. Single logical lines
were split at ~120 chars for formatting; with word wrap enabled, these
should be single physical lines.

- sequencer-client/README.md: 18 joins across all changed sections
- stdlib/src/timetable/README.md: 36 joins in the new timetable spec document

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…land deadline

The orphan-prune gate fired at now >= expectedCheckpointedByTime, pruning a
block-only tip at the exact instant its enclosing checkpoint could still
legitimately land. The checkpoint receive deadline is an inclusive acceptance
bound, so a checkpoint arriving at that instant is still valid; pruning then is
one tick too aggressive. This regressed once the deadline anchor moved to the
checkpoint receive deadline (target_slot_start - E - D) and was rounded to an
integer tick, landing it exactly on the now tick under second-granularity
timing. Require strictly-past (now > expected) for both the archiver prune and
the sequencer overdue gate so the two agree.
…n deadline

The wall-clock orphan prune in pruneOrphanProposedBlocks tore down a proposed
block-only tip at the checkpoint receive deadline plus grace (expected land
time). That instant is earlier than the attestation deadline, which is both the
latest a checkpoint can still land on L1 (promoting these blocks into a
confirmed checkpoint during L1 sync) and the cutoff by which validators must
finish re-executing the proposals. Pruning at the earlier time discarded state
that was still within its valid window, breaking in-flight re-execution.

Gate the destructive prune on getAttestationDeadline instead. The sequencer's
non-destructive overdue warning stays at the expected land time as an earlier
soft signal; its comments are updated to reflect that the two boundaries now
differ by design.
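The two boundaries from the last two commits can be sketched as a pair of strict comparisons. The function names are hypothetical, and applying the strictly-past rule to the attestation-deadline prune is an assumption carried over from the previous commit rather than something stated here.

```typescript
// Soft signal: the checkpoint is overdue only strictly after its expected land time,
// since the receive deadline is an inclusive acceptance bound.
function isCheckpointOverdue(now: number, expectedCheckpointedByTime: number): boolean {
  return now > expectedCheckpointedByTime;
}

// Destructive prune: gated on the later attestation deadline (strictness assumed).
function shouldPruneOrphanTip(now: number, attestationDeadline: number): boolean {
  return now > attestationDeadline;
}

// Illustrative offsets in seconds: expected land time at 66, attestation deadline at 132.
const atLandTime = isCheckpointOverdue(66, 66); // false: arrival at this instant is still valid
const warnedNotPruned = isCheckpointOverdue(67, 66) && !shouldPruneOrphanTip(67, 132); // true
```

Between the two boundaries the sequencer would warn while the archiver keeps the block-only tip, so in-flight re-execution still has state to work against.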
@spalladino

Contributor Author

The only concern I think I have is around the attestation_deadline = last_ethereum_block_in_target_slot - ethereum_slot_duration. Does this make it too easy for validators to grief proposers by withholding attestations until deep into the target slot, potentially increasing proposer costs or causing missed checkpoints?

I initially had an l1_prepare_time buffer in between the attestation_deadline and the last_l1_submission_time, so that validators had to send their attestations a bit earlier.

But if we assume that a majority of the validators are honest, then it makes sense to account as much as we can for delays in attestations reaching the proposer over the p2p network, so we have as many chances as possible to collect them. And the last possible moment at which we could collect them is right before having to send the L1 tx, which is before the last L1 block of the L2 slot.

Happy to discuss more though.

…LINING_SLOT_OFFSET

Address review feedback: derive the build slot from PROPOSER_PIPELINING_SLOT_OFFSET
instead of a magic -1, and fold the single-caller waitUntilTimeInSlot helper into
waitUntilNextSubslot (which already holds the absolute sub-slot deadline). Tests now
override/spy waitUntilNextSubslot directly.
@spalladino spalladino force-pushed the spl/timetable-consensus-proposer-split branch 2 times, most recently from e02ad56 to 1875c33 Compare June 5, 2026 14:28
@spalladino spalladino enabled auto-merge (squash) June 5, 2026 14:29
@spalladino spalladino force-pushed the spl/timetable-consensus-proposer-split branch from 1875c33 to 571ffb0 Compare June 5, 2026 14:46
@spalladino spalladino force-pushed the spl/timetable-consensus-proposer-split branch from 571ffb0 to 0eb4aa9 Compare June 5, 2026 16:15
@AztecBot

AztecBot commented Jun 5, 2026

Collaborator

Flakey Tests

🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/88645903d59961c5): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/sentinel_status_slash.parallel.test.ts "slashes the proposer with INACTIVITY when checkpoint validation records unvalidated" (231s) (code: 0) group:e2e-p2p-epoch-flakes

@spalladino spalladino merged commit 7b44d2e into merge-train/spartan-v5 Jun 5, 2026
12 checks passed
@spalladino spalladino deleted the spl/timetable-consensus-proposer-split branch June 5, 2026 16:42

Labels

ci-no-fail-fast Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure


3 participants