Skip to content

fix(p2p): do not penalize peers that signal a missing block with Fr.ZERO#23672

Merged
PhilWindle merged 1 commit into
merge-train/spartanfrom
mr/fix-fr-zero-peer-penalization
May 29, 2026
Merged

fix(p2p): do not penalize peers that signal a missing block with Fr.ZERO#23672
PhilWindle merged 1 commit into
merge-train/spartanfrom
mr/fix-fr-zero-peer-penalization

Conversation

@mrzeszutko

Copy link
Copy Markdown
Contributor

Stacked on: [mr/fix-block-txs-validation-archiver-fallback](https://github.com/AztecProtocol/aztec-packages/pull/23624) (the archiver-fallback PR).


Summary

A peer that legitimately can't find the requested block in its attestation pool or archiver, but matches the requested hashes against its tx pool and sends those txs back, signals "I don't have the block" by setting archiveRoot = Fr.ZERO in the response (block_txs_handler.ts:54-58). The requester's validation currently treats that response the same as a malicious archive-root mismatch and applies a MidToleranceError peer penalty. After enough such responses from the same helpful peer, that peer is disconnected by the requester for behaviour the protocol explicitly documents as legitimate.

This PR makes the validator recognise Fr.ZERO as the "I don't have the block" signal and stop penalising peers for it.

What's broken

Responder side (correct, intentional)

block_txs_handler.ts:54-58 — when the responder lacks the block (no proposal, no archived block) but the request carried full tx hashes, it tries to serve from its pool by hash and signals the "I don't know the block" condition with archiveRoot = Fr.zero():

if (!txHashes && requestedTxsHashes !== undefined) {
  const responseTxs = (await txPool.getTxsByHash(requestedTxsHashes)).filter(tx => !!tx);
  const response = new BlockTxsResponse(Fr.zero(), new TxArray(...responseTxs), BitVector.init(0, []));
  return response.toBuffer();
}

Requester side (broken)

The validator at libp2p_service.ts:1525-1530 penalizes any archive-root mismatch with MidToleranceError (-10 score points) and throws — including Fr.zero:

if (!response.archiveRoot.equals(request.archiveRoot)) {
  this.peerManager.penalizePeer(peerId, PeerErrorSeverity.MidToleranceError);
  throw new ValidationError(...);
}

After 5 such responses from the same helpful peer, that peer is at -50 in the requester's score book → disconnected by pruneUnhealthyPeers (peer_manager.ts:601-603).

Receiver-side code already documented the correct intent

The wrongful penalty contradicts what the receiver-side code explicitly says should happen. batch_tx_requester.ts:586-600 has handleArchiveRootMismatch, whose docstring spells out exactly the semantic we're restoring:

/**
 * Handles an archive root mismatch between local state and peer response.
 *
 * - Response archive is Fr.ZERO (peer pruned proposal, legitimate): marks peer dumb.
 * - Non-zero archive mismatch (malicious response): penalises + marks dumb.
 */
private handleArchiveRootMismatch(peerId: PeerId, response: BlockTxsResponse): void {
  if (!response.archiveRoot.isZero()) {
    this.peers.penalisePeer(peerId, PeerErrorSeverity.LowToleranceError);
  }
  this.peers.markPeerDumb(peerId);
  this.txsMetadata.clearPeerData(peerId);
}

But this function is only reached from decideIfPeerIsSmarthandleSuccessResponseFromPeer, which only runs when validation returns true. The validator's first check (archive-root equality) rejects every archive-mismatched response — including Fr.zero — so handleArchiveRootMismatch is never actually invoked. The "Fr.zero is legitimate" exemption it encodes has been unreachable since the validator's archive-root check was added.

The flow today:

   reqresp response
        │
        ▼
   validateRequestedBlockTxsConsistency
   ├─ Fr.zero       → reject + Mid penalty (BUG, contradicts the docstring above)
   ├─ non-zero ≠    → reject + Mid penalty
   └─ matches archive root ──► handleSuccessResponseFromPeer
                                  └─ decideIfPeerIsSmart
                                       └─ hasArchiveRootMismatch?
                                            always false (validator filtered them all out)
                                            ──► handleArchiveRootMismatch   ◄── DEAD CODE

So the receiver side already knew Fr.zero should not be penalised; the decision was just being made at the wrong layer. This PR moves it to the validator, where it can actually fire. handleArchiveRootMismatch itself remains dead code (separate cleanup candidate, not in this PR).

Fix

Special-case response.archiveRoot.isZero() at the top of validateRequestedBlockTxsConsistency so the archive-root mismatch path is bypassed for Fr.zero responses. We still return false (the txs are dropped because we can't verify membership/order without the block) but no peer penalty is applied — matching the intent of the Fr.zero exemption in batch_tx_requester.ts:593-600.

// libp2p_service.ts (inside validateRequestedBlockTxsConsistency)
if (response.archiveRoot.isZero()) {
  this.logger.debug(`Peer ${peerId.toString()} signalled missing block with Fr.zero archive root`);
  return false;
}

if (!response.archiveRoot.equals(request.archiveRoot)) {
  this.peerManager.penalizePeer(peerId, PeerErrorSeverity.MidToleranceError);
  ...
}

The validator's other checks are unchanged. The early return prevents the bitvector-length and maxReturnable checks downstream from firing on a zero-length bitvector response, which would otherwise also wrongly penalise the peer.

Tests

Unitp2p/src/services/libp2p/libp2p_service.test.ts

A new test in the validateRequestedBlockTxsConsistency describe block: "should not penalize a peer that signals lacking the block with Fr.ZERO archive root". Constructs a response with archiveRoot = Fr.ZERO and asserts that peerManager.penalizePeer is not called. Verified red before the fix, green after.

Integrationp2p/src/client/test/p2p_client.integration_block_txs.test.ts

A new test in the p2p client integration block txs protocol describe block: "requester does not penalize peer that returns Fr.zero (peer lacks proposal but matched by hash)". Drives a real reqresp BLOCK_TXS request over libp2p between two clients, lets the responder hit the Fr.zero branch in block_txs_handler, then runs the response through the requester's real validateRequestedBlockTxsConsistency and asserts the requester's peerManager.penalizePeer is not called with MidToleranceError or LowToleranceError. Verified red before the fix, green after.

@AztecBot

AztecBot commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Flakey Tests

🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry.

\033FLAKED\033 (8;;http://ci.aztec-labs.com/504c764c1d368192�504c764c1d3681928;;�):  yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_invalidate_block.parallel.test.ts "proposer invalidates multiple checkpoints" (420s) (code: 0) group:e2e-p2p-epoch-flakes

@@ -1528,6 +1528,16 @@ export class LibP2PService extends WithTracer implements P2PService {
peerId: PeerId,
): Promise<boolean> {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like this functions would be better off returning more infomration than simply true/false. Perhaps ReqRespStatus. The case you have added gets translated to INTERNAL_ERROR. It has the same effect as NOT_FOUND but perhaps this is due a cleanup.

@mrzeszutko mrzeszutko force-pushed the mr/fix-fr-zero-peer-penalization branch from f97be2d to 480af1a Compare May 29, 2026 09:14
@PhilWindle PhilWindle enabled auto-merge (squash) May 29, 2026 09:16
@PhilWindle PhilWindle merged commit 5ee9284 into merge-train/spartan May 29, 2026
14 checks passed
@PhilWindle PhilWindle deleted the mr/fix-fr-zero-peer-penalization branch May 29, 2026 09:41
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request Jun 4, 2026
BEGIN_COMMIT_OVERRIDE
test(e2e): unskip pipelining related e2e tests (AztecProtocol#23642)
fix(archiver): prune blocks without proposed checkpoint by end of build
slot (AztecProtocol#23606)
test: migrate benchmarks to pipelining setup (AztecProtocol#23647)
fix(p2p): fall back to archiver in BLOCK_TXS response validation
(AztecProtocol#23624)
docs(slashing): align operator and slasher docs with AZIP-7 (AztecProtocol#23494)
fix(p2p): do not penalize peers that signal a missing block with Fr.ZERO
(AztecProtocol#23672)
chore: adjust metrics deployment (AztecProtocol#23676)
fix(cheat-codes): warpL2TimeAtLeastBy advances relative to leading clock
(AztecProtocol#23675)
chore: tighten node pool sizes (AztecProtocol#23678)
chore: remove archival nodes (AztecProtocol#23630)
chore: merge blob sink duties into RPC node (AztecProtocol#23631)
fix: sync avm-transpiler Cargo.lock with noir submodule (AztecProtocol#23683)
fix(spartan): set validator lag env vars in tps-scenario (AztecProtocol#23684)
fix: make world-state hash queries reorg-aware to close getWorldState
race (AztecProtocol#23677)
fix: pin noir submodule to next's version on merge-train/spartan
(AztecProtocol#23690)
fix: ensure image ref is used by bench runner (AztecProtocol#23682)
fix(ci): retry aztec-nr nargo dependency clone on transient network
flake (AztecProtocol#23653)
chore: run one-off jobs on network nodes (AztecProtocol#23701)
fix: simulate proposals inside target slot (AztecProtocol#23692)
chore: smaller eth-devnet (AztecProtocol#23704)
chore: enable testnet autoscaling (AztecProtocol#23705)
feat(api)!: redesign node log retrieval API around tag-based queries
(AztecProtocol#23625)
fix(sequencer): set own proposed checkpoint locally instead of via p2p
loopback (AztecProtocol#23659)
END_COMMIT_OVERRIDE
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants