feat: persist peer bans for a configurable duration (A-1157) by PhilWindle · Pull Request #23922 · AztecProtocol/aztec-packages

PhilWindle · 2026-06-06T15:19:28Z

Fixes A-1157. Addresses security advisory GHSA-h4vv-85x5-6hmh.

Problem

Peer scores decay toward zero (~0.9/minute). A peer whose score crossed the ban threshold (MIN_SCORE_BEFORE_BAN = -100) recovered to a healthy score within approximately 1 hour.

Fix

Record a ban when a peer's score drops below the ban threshold and hold it for a configurable duration (default 24h). Bans are kept in memory only and are cleared on restart — a restarted node re-learns bad peers from their behaviour rather than carrying stale bans across runs.

PeerScoring records { score, expiry } in an in-memory bannedPeers map, so getScore/getScoreState stay synchronous (required by the peer-manager hot paths, including a .sort() comparator).
While banned, getScore returns the ban score regardless of decay, so a peer cannot recover its way out of the ban early — even after decayAllScores cleans up the decayed live-score entry. Once the ban expires it is lifted and the live (decayed) score takes over, letting the peer recover.
Expired bans are lifted lazily on the next score query (getActiveBanScore) and swept proactively each heartbeat via pruneExpiredBans(), so a banned peer that disconnects and is never queried again does not linger in the map.
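
The behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the class name PeerScoringSketch, the injected `now` clock, the multiplicative decay factor, the cleanup threshold, and the bannedPeerCount accessor are all hypothetical. Only bannedPeers, getScore, getActiveBanScore, maybeBanPeer, decayAllScores, pruneExpiredBans and MIN_SCORE_BEFORE_BAN = -100 are names taken from the description.

```typescript
type BanRecord = { score: number; expiry: number };

const MIN_SCORE_BEFORE_BAN = -100; // threshold quoted in the PR description

class PeerScoringSketch {
  private scores = new Map<string, number>();
  private bannedPeers = new Map<string, BanRecord>();

  // `now` is injected (an assumption) so the sketch can be exercised without waiting.
  constructor(
    private readonly banDurationMs: number,
    private readonly now: () => number = () => Date.now(),
  ) {}

  updateScore(peerId: string, delta: number): number {
    const live = (this.scores.get(peerId) ?? 0) + delta;
    this.scores.set(peerId, live);
    this.maybeBanPeer(peerId, live);
    return this.getScore(peerId);
  }

  // Stays synchronous: an active ban score wins over the live (decaying) score.
  getScore(peerId: string): number {
    return this.getActiveBanScore(peerId) ?? this.scores.get(peerId) ?? 0;
  }

  // Assumed decay model: multiply by `factor` and drop entries that are nearly zero.
  decayAllScores(factor = 0.9, cleanupBelow = 0.1): void {
    this.scores.forEach((score, peerId) => {
      const decayed = score * factor;
      if (Math.abs(decayed) < cleanupBelow) {
        this.scores.delete(peerId);
      } else {
        this.scores.set(peerId, decayed);
      }
    });
  }

  // Proactive sweep, meant to run once per heartbeat. Expired ids are
  // collected first so the map is not mutated while it is being iterated.
  pruneExpiredBans(): void {
    const t = this.now();
    const expired: string[] = [];
    this.bannedPeers.forEach((ban, peerId) => {
      if (ban.expiry <= t) expired.push(peerId);
    });
    for (const peerId of expired) this.bannedPeers.delete(peerId);
  }

  bannedPeerCount(): number {
    return this.bannedPeers.size;
  }

  // Starts a ban window only when the score is below the threshold and no ban is active.
  private maybeBanPeer(peerId: string, live: number): void {
    if (live >= MIN_SCORE_BEFORE_BAN) return;
    if (this.getActiveBanScore(peerId) !== undefined) return;
    this.bannedPeers.set(peerId, { score: live, expiry: this.now() + this.banDurationMs });
  }

  // Returns the ban score while the ban is active; lazily lifts an expired ban.
  private getActiveBanScore(peerId: string): number | undefined {
    const ban = this.bannedPeers.get(peerId);
    if (ban === undefined) return undefined;
    if (ban.expiry <= this.now()) {
      this.bannedPeers.delete(peerId);
      return undefined;
    }
    return ban.score;
  }
}
```

In this sketch a peer penalised to -150 keeps reading -150 from getScore until the window elapses, even after decayAllScores has deleted its live-score entry. After expiry the live score (0 if the entry was cleaned up) takes over.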

Configuration

New P2P_PEER_BAN_DURATION_SECONDS (config field peerBanDurationSeconds), default 86400 (24h). Registered in foundation env vars and the P2P config mappings.
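
As a rough illustration of how the setting resolves to a value, the helper below reads the variable from an env-like record and falls back to the stated default. The function name readPeerBanDurationSeconds and its validation rules (integers only, no negatives) are assumptions made for this sketch; the PR itself registers the variable through the foundation env vars and the P2P config mappings, which are not reproduced here.

```typescript
const DEFAULT_PEER_BAN_DURATION_SECONDS = 86400; // 24h, the default stated above

// Hypothetical helper: resolves peerBanDurationSeconds from an env-like record.
function readPeerBanDurationSeconds(env: Record<string, string | undefined>): number {
  const raw = env["P2P_PEER_BAN_DURATION_SECONDS"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PEER_BAN_DURATION_SECONDS;
  }
  const parsed = Number(raw);
  // Assumed validation: reject anything that is not a non-negative integer.
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`Invalid P2P_PEER_BAN_DURATION_SECONDS: ${raw}`);
  }
  return parsed;
}
```

In a Node process this would be called as readPeerBanDurationSeconds(process.env); the 60s test case mentioned under Tests corresponds to setting the variable to 60.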

Tests

peer_scoring.test.ts covers the full lifecycle, asserting both score values and states:

ban floor held through banned → recovered-live-score → expiry transitions;
peerBanDurationSeconds drives the window (60s case);
the advisory regression: after decay cleans up the live-score entry, getScore still returns the -150 ban score (not 0), keeping the peer Banned;
a peer whose previous ban has expired can be re-banned;
pruneExpiredBans removes expired bans but keeps active ones.

Existing peer_manager and peer_scoring suites pass; the previously existing "returns to Healthy after improving score" assertion was updated to reflect the new intended behaviour (a banned peer stays banned for the full window).

Peer scores decay toward zero over time, so a peer that crossed the ban threshold recovered to a healthy score within minutes, effectively making bans toothless (see security advisory GHSA-h4vv-85x5-6hmh). Persist a ban when a peer's score drops below the ban threshold, holding it for a configurable duration (P2P_PEER_BAN_DURATION_SECONDS, default 24h):

- PeerScoring records {score, expiry} in an in-memory map (so getScore stays synchronous for the peer-manager hot paths) and writes through to a dedicated kv-store map for durability across restarts.
- While banned, getScore returns the persisted ban score regardless of decay, so the peer cannot recover out of the ban early. Once the ban expires it is removed and the live (decayed) score takes over, letting the peer recover.
- PeerScoring.new() restores active bans on startup, pruning expired ones in a single transaction.
- PeerManager.stop() flushes pending ban writes so they are durable on shutdown.

Update the peer-scoring docs to reflect that bans are now persisted for a configurable duration rather than decaying away within minutes: clarify the ban-floor behaviour, add a Ban Persistence section, fix the recovery example that assumed app-score decay un-bans a peer, and note P2P_PEER_BAN_DURATION_SECONDS.

Replace the hand-rolled promise-chain persistence queue with a SerialQueue (created and started when a store is configured). enqueueBanPersistence now puts onto the queue and flushBanPersistence awaits a syncPoint, matching the SerialQueue pattern used by the tx pool.

Persisting a ban only mattered if the peer was actually kept out. Banned peers were disconnected reactively each heartbeat, but the libp2p connection gater (isNodeAllowedToConnect, used by denyInboundEncryptedConnection) only checked failed auth handshakes, so a banned peer could reconnect inbound between heartbeats and resume sending messages. Reject peers with an active ban in isNodeAllowedToConnect so the gater refuses the connection during the noise handshake, before any protocol stream opens, for the full ban duration.

Introduce a single getActiveBanScore helper that returns a peer's persisted ban score while the ban is active, or undefined otherwise, lazily lifting an expired ban before returning undefined. getScore and maybeBanPeer now both go through it. This removes the duplicated ban-lookup/expiry logic and fixes an edge case: maybeBanPeer previously checked bannedPeers.has(), so an expired-but-not-yet-pruned record blocked a fresh ban. It now prunes the stale record and starts a new window, so a peer that re-offends after its ban expired is re-banned.
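
The edge case in this commit can be shown with two small functions over a plain Map. This is only a sketch: the free-function form, the names maybeBanWithHas and maybeBanWithHelper, and the explicit `now` and `durationMs` parameters are assumptions chosen to keep the example self-contained, and do not mirror the real method signatures.

```typescript
type BanRecord = { score: number; expiry: number };

// Pre-fix shape: any record, even an expired one, blocks a fresh ban.
function maybeBanWithHas(
  bans: Map<string, BanRecord>, peerId: string, score: number, now: number, durationMs: number,
): boolean {
  if (bans.has(peerId)) return false;
  bans.set(peerId, { score, expiry: now + durationMs });
  return true;
}

// Returns the ban score while active; lazily removes an expired record.
function getActiveBanScore(
  bans: Map<string, BanRecord>, peerId: string, now: number,
): number | undefined {
  const ban = bans.get(peerId);
  if (ban === undefined) return undefined;
  if (ban.expiry <= now) {
    bans.delete(peerId);
    return undefined;
  }
  return ban.score;
}

// Post-fix shape: a stale record is pruned by the helper, so a new window can start.
function maybeBanWithHelper(
  bans: Map<string, BanRecord>, peerId: string, score: number, now: number, durationMs: number,
): boolean {
  if (getActiveBanScore(bans, peerId, now) !== undefined) return false;
  bans.set(peerId, { score, expiry: now + durationMs });
  return true;
}
```

With a record that expired at t=100 still in the map, calling the has()-based version at t=200 refuses to ban, while the helper-based version replaces the stale record with a fresh window.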

…1157) The ban threshold (MIN_SCORE_BEFORE_BAN) is a hardcoded constant. If it changes across a software upgrade, a persisted ban score from the old threshold may no longer cross the new one, which would otherwise pin the peer at a stale floor for the rest of the ban window. restoreBannedPeers now drops (and prunes) any ban whose score is no longer below MIN_SCORE_BEFORE_BAN, in addition to expired bans.

…nts (A-1157) isNodeAllowedToConnect took `string | PeerId` and was called with either a peer id or an IP, conflating two keys with different rules (bans are peer-id only). Replace it with isPeerAllowedToConnect (ban + failed-auth, used by the encrypted inbound gater and dialing) and isAddressAllowedToConnect (failed-auth only, used by the raw inbound gater), sharing a private isWithinFailedAuthLimit helper.
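
A minimal sketch of the split described in this commit is shown below. The class name ConnectionGateSketch, the injected isBanned callback, the recordFailedAuth method, and the limit of 3 failed attempts are hypothetical; only isPeerAllowedToConnect, isAddressAllowedToConnect and isWithinFailedAuthLimit are names from the commit message.

```typescript
class ConnectionGateSketch {
  // Failed auth handshakes, keyed by either a peer id or an address.
  private failedAuthAttempts = new Map<string, number>();

  constructor(
    private readonly isBanned: (peerId: string) => boolean,
    private readonly maxFailedAuthAttempts = 3, // hypothetical limit
  ) {}

  recordFailedAuth(key: string): void {
    this.failedAuthAttempts.set(key, (this.failedAuthAttempts.get(key) ?? 0) + 1);
  }

  // Encrypted inbound gater and dialing: the peer id is known, so bans apply.
  isPeerAllowedToConnect(peerId: string): boolean {
    return !this.isBanned(peerId) && this.isWithinFailedAuthLimit(peerId);
  }

  // Raw inbound gater: only the address is known, so only failed auth is checked.
  isAddressAllowedToConnect(address: string): boolean {
    return this.isWithinFailedAuthLimit(address);
  }

  private isWithinFailedAuthLimit(key: string): boolean {
    return (this.failedAuthAttempts.get(key) ?? 0) < this.maxFailedAuthAttempts;
  }
}
```

Keeping the two entry points separate makes the two-stage gate explicit: an address can pass the raw inbound check while the banned peer id behind it is still refused at the encrypted inbound check.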

fcarreiro · 2026-06-08T09:11:19Z

+  private bannedPeers: Map<string, BanRecord> = new Map();
+  /** The kv-store backing bans, kept so ban pruning can run in a single transaction. */
+  private readonly kvStore?: AztecAsyncKVStore;
+  /** Backing store for bans, so they survive restarts. */


Is surviving restarts worth the hassle of a store and keeping things in sync? Assuming that restarts are not very common and that a previously banned peer would get itself banned again.

Perhaps not, I did consider it solely being in memory. I added the persistence just for completeness as I agree, it seems like something that will rarely come in use.

Personally I'd prefer to remove the DB persistence code, just because (1) more code means more chances for mistakes (2) once code gets added, it rarely goes away even if it could.

But I'm approving since I don't feel strongly

Have removed the persistence. It's reduced the changeset significantly.

…157) Adds the regression test recommended by GHSA-h4vv-85x5-6hmh: a peer banned via penalties must not be silently restored to Healthy once decayAllScores removes its decayed score entry. Fails against the pre-fix behaviour (peer reads back Healthy after ~hours of idle decay); passes now the ban is persisted.

fcarreiro · 2026-06-08T14:15:38Z

+   * lifted (removed in memory and in the store) before returning undefined, so callers never see a
+   * stale ban.
+   */
+  private getActiveBanScore(peerId: string): number | undefined {


Is this the only point at which an expired ban is removed from the maps? Could the maps potentially grow indefinitely if we lose track of a peerId and never again ask getActiveBanScore on it?

Yes, this should now be fixed.

fcarreiro · 2026-06-08T14:24:49Z

+      expect(peerManager.isPeerAllowedToConnect(peerIdStr)).toBe(false);
+      expect(peerManager.isAddressAllowedToConnect(ipAddress)).toBe(false);
+    });
+


Should/do we have a test for the case where the address is allowed to connect, but then we realize it's a known peer and we don't allow connection?

Have added one under 'allows the address but denies the banned peer id'.

Per review feedback: persisting bans across restarts isn't worth the extra code and sync surface for something that will rarely matter — a still-bad peer simply re-earns its ban. Remove the kv-store wiring, SerialQueue, restoreBannedPeers/flushBanPersistence and the PeerScoring.new factory; bans are now held in memory only and cleared on restart. The configurable ban duration (P2P_PEER_BAN_DURATION_SECONDS, default 24h) and gater enforcement are unchanged. Also add a test for the two-stage connection gate: the raw-inbound gate allows an address (no peer id to match a ban) while the encrypted-inbound gate denies the banned peer id.

Expired bans were only pruned lazily when a peer's score was next queried, so a banned peer that disconnects and is never queried again would linger in the ban map. Add PeerScoring.pruneExpiredBans(), called from PeerManager.heartbeat alongside decayAllScores, to drop elapsed bans and bound the map's size.

…g iteration (A-1157)

fcarreiro

LGTM

PhilWindle added 8 commits June 6, 2026 15:19

Comments and tests

fe733ab

fcarreiro reviewed Jun 8, 2026

fcarreiro approved these changes Jun 8, 2026

PhilWindle added 3 commits June 8, 2026 21:48

test: assert ban scores and prune expired bans without mutating durin…

252ad52

…g iteration (A-1157)

fcarreiro approved these changes Jun 10, 2026

fcarreiro merged commit 024cd37 into merge-train/spartan-v5 Jun 10, 2026
19 of 20 checks passed

fcarreiro deleted the pw/ban-peers branch June 10, 2026 12:29

AztecBot mentioned this pull request Jun 10, 2026

feat: merge-train/spartan-v5 #23975

Open

fcarreiro merged 12 commits into merge-train/spartan-v5 from pw/ban-peers
