feat: improve blob download from filestores #22096

Merged: spalladino merged 4 commits into merge-train/spartan from mr/blob-filestores-download on Mar 27, 2026
@mrzeszutko commented Mar 27, 2026

Summary

Restructures blob retrieval in HttpBlobClient to fix timing issues caused by redundant retries and improve configurability for operators.

Problem: HttpFileStore retried all HTTP errors (including 404s) with a [1, 1, 3] backoff, wasting ~5s per blob hash on "not found" responses. This compounded badly with the outer retry loop in getBlobSidecar, leading to ~5-12s retrieval times when blobs weren't yet available in filestores.
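
The ~5s figure follows directly from the backoff schedule. A minimal sketch of that arithmetic, assuming the backoff values are seconds and that one delay is slept after each failed attempt (`worstCaseBackoffSeconds` is a hypothetical helper, not a function from the codebase):

```typescript
/**
 * Total time spent sleeping when every attempt fails.
 * Request latency itself is not counted.
 */
function worstCaseBackoffSeconds(backoff: readonly number[]): number {
  return backoff.reduce((total, delay) => total + delay, 0);
}

// Default schedule quoted in the PR description: 1 + 1 + 3 seconds.
console.log(worstCaseBackoffSeconds([1, 1, 3]));

// An empty schedule means no retries, so no time is spent waiting.
console.log(worstCaseBackoffSeconds([]));
```

Because this wait is paid per blob hash and sits inside the outer retry loop, a single "not found" blob can account for several multiples of it.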

Changes:

  • Configurable HttpFileStore retry/timeout (yarn-project/stdlib/src/file-store/http.ts): Added HttpFileStoreOptions with retryBackoff (empty array = no retry) and timeoutMs fields. Existing callers keep the default [1, 1, 3] retry. Blob filestore clients are created with retryBackoff: [] and a configurable timeout (default 10s via BLOB_FILE_STORE_TIMEOUT_MS), eliminating all inner retries.

  • Unified retry loop in getBlobSidecar (yarn-project/blob-client/src/client/http.ts): Replaced the 3-phase flow (filestore -> consensus -> filestore-with-retry -> archive) with a single retry loop that alternates between two sources (consensus and filestore), then falls back to archive. Historical sync uses a short [1, 1] backoff; near-tip uses [1, 1, 1, 2, 2] for eventual consistency.

  • Configurable source order: New BLOB_PREFER_FILESTORES env var (default: false) lets operators try filestores before consensus if their filestores are faster/more reliable.

  • Supernode detection persisted (testSources() now populates superNodeHostIndexes): Non-supernode consensus hosts are skipped during blob fetching, avoiding wasted requests to hosts that can't serve blob sidecars.

  • Consensus fetch no longer double-retries: fetchBlobSidecars now uses native fetch instead of the retry-wrapped this.fetch, since the outer retry loop handles transient errors.

  • Documentation updated (docs/docs-operate/operators/setup/blob_storage.md): New env vars documented, retrieval flow description updated.

  • Tests updated for the new fetch order, blobPreferFilestores order swap, and unified retry behavior.
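
The unified loop described above can be sketched roughly as follows. This is an illustration only, not the code in `http.ts`: `getBlobsSketch`, `BlobSource`, `RetrieveOptions`, and the injectable `sleep` are hypothetical names, and it assumes that each round tries both sources in the configured order and that backoff values are seconds.

```typescript
// A source receives the still-missing blob hashes and returns those it could serve.
type BlobSource = (blobHashes: string[]) => Promise<string[]>;

interface RetrieveOptions {
  consensus: BlobSource;
  filestore: BlobSource;
  archive: BlobSource;
  preferFilestores: boolean; // mirrors BLOB_PREFER_FILESTORES
  isHistoricalSync: boolean;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Backoff schedules quoted in the PR description (assumed to be seconds).
const HISTORICAL_BACKOFF_S = [1, 1];
const NEAR_TIP_BACKOFF_S = [1, 1, 1, 2, 2];

async function getBlobsSketch(hashes: string[], opts: RetrieveOptions): Promise<Set<string>> {
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)));
  const sources = opts.preferFilestores ? [opts.filestore, opts.consensus] : [opts.consensus, opts.filestore];
  const backoff = opts.isHistoricalSync ? HISTORICAL_BACKOFF_S : NEAR_TIP_BACKOFF_S;
  const found = new Set<string>();
  const missing = () => hashes.filter(h => !found.has(h));

  const trySource = async (source: BlobSource) => {
    try {
      for (const h of await source(missing())) found.add(h);
    } catch {
      // A failing source counts as "nothing found"; the outer loop handles retries.
    }
  };

  // One initial round plus one round per backoff entry.
  for (let attempt = 0; attempt <= backoff.length; attempt++) {
    for (const source of sources) {
      if (missing().length === 0) return found;
      await trySource(source);
    }
    if (missing().length === 0) return found;
    if (attempt < backoff.length) await sleep(backoff[attempt] * 1000);
  }

  // Last resort once the retry budget is exhausted.
  await trySource(opts.archive);
  return found;
}
```

Keeping all waiting in this one loop is what makes the inner retries redundant: each source call is a single attempt, so the worst-case latency is bounded by the chosen schedule rather than by nested retries.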

Files changed

| File | Change |
| --- | --- |
| yarn-project/stdlib/src/file-store/http.ts | HttpFileStoreOptions type, configurable retry/timeout |
| yarn-project/stdlib/src/file-store/factory.ts | Pass options through createReadOnlyFileStore |
| yarn-project/stdlib/src/file-store/index.ts | Re-export HttpFileStoreOptions |
| yarn-project/blob-client/src/filestore/factory.ts | Pass { retryBackoff: [], timeoutMs } for blob filestores |
| yarn-project/blob-client/src/client/config.ts | blobPreferFilestores, blobFileStoreTimeoutMs config |
| yarn-project/blob-client/src/client/http.ts | Unified retry loop, supernode filtering, tryConsensusHosts(), resolveSlotNumber() |
| yarn-project/blob-client/src/client/interface.ts | Simplified isHistoricalSync docs |
| yarn-project/blob-client/src/client/http.test.ts | Updated tests for new flow |
| yarn-project/foundation/src/config/env_var.ts | New env var types |
| docs/docs-operate/operators/setup/blob_storage.md | New env vars, updated retrieval docs |
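
For orientation, here is a rough sketch of how options shaped like `{ retryBackoff: [], timeoutMs }` might drive a read. Only the two field names come from the PR description; `readWithOptions`, `readOnce`, `FetchLike`, and the injectable `sleep` are hypothetical, and the real `HttpFileStore` may be structured differently. It assumes backoff values are seconds and that a missing `retryBackoff` falls back to `[1, 1, 3]`.

```typescript
interface HttpFileStoreOptions {
  /** Delays in seconds between attempts; an empty array disables retries. */
  retryBackoff?: number[];
  /** Per-request timeout in milliseconds. */
  timeoutMs?: number;
}

const DEFAULT_RETRY_BACKOFF_S = [1, 1, 3];

// Minimal fetch-shaped function so the sketch can be exercised without a network.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<{ ok: boolean; status: number }>;
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = ms => new Promise<void>(resolve => setTimeout(resolve, ms));

/** Performs one request, aborting it if timeoutMs elapses first. */
async function readOnce(url: string, fetchFn: FetchLike, timeoutMs?: number): Promise<void> {
  const controller = new AbortController();
  const timer = timeoutMs === undefined ? undefined : setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchFn(url, { signal: controller.signal });
    if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

/** Returns the number of requests made before one succeeded; rethrows the last error otherwise. */
async function readWithOptions(
  url: string,
  fetchFn: FetchLike,
  options: HttpFileStoreOptions = {},
  sleep: Sleep = realSleep,
): Promise<number> {
  const backoff = options.retryBackoff ?? DEFAULT_RETRY_BACKOFF_S;
  for (let attempt = 0; ; attempt++) {
    try {
      await readOnce(url, fetchFn, options.timeoutMs);
      return attempt + 1;
    } catch (err) {
      if (attempt >= backoff.length) throw err; // retry budget exhausted
      await sleep(backoff[attempt] * 1000);
    }
  }
}
```

With `retryBackoff: []`, a 404 surfaces after a single request, which matches the behavior the blob filestore clients are configured for; callers that omit the option keep retrying on the default schedule.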

Fixes A-880

@spalladino spalladino left a comment

Looks good! Just a small optimization.

Comment thread yarn-project/blob-client/src/client/http.ts Outdated
@spalladino spalladino enabled auto-merge (squash) March 27, 2026 14:06
@spalladino spalladino merged commit 2e33a21 into merge-train/spartan Mar 27, 2026
11 checks passed
@spalladino spalladino deleted the mr/blob-filestores-download branch March 27, 2026 14:19
@AztecBot

❌ Failed to cherry-pick to v4-next due to conflicts.

AztecBot added a commit that referenced this pull request Mar 27, 2026
Cherry-pick of 2e33a21 with conflicts in:
- yarn-project/blob-client/src/client/http.ts
- yarn-project/blob-client/src/client/http.test.ts
AztecBot added a commit that referenced this pull request Mar 27, 2026
AztecBot added a commit that referenced this pull request Mar 27, 2026
- Updated interface.ts with new GetBlobSidecarOptions fields
- Updated getSlotNumber with parentBeaconBlockRoot and l1BlockTimestamp params
- Updated getBlobsFromHost and fetchBlobSidecars with blobHashes param
- Added fetchBeaconConfig method for timestamp-based slot resolution
- Updated parseBlobJson to use new blobs API format
- Updated test mock servers for new /eth/v1/beacon/blobs/ endpoint
github-merge-queue Bot pushed a commit that referenced this pull request Mar 27, 2026
BEGIN_COMMIT_OVERRIDE
fix: only clear provenBlockNumber when it exceeds prune point (#21946)
chore: (A-779) load all accounts before calling
LogService.#getSecretsForSenders (#21923)
fix: align staging-public mana target with testnet/mainnet (#21983)
chore: (A-777) add warn logs for regressive path in block synchronizer
(#21925)
fix: fully validate txs retrieved from tx file store (#21988)
refactor: extract checkpoint proposal handling to ProposalHandler
(#21999)
fix: unbounded memory in calldataRetriever (#22004)
fix(p2p): check peer rate limit before global to prevent quota
starvation (#21997)
fix(p2p): evict expired failed-auth-handshake entries on heartbeat
(#21992)
chore: defensively handle skipPushProposedBlocksToArchiver (#22017)
chore: bump testnet prover resource profile to prod-hi-tps (#22019)
chore: (A-835) remove unused serializer (#22037)
fix(p2p): remove disconnected peers from scoring maps (#22009)
fix(e2e): set anvilSlotsInAnEpoch in slashing tests (#21869)
fix(ethereum): Audit fixes A-810, A-812 (nonce race, isEscapeHatchOpen
logging) (#21948)
chore: remove old TxPool implementation (#22028)
fix: Fix blob encoding when uploaded from proposals (#22045)
chore: Adds /cycle and /fix skills. Also configures linear mcp server
(#22043)
chore: remove validatorReexecute config option (#22024)
fix(sequencer): use last L1 slot of L2 slot as eth_simulateV1 timestamp
(#22023)
docs(simulator): clarify teardown gas billing is intentional (#22057)
chore: revert account loading optimization in log service (#22062)
fix: use DateProvider in PeerScoring (#22070)
fix(aztec.js): preserve extraHashedArgs in DeployMethod.with() (#22053)
fix(p2p): replace process.exit() with graceful shutdown in worker
cleanup (#22046)
chore: merge next (#22089)
fix(stdlib): correct NoteDao size (#22068)
feat: improve blob download from filestores (#22096)
fix: remove stale tx_pool v1 benchmark reference (#22104)
END_COMMIT_OVERRIDE