
fix(pkg/da): fallback to polling when ws cannot connect #3211

Merged
julienrbrt merged 3 commits into main from julien/p2p-only
Mar 30, 2026

Conversation

@julienrbrt
Member

@julienrbrt julienrbrt commented Mar 30, 2026

Overview

Follow-up to #3146.
Requiring DA to be available at connection time effectively means the node cannot start unless DA is already working.
That is a regression: we should allow DA to be unavailable at startup. The trade-off is that, if the WS connection fails at startup, the node relies entirely on polling instead of WS, even once DA is back up.

No changelog needed because not released.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced blob data availability client initialization with automatic fallback from WebSocket to HTTP polling when WebSocket endpoint is unavailable, ensuring service continuity and improved reliability.
  • Improvements

    • Improved logging throughout blob client creation across all services for better diagnostics and troubleshooting during connection initialization.

@github-actions
Contributor

github-actions bot commented Mar 30, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Mar 30, 2026, 3:21 PM |

@claude
Contributor

claude bot commented Mar 30, 2026

Claude finished @julienrbrt's task in 1m 35s


Code Review

  • Gather context
  • Read changed files
  • Analyze changes
  • Post review

The fix is correct and addresses the regression cleanly. A few observations:

pkg/da/jsonrpc/client.go

Function doc comment is misleading after fallback (client.go:72-76):

// Supports channel-based subscriptions (e.g. Subscribe).
// Note: WebSocket connections are eager — they connect at creation time
// if the initial WS dial fails, falls back to HTTP polling for the entire session.

The header says "Supports channel-based subscriptions (e.g. Subscribe)", but in the fallback path the returned client is HTTP-based and Subscribe will silently fail at call time. It is worth clarifying that subscription support is only guaranteed after a successful WS connection.

Missing test for fallback behavior: client_test.go tests NewClient directly, but there is no test for NewWSClient that exercises the path where WS fails and the client falls back to HTTP. Given that this is the core behavioral change of the PR, a test that spins up an HTTP server (without WS support) and verifies that NewWSClient succeeds and falls back would strengthen confidence here.

apps/evm/cmd/run.go

Pre-existing: two DA clients created (not introduced by this PR but worth flagging separately). run.go:62-68 creates a blobClient/daClient that is passed to createSequencer, and then StartNode at pkg/cmd/run_node.go:152 creates a second blobClient/daClient independently. This means two WS connections (or HTTP connections after fallback) are opened. The EVM app is the only one with this pattern; grpc and testapp create the blob client only inside createSequencer and don't call StartNode's internal one additionally. This looks like it could be an existing structural issue worth tracking.


Overall: The change is small, correct, and fixes a real regression. The fallback error is properly propagated, the logger is consistently threaded through all callsites, and the intent is clear. The two minor points above (doc clarity and a missing test) are non-blocking.

@coderabbitai
Contributor

coderabbitai bot commented Mar 30, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3ecc2f7-c131-4c8c-920f-382209ff7625

📥 Commits

Reviewing files that changed from the base of the PR and between 53ec7cb and 102f297.

📒 Files selected for processing (1)
  • pkg/da/jsonrpc/client.go
Walkthrough

Multiple application entry points and the blob DA JSON-RPC client are updated to pass a configured logger parameter to NewWSClient instead of an empty string. The NewWSClient signature now accepts a logger and implements fallback logic: try WebSocket first, warn on error, then use an HTTP (polling) client.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Call Sites for NewWSClient: apps/evm/cmd/run.go, apps/grpc/cmd/run.go, apps/testapp/cmd/run.go, pkg/cmd/run_node.go | Call sites updated to pass the configured logger into blobrpc.NewWSClient instead of "". No other control-flow changes at call sites. |
| NewWSClient Implementation: pkg/da/jsonrpc/client.go | Function signature changed to accept logger zerolog.Logger. Attempts WebSocket client creation first; on WS creation error logs a warning and falls back to creating a non-WebSocket (HTTP/polling) client instead of returning the WS error. |

Sequence Diagram

sequenceDiagram
    actor Caller
    participant NewWSClient
    participant WSEndpoint as WebSocket<br/>Endpoint
    participant Logger
    participant HTTPClient as HTTP Client<br/>(Fallback)

    Caller->>NewWSClient: NewWSClient(ctx, logger, addr, token, authHeaderName)
    NewWSClient->>WSEndpoint: Attempt WebSocket connection<br/>via httpToWS(addr)
    alt WebSocket Connection Succeeds
        WSEndpoint-->>NewWSClient: Connected
        NewWSClient-->>Caller: Return WebSocket Client
    else WebSocket Connection Fails
        WSEndpoint-->>NewWSClient: Connection Error
        NewWSClient->>Logger: Log Warning
        Logger-->>NewWSClient: Logged
        NewWSClient->>HTTPClient: Create HTTP Client<br/>via NewClient(addr)
        HTTPClient-->>NewWSClient: HTTP Client Created
        NewWSClient-->>Caller: Return HTTP Client (Fallback)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • alpe
  • tac0turtle

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: adding a fallback to polling when WebSocket connection to DA fails, which matches the core objective of this PR. |
| Description check | ✅ Passed | The description provides context referencing the previous PR, explains the regression, and justifies the fix. It covers the overview section adequately with background and rationale. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@julienrbrt julienrbrt merged commit 8d68f9d into main Mar 30, 2026
14 of 17 checks passed
@julienrbrt julienrbrt deleted the julien/p2p-only branch March 30, 2026 15:21
@codecov

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.41%. Comparing base (146e6e1) to head (102f297).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/da/jsonrpc/client.go | 0.00% | 6 Missing ⚠️ |
| pkg/cmd/run_node.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3211      +/-   ##
==========================================
- Coverage   61.43%   61.41%   -0.02%     
==========================================
  Files         120      120              
  Lines       12470    12474       +4     
==========================================
  Hits         7661     7661              
- Misses       3949     3953       +4     
  Partials      860      860              
| Flag | Coverage Δ |
| --- | --- |
| combined | 61.41% <0.00%> (-0.02%) ⬇️ |
