
feat(realtime): block TCP ICE candidates by default #151

Open
nagar-decart wants to merge 2 commits into main from feat/disable-tcp-ice-by-default


@nagar-decart nagar-decart commented May 31, 2026

Summary

  • The SDK blocks TCP ICE candidates by default. WebRTC media over TCP suffers from head-of-line blocking that produces visible stalls — UDP-only either works well or fails fast, both of which are better than limping along on TCP.
  • Adds allowTcpIce?: boolean to RealTimeClientConnectOptions and SubscribeOptions. Callers who insist on TCP (e.g. enterprise networks where outbound UDP is blocked) opt in by passing true.
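
A minimal sketch of the default semantics described above, assuming the option is read as "blocked unless explicitly `true`". The `TcpIceOption` interface and `shouldBlockTcpIce` helper are hypothetical names for illustration; only `allowTcpIce` itself comes from this PR.

```typescript
// Hypothetical stand-in for the field added to RealTimeClientConnectOptions
// and SubscribeOptions. Only the `allowTcpIce` name is taken from the PR.
interface TcpIceOption {
  allowTcpIce?: boolean;
}

// TCP stays blocked unless the caller explicitly passes `allowTcpIce: true`.
function shouldBlockTcpIce(options: TcpIceOption = {}): boolean {
  return options.allowTcpIce !== true;
}

console.log(shouldBlockTcpIce());                       // true (default)
console.log(shouldBlockTcpIce({ allowTcpIce: false })); // true
console.log(shouldBlockTcpIce({ allowTcpIce: true }));  // false (opt-in)
```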

Motivation

Last 48h of livekit-server "participant active" @participant:client-*:

| Transport | Count | % |
|---|---|---|
| UDP direct | 53,291 | 88.9% |
| TCP direct (LiveKit :7881) | 6,299 | 10.5% |
| TURN-UDP | 355 | 0.6% |
| TURN-TCP / TLS | 0 | 0% |

Inside the 10.5% TCP-direct slice we sampled 473 sessions and parsed publisherCandidates:

  • 90.1% had a udp srflx candidate (STUN responded over UDP) — so the client's network does carry UDP. The SFU-specific UDP path failed connectivity checks, and ICE fell back to TCP. The session then carried real-time media under TCP HOL blocking — exactly the choppy-playback pattern reported by customers.
  • Only 4.2% had zero UDP candidates gathered — these are the clients who genuinely can't do UDP and need TCP/TLS:443 to connect at all. They keep working via allowTcpIce: true.
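
The bucketing above can be sketched roughly as follows. This is an illustration, not the script used for the analysis: it assumes each entry in publisherCandidates is a standard ICE `candidate:` attribute string, which may not match the actual log format, and `parseCandidate` / `classifySession` are hypothetical names.

```typescript
type SessionClass = "udp-srflx" | "udp-other" | "no-udp";

interface ParsedCandidate {
  transport: string; // "udp" or "tcp", lower-cased
  type: string;      // "host", "srflx", "prflx" or "relay", lower-cased
}

// Layout: candidate:<foundation> <component> <transport> <priority>
//         <address> <port> typ <type> [extensions...]
function parseCandidate(line: string): ParsedCandidate | null {
  const tokens = line.trim().split(/\s+/);
  const typIndex = tokens.indexOf("typ");
  if (tokens.length < 8 || typIndex < 0 || typIndex + 1 >= tokens.length) {
    return null; // not a well-formed candidate line
  }
  return {
    transport: (tokens[2] ?? "").toLowerCase(),
    type: (tokens[typIndex + 1] ?? "").toLowerCase(),
  };
}

// "udp-srflx": STUN answered over UDP, so the network does carry UDP.
// "no-udp":    no UDP candidate gathered at all; TCP is the only option.
function classifySession(candidates: string[]): SessionClass {
  const udp = candidates
    .map(parseCandidate)
    .filter((c): c is ParsedCandidate => c !== null)
    .filter((c) => c.transport === "udp");
  if (udp.some((c) => c.type === "srflx")) return "udp-srflx";
  return udp.length > 0 ? "udp-other" : "no-udp";
}
```

Sessions that land in neither of the two reported buckets (for example, UDP host candidates only) fall into `udp-other` in this sketch.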

Implementation

packages/sdk/src/realtime/webrtc-ice-filter.ts — pure helpers + a reference-counted install that swaps globalThis.RTCPeerConnection for a Proxy while a session is open. TCP candidates can reach a PC through three independent paths, so the filter closes each:

  1. RTCConfiguration.iceServers — strip ?transport=tcp URLs and any turns: (TLS-over-TCP).
  2. setRemoteDescription SDP — strip a=candidate:N M TCP ... lines so the SFU's TCP host candidate is never paired against ours.
  3. addIceCandidate — drop trickled TCP candidates arriving after the initial SDP.
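
The three filters above can be sketched as pure helpers along these lines. This is an illustration under assumptions, not the contents of webrtc-ice-filter.ts: the function names and the `IceServerLike` type are hypothetical, and the hosts in the usage note are placeholders.

```typescript
// Minimal structural type so the sketch compiles without DOM typings.
interface IceServerLike {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Path 1: `turns:` is TLS over TCP; `?transport=tcp` requests TURN over TCP.
function isTcpIceUrl(url: string): boolean {
  const lower = url.trim().toLowerCase();
  if (lower.startsWith("turns:")) return true;
  const q = lower.indexOf("?");
  const query = q >= 0 ? lower.slice(q + 1) : "";
  return query.split("&").includes("transport=tcp");
}

// Drops TCP URLs, and drops a server entirely once it has no URLs left.
function filterIceServers(servers: IceServerLike[]): IceServerLike[] {
  const kept: IceServerLike[] = [];
  for (const server of servers) {
    const all = Array.isArray(server.urls) ? server.urls : [server.urls];
    const urls = all.filter((u) => !isTcpIceUrl(u));
    if (urls.length > 0) kept.push({ ...server, urls });
  }
  return kept;
}

// Path 3: the transport is the third space-separated field of a candidate.
function isTcpCandidate(candidate: string): boolean {
  const [first = "", , transport = ""] = candidate.trim().split(/\s+/);
  return first.includes("candidate:") && transport.toLowerCase() === "tcp";
}

// Path 2: remove `a=candidate:` lines whose transport is TCP, keeping the
// line ending style of the original SDP.
function stripTcpCandidatesFromSdp(sdp: string): string {
  const eol = sdp.includes("\r\n") ? "\r\n" : "\n";
  return sdp
    .split(/\r?\n/)
    .filter((line) => !(line.startsWith("a=candidate:") && isTcpCandidate(line)))
    .join(eol);
}
```

For example, `filterIceServers` applied to a server listing `turn:turn.example.com:3478?transport=udp`, `turn:turn.example.com:3478?transport=tcp` and `turns:turn.example.com:443` would keep only the first URL.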

Threaded through client.ts → stream-session.ts → media-channel.ts (publisher) and directly in subscribe-client.ts (subscriber).

Test plan

  • pnpm typecheck clean
  • pnpm test — 235/235 passing (22 new unit tests cover URL parsing, candidate-string parsing, SDP filtering, RTCConfiguration filtering, ref-counted install/uninstall, the constructor proxy, addIceCandidate filtering, and setRemoteDescription SDP munging)
  • pnpm lint + pnpm format:check clean
  • Manual: load packages/sdk/index.html against staging, connect, and verify in chrome://webrtc-internals that the selected candidate pair is UDP and that no TCP candidates appear in the gathered list.
  • Manual: set allowTcpIce: true, confirm TCP candidates reappear (opt-in works).
  • Smoke against the Korean customer's network (tester URL) — expected: UDP succeeds and quality is good, or the connection fails cleanly (no TCP-stall regime).

Rollout considerations

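  • Behavioural change. Clients whose only viable path was TCP-direct will now fail to connect. Telemetry suggests this is about 4.2% of the sampled TCP-direct sessions (those with zero UDP candidates), which is roughly 0.4% of all currently-successful sessions if the sample is representative; those callers can keep TCP via allowTcpIce: true.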
  • We have zero TURN-TLS adoption in 48h, so dropping turns:turn.decart.ai:443 from the gathered list doesn't regress any live traffic — but it removes the theoretical "UDP blocked everywhere → TURN-TLS:443" fallback for callers who don't pass allowTcpIce: true.
  • The patch monkey-patches globalThis.RTCPeerConnection — apps embedding this SDK alongside their own WebRTC code will see the patch active during a realtime session. The ref count restores the original constructor when the last session disconnects.
  • The belt to go with these braces is a LiveKit-side change (livekit.rtc.tcp_port: 0 on the server), so that old SDK versions and other clients also stop seeing the SFU's TCP host candidate. This SDK change protects new builds and gives callers the per-session allowTcpIce opt-in.
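
The reference-counted monkey-patch mentioned in this list can be sketched as below. This is a simplified illustration rather than the code in the PR: `createIceFilterInstaller`, `PeerConnectionHost` and `transformConfig` are hypothetical names, and the host object is passed in (instead of hard-coding `globalThis`) so the sketch also runs where no `RTCPeerConnection` exists.

```typescript
// Constructor shape shared by the real RTCPeerConnection and test doubles.
type PeerConnectionCtor = new (...args: any[]) => object;

// Anything exposing an RTCPeerConnection slot, e.g. globalThis in a browser.
interface PeerConnectionHost {
  RTCPeerConnection?: PeerConnectionCtor;
}

function createIceFilterInstaller(
  host: PeerConnectionHost,
  transformConfig: (config: unknown) => unknown,
): () => () => void {
  let refCount = 0;
  let original: PeerConnectionCtor | undefined;

  return function install(): () => void {
    if (refCount === 0) {
      const ctor = host.RTCPeerConnection;
      if (!ctor) return () => {}; // no WebRTC available: nothing to patch
      original = ctor;
      // The Proxy rewrites only the first constructor argument; prototype
      // lookups and static members fall through to the original class.
      host.RTCPeerConnection = new Proxy(ctor, {
        construct(target, args, newTarget) {
          const [config, ...rest] = args;
          return Reflect.construct(target, [transformConfig(config), ...rest], newTarget);
        },
      });
    }
    refCount += 1;

    let released = false;
    return function uninstall(): void {
      if (released) return; // make double release harmless
      released = true;
      refCount -= 1;
      if (refCount === 0 && original) {
        host.RTCPeerConnection = original; // last session gone: restore
        original = undefined;
      }
    };
  };
}
```

In a browser this would be called with `globalThis` as the host and a config transform that filters `iceServers`; each session keeps the returned `uninstall` function and calls it on disconnect.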

🤖 Generated with Claude Code

Production telemetry over 48h shows ~10.5% of successfully-connected
clients land on TCP-direct to LiveKit `:7881` instead of UDP, even
though 90% of them had a working `udp srflx` candidate. TCP fallback
carries media under head-of-line blocking, causing the freezes and
choppy playback customers report.

This patch wraps `globalThis.RTCPeerConnection` while a session is open
and drops TCP candidates at three points (defence in depth):

  1. `RTCConfiguration.iceServers` — drop `?transport=tcp` and `turns:`
     URLs so the browser never gathers TURN-TCP/TLS candidates.
  2. `setRemoteDescription` — strip `a=candidate ... TCP ...` lines
     from the SFU's SDP so its TCP host candidate is never paired.
  3. `addIceCandidate` — drop trickled TCP candidates as a guard.

The patch is reference-counted so concurrent sessions cooperate and
the original `RTCPeerConnection` constructor is restored on disconnect.

Exposed as `allowTcpIce?: boolean` on `RealTimeClientConnectOptions`
and `SubscribeOptions`. Default `false` (TCP blocked). Set to `true`
only for clients with UDP fully blocked outbound (~4% in our data,
who would otherwise fail entirely without the TCP path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pkg-pr-new Bot commented May 31, 2026


npm i https://pkg.pr.new/@decartai/sdk@151

commit: b48ae5f

- Drop "defence in depth" wording in module header — it's a
  quality-of-experience filter, not a threat mitigation.
- Tighten the rationale paragraph and the three-paths-into-PC list.
- Lead the connect-option JSDoc with "quality choice" framing and
  describe the opt-in path for callers who genuinely need TCP.
- Trim the now-redundant MediaChannelConfig doc to point at the
  public option.

No behaviour change. Tests + typecheck + lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit b48ae5f. Configure here.


allowTcpIce: true silently ignored when concurrent session filters

Medium Severity

When allowTcpIce: true is passed but another concurrent session has already installed the filter with allowTcpIce: false, installIceFilter returns noop at line 64 without checking if the global RTCPeerConnection is currently the filtered proxy. The allowTcpIce: true session then creates its Room using the still-patched global, so TCP candidates are silently filtered despite the explicit opt-in. For example, a publisher using default allowTcpIce: false would cause a concurrent subscriber's allowTcpIce: true to be ignored entirely.
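
One possible way to surface this conflict instead of ignoring the opt-in silently is sketched below. It is not a fix proposed in this PR and does not reflect the real `installIceFilter` signature: `IceFilterRegistry`, `IceLease` and `optInIgnored` are hypothetical names. The sketch only detects the case described here (an opt-in session starting while a blocking session is active); it does not make a global patch per-session, and it does not catch the reverse ordering.

```typescript
interface IceLease {
  // True when this session asked for TCP while a blocking session was active,
  // i.e. its peer connections will still be created through the filter.
  readonly optInIgnored: boolean;
  release(): void;
}

class IceFilterRegistry {
  private blockingSessions = 0;

  acquire(allowTcpIce: boolean): IceLease {
    const optInIgnored = allowTcpIce && this.blockingSessions > 0;
    if (!allowTcpIce) this.blockingSessions += 1;

    let released = false;
    return {
      optInIgnored,
      release: () => {
        if (released) return; // double release is a no-op
        released = true;
        if (!allowTcpIce) this.blockingSessions -= 1;
      },
    };
  }
}

// Publisher uses the default (TCP blocked); subscriber opts in to TCP.
const registry = new IceFilterRegistry();
const publisherLease = registry.acquire(false);
const subscriberLease = registry.acquire(true);
if (subscriberLease.optInIgnored) {
  console.warn("allowTcpIce: true has no effect while another session blocks TCP");
}
subscriberLease.release();
publisherLease.release();
```

Whether to warn, throw, or let the opt-in win for all concurrent sessions is a product decision that this sketch leaves open.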


