Skip to content

fix(voip): heal stale SDK socket on warm lock-screen accept#7293

Closed
diegolmello wants to merge 1 commit into
feat.voip-lib-newfrom
fix/voip-warm-accept-sdk-reopen
Closed

fix(voip): heal stale SDK socket on warm lock-screen accept#7293
diegolmello wants to merge 1 commit into
feat.voip-lib-newfrom
fix/voip-warm-accept-sdk-reopen

Conversation

@diegolmello

@diegolmello diegolmello commented May 4, 2026

Copy link
Copy Markdown
Member

Proposed changes

When an iOS user has the app warm in the background, locks the device, then accepts an incoming call from the lock screen, the local UI shows the call as connected but the remote peer keeps trying to connect for ~10s and then fails. Killed-app accept (cold start via PushKit) works.

Root cause: the SDK's DDP websocket can go stale during background suspension. On warm lock-screen accept, nothing nudges it back to life:

  • appHasComeBackToForeground only runs on AppState='active'. The device stays locked through CallKit accept, so the saga never fires.
  • The SDK's own reopen() only kicks in on an explicit onClose event; iOS suspended sockets often go zombie without firing close.
  • Inside the dead-socket window, mediaCall.accept() resolves locally (CallKit shows "connected") but the answer SDP and ICE candidates queued in setSendSignalFn → sdk.methodCall('stream-notify-user', …) block forever in ddp.send() waiting on the 'open' event.

Fix: call sdk.current?.checkAndReopen?.() before each signal send. It is a no-op when the socket is alive; otherwise it triggers the existing reconnect → relogin (connectedListener in connect.ts:112) → resubscribe (subscribeAll inside ddp.login) chain so signaling can flow over a fresh DDP session.

Issue(s)

VoIP warm lock-screen accept regression discovered while testing #7283 / #7281 follow-up scenarios.

How to test or reproduce

  1. Install a Release build (debug masks the bug — debugger keeps WS alive).
  2. Sign in, wait for the app to fully connect to the workspace.
  3. Lock the device with the app in the background.
  4. Have another user place a VoIP call to you.
  5. Accept from the lock screen.

Before: local CallKit shows connected; remote stays on "calling…" for ~10s and drops.
After: call connects on both ends; audio flows.

Cold-start path (kill app → call → accept from lock screen) should remain unaffected (regression check).

Screenshots

n/a

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves a current function)
  • New feature (non-breaking change which adds functionality)
  • Documentation update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works (if applicable)
  • I have added necessary documentation (if applicable)
  • Any dependent changes have been merged and published in downstream modules

Further comments

A regression test for this exact scenario is hard without an integration harness that simulates iOS suspension + CallKit accept; the bug only manifests on a real device with a Release build (debug attaches the JS debugger which keeps the socket alive). Suggested follow-up: add a unit assertion that setSendSignalFn calls checkAndReopen before methodCall, plus a manual QA item on the lock-screen accept flow.

Stronger alternative considered: full mediaSessionInstance.reset() + init() on VoipAcceptSucceeded. Rejected for blast radius — the minimal fix is co-located with the actual send site so any background-resume signaling path benefits, not just the iOS warm-locked accept.

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability of accepting calls on the lock screen by ensuring VoIP signaling connections remain stable during call initiation.

When the device stays locked through CallKit accept, the foreground saga
never runs (AppState never reaches 'active'), so nothing nudges the SDK
to reopen its DDP socket if it went stale during background suspension.
The local accept resolves and CallKit shows "connected", but answer SDP
and ICE candidates can never reach the peer because send() blocks waiting
on a socket that won't reopen on its own.

Call checkAndReopen() before each signal send. It is a no-op when the
socket is alive; otherwise it triggers the existing reconnect → relogin
→ resubscribe chain so signaling can flow over a fresh DDP session.
@diegolmello diegolmello had a problem deploying to approve_e2e_testing May 4, 2026 13:44 — with GitHub Actions Failure
@coderabbitai

coderabbitai Bot commented May 4, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73cc19d8-b096-41ea-b0bd-a199017980cd

📥 Commits

Reviewing files that changed from the base of the PR and between 4707489 and e3db722.

📒 Files selected for processing (1)
  • app/lib/services/voip/MediaSessionInstance.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ESLint and Test / run-eslint-and-test
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,jsx,tsx}: Use descriptive names for functions, variables, and classes that clearly convey their purpose
Write comments that explain the 'why' behind code decisions, not the 'what'
Keep functions small and focused on a single responsibility
Use const by default, let when reassignment is needed, and avoid var
Prefer async/await over .then() chains for handling asynchronous operations
Use explicit error handling with try/catch blocks for async operations
Avoid deeply nested code; refactor complex logic into helper functions

Files:

  • app/lib/services/voip/MediaSessionInstance.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Use TypeScript for type safety; add explicit type annotations to function parameters and return types
Prefer interfaces over type aliases for defining object shapes in TypeScript
Use enums for sets of related constants rather than magic strings or numbers

**/*.{ts,tsx}: Use TypeScript with strict mode enabled and baseUrl set to app/ for module imports
Support iOS 13.4+ and Android 6.0+ as minimum target platforms

Files:

  • app/lib/services/voip/MediaSessionInstance.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Use tabs for indentation with single quotes, 130 character line width, no trailing commas, and avoid arrow function parentheses when possible
Use ESLint with @rocket.chat/eslint-config base including React, React Native, TypeScript, and Jest plugins

Files:

  • app/lib/services/voip/MediaSessionInstance.ts
app/lib/services/voip/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Implement VoIP features in app/lib/services/voip/ directory using Zustand stores for WebRTC peer-to-peer audio calls with native CallKit (iOS) and Telecom (Android) integration

Files:

  • app/lib/services/voip/MediaSessionInstance.ts
🧠 Learnings (1)
📚 Learning: 2026-04-30T17:07:51.020Z
Learnt from: diegolmello
Repo: RocketChat/Rocket.Chat.ReactNative PR: 7274
File: app/lib/services/voip/MediaCallEvents.ts:0-0
Timestamp: 2026-04-30T17:07:51.020Z
Learning: In this Rocket.Chat React Native codebase, the ESLint rule `no-void: error` is enforced. When you see a promise returned from an async call that is not awaited (a “floating promise”), do not silence it with the `void somePromise()` pattern. Instead, handle the promise explicitly by attaching `.catch(...)` (or otherwise awaiting/handling the error) so unhandled-rejection risks are addressed in a way that satisfies the existing ESLint configuration.

Applied to files:

  • app/lib/services/voip/MediaSessionInstance.ts
🔇 Additional comments (1)
app/lib/services/voip/MediaSessionInstance.ts (1)

93-100: Looks good — the reconnect check is in the right place.

Calling sdk.current?.checkAndReopen?.() immediately before the DDP signal send is a clean fix for the stale-socket path, and it stays a no-op when the connection is already healthy.


Walkthrough

A socket-reconnection check is inserted into the media signaling callback to ensure a fresh DDP session exists before sending signaling messages during lock-screen call accepts, preventing stale socket errors.

Changes

Socket Reconnection Before Signaling

Layer / File(s) Summary
Core Logic
app/lib/services/voip/MediaSessionInstance.ts
sdk.current?.checkAndReopen?.() is called within mediaSessionStore.setSendSignalFn before sdk.methodCall('stream-notify-user', ...) to heal stale sockets prior to signaling.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Suggested labels

type: bug

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: calling sdk.current?.checkAndReopen?.() to heal a stale SDK socket during warm lock-screen VoIP accept scenarios.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@diegolmello diegolmello temporarily deployed to experimental_ios_build May 4, 2026 13:49 — with GitHub Actions Inactive
@diegolmello diegolmello had a problem deploying to official_ios_build May 4, 2026 13:49 — with GitHub Actions Failure
@diegolmello diegolmello had a problem deploying to official_android_build May 4, 2026 13:49 — with GitHub Actions Failure
@diegolmello diegolmello had a problem deploying to experimental_android_build May 4, 2026 13:49 — with GitHub Actions Failure
@github-actions

github-actions Bot commented May 4, 2026

Copy link
Copy Markdown

iOS Build Available

Rocket.Chat Experimental 4.72.0.108747

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant