feat(sdk): resync missed events on WebSocket reconnect #176
The engine already stamps every delivered event with a monotonic
agent_seq, keeps a 500-event resync ring per agent, and falls back to a
DB-backed replay for larger gaps — but no shipping client used it, so
every disconnect window was silent event loss.
WsClient now tracks the highest agent_seq seen (read from the raw frame,
since schema parsing strips unknown keys) and, after each reconnect once
open handlers have re-subscribed, sends
{type: "resync", last_seen_seq, since}. Replayed events flow through the
normal dispatch path, deduplicated by stable event id, and the server's
resync_ack surfaces as a new "resynced" lifecycle event — exposed as
on.resynced(({replayed, gapDetected}) => ...) on RelayCast and
AgentClient. First connections behave exactly as before (no seq, no
resync frame).
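The tracking described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual internals: `SeqTracker`, `observe`, and `buildResyncFrame` are hypothetical names, and the sketch simply records the most recently received `agent_seq` together with the local receive time.

```typescript
// Hypothetical sketch; SeqTracker and its methods are illustrative names,
// not part of the SDK's public or internal API.
interface ResyncFrame {
  type: "resync";
  last_seen_seq: number;
  since: string; // ISO timestamp
}

class SeqTracker {
  private lastSeenSeq: number | null = null;
  private lastEventAt: string | null = null;

  // Inspect the raw frame text, because a strict schema parse would
  // drop the unknown `agent_seq` key before we could read it.
  observe(rawFrame: string, now: Date = new Date()): void {
    let parsed: unknown;
    try {
      parsed = JSON.parse(rawFrame);
    } catch {
      return; // not JSON, nothing to track
    }
    if (typeof parsed !== "object" || parsed === null) return;
    const seq = (parsed as Record<string, unknown>).agent_seq;
    if (typeof seq !== "number" || !Number.isFinite(seq)) return;
    this.lastSeenSeq = seq;
    this.lastEventAt = now.toISOString();
  }

  // Returns null on a first connection (no seq seen yet), so the caller
  // sends no resync frame and behaves exactly as before.
  buildResyncFrame(): ResyncFrame | null {
    if (this.lastSeenSeq === null || this.lastEventAt === null) return null;
    return {
      type: "resync",
      last_seen_seq: this.lastSeenSeq,
      since: this.lastEventAt,
    };
  }
}
```

A caller would invoke `observe` for every incoming frame and, after a reconnect once the open handlers have run, send the result of `buildResyncFrame()` if it is not null.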
@relaycast/types gains the missing wire frame schemas: resync (client),
resync_ack (server), and the client-only resynced event.
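As a rough illustration of how the server frame relates to the client-side event, the sketch below uses plain TypeScript types and a hand-written guard instead of the package's real schemas. The field names follow the wire contract listed in the PR summary; the field types (numbers and a boolean), the `type: "resynced"` discriminator, and the helper names `isResyncAck` and `toResynced` are assumptions made for this example.

```typescript
// Hypothetical shapes; the real schemas live in @relaycast/types.
interface ResyncAckFrame {
  type: "resync_ack";
  last_seen_seq: number;
  current_seq: number;
  replayed: number;
  gap_detected: boolean;
}

interface ResyncedEvent {
  type: "resynced";
  replayed: number;
  gapDetected: boolean;
}

// Runtime check that an already-parsed frame looks like a resync_ack.
function isResyncAck(frame: unknown): frame is ResyncAckFrame {
  if (typeof frame !== "object" || frame === null) return false;
  const f = frame as Record<string, unknown>;
  return (
    f.type === "resync_ack" &&
    typeof f.last_seen_seq === "number" &&
    typeof f.current_seq === "number" &&
    typeof f.replayed === "number" &&
    typeof f.gap_detected === "boolean"
  );
}

// Map the snake_case wire frame to the camelCase lifecycle payload
// that on.resynced handlers receive.
function toResynced(ack: ResyncAckFrame): ResyncedEvent {
  return {
    type: "resynced",
    replayed: ack.replayed,
    gapDetected: ack.gap_detected,
  };
}
```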
Also adds the package README (install, RelayCast vs AgentClient
quickstart, reconnect/resync behavior, self-hosting).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
📝 Walkthrough
This PR implements a complete WebSocket reconnect resynchronization protocol for the TypeScript SDK. Clients now track server-issued sequence numbers, send resync requests after reconnects to replay missed events, deduplicate replays by stable event id, and emit a resynced lifecycle event. The feature is tested comprehensively and documented in README and changelog.
Sequence Diagram
sequenceDiagram
participant Client as WsClient
participant Server
participant App as Application
rect rgba(100, 200, 150, 0.5)
Note over Client,App: Initial Connection
Client->>Server: WebSocket connect
Server->>Client: open event
Client->>App: emit open event
App->>Client: subscribe handlers
Note over Client: No resync sent (no prior events)
end
rect rgba(100, 150, 200, 0.5)
Note over Client,App: Event Delivery & Disconnect
Server->>Client: message.created (agent_seq: 42)
Client->>Client: track agent_seq: 42
Client->>App: emit message event
App->>Client: (socket close)
Note over Client: Socket closed
end
rect rgba(200, 150, 100, 0.5)
Note over Client,App: Reconnect Resync Flow
Client->>Server: WebSocket reconnect
Server->>Client: open event
Client->>App: emit open event
App->>Client: subscribe handlers
Client->>Server: send resync {last_seen_seq: 42, since: timestamp}
Note over Server: Check for missed events between seq 42 and now
Server->>Client: event_1 (replay, id: msg_123)
Server->>Client: event_2 (new, id: msg_456)
Client->>Client: deduplicate msg_123 (seen before)
Client->>App: emit only msg_456
Server->>Client: resync_ack {replayed: 2, gap_detected: false}
Client->>Client: emit resynced event
Client->>App: on.resynced({replayed: 2, gapDetected: false})
end
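The deduplication step in the diagram can be sketched as a small bounded set of event ids. This is only an illustration: `SeenEventIds`, `firstSighting`, `dispatchNew`, and the capacity of 1000 are hypothetical, and the PR does not say how the SDK bounds its own id cache.

```typescript
// Hypothetical helper; the class name and default capacity are assumptions.
class SeenEventIds {
  private readonly ids = new Set<string>();
  private readonly capacity: number;

  constructor(capacity: number = 1000) {
    this.capacity = capacity;
  }

  // Returns true the first time an id is seen and false for a duplicate.
  firstSighting(id: string): boolean {
    if (this.ids.has(id)) return false;
    this.ids.add(id);
    if (this.ids.size > this.capacity) {
      // A Set iterates in insertion order, so the first value is the oldest.
      const oldest = this.ids.values().next().value;
      if (oldest !== undefined) this.ids.delete(oldest);
    }
    return true;
  }
}

// Live and replayed events share one dispatch path; duplicates are dropped
// by id, which works even when a replayed event carries no agent_seq.
function dispatchNew(events: { id: string }[], seen: SeenEventIds): string[] {
  return events.filter((e) => seen.firstSighting(e.id)).map((e) => e.id);
}
```

With `msg_123` delivered before the disconnect, replaying `msg_123` and `msg_456` through `dispatchNew` would emit only `msg_456`, as in the diagram.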
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b9e2dcee08
// `agent_seq` (stripped by schema parsing, so read it raw here).
if (typeof parsed.agent_seq === 'number' && Number.isFinite(parsed.agent_seq)) {
  this.lastSeenSeq = parsed.agent_seq;
  this.lastEventAt = new Date().toISOString();
Use a server-side cursor for DB replay
When the resync gap exceeds the 500-event ring, the server falls back to replayMissedEvents, which filters persisted messages/reactions with created_at > floor(since). This sets since from the SDK host's current receive time rather than from a server event timestamp or sequence-safe lower bound, so a client clock that is ahead of the server—or a last event received near the end of a second followed by more than 500 missed events in that same second—can cause the DB fallback to skip missed rows outside the ring and report the resync as complete. Use a server-derived timestamp/cursor, or otherwise bias since safely before the last seen event.
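One way to read the second suggestion (biasing since safely before the last seen event) is sketched below. Everything here is an assumption for illustration: `safeSince` is a hypothetical helper, the idea that an event might carry a server-side timestamp is not confirmed by the PR, and the 5-second default margin is arbitrary. Over-fetching is harmless in this design because replayed events are deduplicated by id, but a fixed margin only covers bounded clock skew.

```typescript
// Hypothetical helper; the server timestamp input and the default margin
// are assumptions, not part of the reviewed code.
function safeSince(
  serverEventTimeIso: string | undefined,
  localReceiveTime: Date,
  marginMs: number = 5000,
): string {
  // Prefer a server-derived timestamp when one is available and parseable;
  // otherwise fall back to the local receive time.
  const serverMs =
    serverEventTimeIso !== undefined ? Date.parse(serverEventTimeIso) : NaN;
  const baseMs = Number.isNaN(serverMs) ? localReceiveTime.getTime() : serverMs;

  // Floor to the second (the comparison described above floors `since`),
  // then step back by the margin so rows created in the same second as the
  // last seen event are not skipped by a strict `created_at >` filter.
  const flooredMs = Math.floor(baseMs / 1000) * 1000;
  return new Date(flooredMs - marginMs).toISOString();
}
```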
Summary
The server side already has a complete reconnect-replay protocol: every event pushed to an agent is stamped with a monotonic agent_seq, each agent has a 500-event resync ring, and gaps beyond the ring get a DB-backed replay (engine/src/adapters/node/realtime.ts, engine/src/engine/resyncQuery.ts, mirrored by the cloud AgentDO). No shipping client used it, so every WS disconnect window was silent event loss for SDK users. This wires the TypeScript SDK into that protocol.
Changes
packages/sdk-typescript
- WsClient tracks the highest agent_seq seen across incoming events. The seq is read from the raw frame because the zod event schemas strip unknown keys.
- After each reconnect, once open handlers have re-subscribed, the client sends {type: "resync", last_seen_seq, since}. First connections (no seq yet) send nothing and behave exactly as before.
- Replayed events flow through the normal dispatch path and are deduplicated by stable event id rather than by seq: DB-fallback replays carry no agent_seq, and seq comparison would silently drop live events after a server counter reset.
- The server's resync_ack surfaces as a new resynced lifecycle event (alongside reconnecting / permanently_disconnected), exposed as on.resynced(({ replayed, gapDetected }) => ...) on both RelayCast and AgentClient.
- Adds the package README: install, RelayCast vs AgentClient quickstart, reconnect/resync behavior, self-hosting.
packages/types (missing wire frame schemas only)
- ResyncEventSchema (client → server) added to ClientEventSchema.
- ResyncAckEventSchema (server → client) added to ServerEventSchema.
- WsResyncedEventSchema (client-emitted) added to WsClientEventSchema.
Wire contract (confirmed against Node adapter and cloud AgentDO)
- Client → server: {"type": "resync", "last_seen_seq": <number>, "since": <ISO timestamp>}
- Server → client: ring replay (events carrying agent_seq), then DB-fallback replay when the gap exceeds the ring (events tagged "replayed": true, no agent_seq), then {"type": "resync_ack", "last_seen_seq", "current_seq", "replayed", "gap_detected"}
Tests
New unit tests in ws.test.ts (mock-WS harness) and agent-ws.test.ts:
- agent_seq tracked (including from unrecognized event types) and resync sent with the correct last_seen_seq + ISO since
- resynced emitted with {replayed, gapDetected} from resync_ack; AgentClient.on.resynced end to end
Full workspace npx turbo build test and lint pass (SDK: 19 files / 364 tests, engine: 55, types: 44).
🤖 Generated with Claude Code