Fix: decouple chat from local Ollama, route inference through Rust #369
senamakel merged 8 commits into tinyhumansai:main
Conversation
…se delivery logic
- Eliminated the `deliverLocalResponse` function, which handled segmenting and dispatching local model responses, simplifying the message delivery process.
- Updated socket status checks to remove the unnecessary local model condition, ensuring clarity in connection handling.
- Adjusted comments to reflect the new routing logic for cloud backend interactions, emphasizing that local models are now used only for supplementary features.
- Replaced Socket.IO message sending with core RPC calls for sending and canceling chat messages, enhancing the communication mechanism.
- Improved error handling for socket connection checks, ensuring robust event routing.
- Updated documentation to reflect the new RPC-based approach for chat message processing.
…tions
- Introduced a new `ChatSegmentEvent` interface to support segmented message delivery, allowing for more natural chat interactions.
- Updated the `Conversations` component to handle segmented responses and apply emoji reactions based on user messages.
- Refactored the message delivery logic to improve clarity and maintainability, ensuring that responses are dispatched correctly based on segmentation.
- Enhanced the chat service to emit segment events, facilitating a smoother user experience during conversations.
… and optimizing rendering logic
- Removed unnecessary imports related to local model status and message segmentation, streamlining the codebase.
- Updated the rendering logic to enhance clarity by simplifying conditions for displaying sending and delivering states in the chat interface.
- Improved overall readability and maintainability of the Conversations component.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Caution: Review failed. The pull request is closed.
📝 Walkthrough
This PR removes the client-side local model chat path and migrates message handling through cloud socket RPC calls, introducing server-side response segmentation and emoji reactions delivered via new `chat_segment` events.
Sequence Diagram

```mermaid
sequenceDiagram
participant Client as Client (Frontend)
participant Socket as Socket.IO
participant CoreRPC as Core RPC Layer
participant Web as Web Provider
participant Present as Presentation Module
Client->>Socket: connect (with socket.id)
Socket-->>Client: connected
Client->>CoreRPC: callCoreRpc('openhuman.channel_web_chat', {client_id, thread_id, message})
CoreRPC->>Web: route to web.rs chat handler
Web->>Web: run_chat_task (get full_response)
Web->>Present: deliver_response(client_id, thread_id, full_response, user_message)
Present->>Present: segment_for_delivery (check code/length/structure)
alt Multiple Segments
Present->>Web: emit chat_segment events (with segment_index, reaction_emoji on first)
Web->>Socket: emit 'chat:segment' for each segment
Socket-->>Client: onSegment fires for each segment
Client->>Client: dispatch addInferenceResponse
Present->>Web: emit chat_done (segment_total, full_response)
Web->>Socket: emit 'chat:done'
Socket-->>Client: onDone finalizes state
else Single Segment (≤1)
Present->>Web: emit chat_done (no segment events)
Web->>Socket: emit 'chat:done'
Socket-->>Client: onDone adds response bubble
end
```
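The client-side contract implied by this flow is small: send over core RPC, receive over the socket. Below is a minimal TypeScript sketch of a frontend consuming it, assuming a `socket.io-client` socket; `callCoreRpc` is stubbed, and `sendChat`, `subscribe`, and `addBubble` are illustrative names, not the PR's actual code.

```typescript
import type { Socket } from 'socket.io-client';

/** Segment event shape, per the ChatSegmentEvent interface reviewed below. */
interface ChatSegmentEvent {
  thread_id: string;
  /** Holds the segment text; the name mirrors the Rust wire format. */
  full_response: string;
  request_id: string;
  segment_index: number;
  segment_total: number;
  reaction_emoji?: string | null;
}

// Stub: the real app routes this through its core RPC layer.
const callCoreRpc = async (method: string, params: unknown): Promise<void> => {
  /* HTTP call into the Rust core would go here */
};

// Sending goes over core RPC; the socket only supplies the client_id that
// the backend will emit events back to.
async function sendChat(socket: Socket, threadId: string, message: string) {
  await callCoreRpc('openhuman.channel_web_chat', {
    client_id: socket.id,
    thread_id: threadId,
    message,
  });
}

// Receiving: each chat:segment becomes its own bubble; chat:done finalizes.
function subscribe(
  socket: Socket,
  addBubble: (text: string, emoji?: string | null) => void,
) {
  socket.on('chat:segment', (event: ChatSegmentEvent) => {
    // full_response carries the segment text; the first segment may carry a reaction.
    addBubble(event.full_response, event.reaction_emoji);
  });
  socket.on(
    'chat:done',
    (event: { thread_id: string; full_response: string; segment_total: number }) => {
      // With a single segment, chat:done is the only event received and its
      // full_response becomes the one response bubble.
      if (event.segment_total <= 1) addBubble(event.full_response);
    },
  );
}
```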
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
🧹 Nitpick comments (5)
docs/TODO.md (1)
68-68: Consider making the TODO more specific and actionable.

The TODO "integrate memory to all the various skills, like autocomplete and others" is somewhat vague. For better tracking and actionability, consider specifying:
- What "integrate memory" means concretely (e.g., "use memory embeddings for context-aware autocomplete suggestions")
- Which specific skills should be prioritized
- Who owns this work (like other TODOs that include names)
Example:
```diff
- [] integrate memory to all the various skills, like autocomplete and others.
+ [] integrate memory embeddings into autocomplete suggestions for context-aware completions (owner: TBD)
+ [] extend memory integration to skill execution context (segmentation, reactions)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/TODO.md` at line 68, Replace the vague TODO "integrate memory to all the various skills, like autocomplete and others." with a specific, actionable task: define exactly what "integrate memory" means (e.g., "use vector embeddings + short-term session store for context-aware autocomplete and conversation history"), list prioritized skills to update (e.g., "autocomplete, code completion, search suggestions"), assign an owner and target milestone, and include clear acceptance criteria (examples, performance constraints, and tests) so the task is ready to pick up; use the existing TODO text as the identifier when updating the entry.

src/openhuman/channels/providers/presentation.rs (2)
199-200: Minor: Numbered list detection may have false positives.

The condition `trimmed.chars().next().map_or(false, |c| c.is_ascii_digit()) && trimmed.contains(". ")` would match lines like `"2023. A great year"` or `"100. Some text"` that aren't list items. Consider tightening to match typical list patterns like `1.`, `2.`, etc.

🔧 Stricter numbered list detection
```diff
- || trimmed.chars().next().map_or(false, |c| c.is_ascii_digit())
-     && trimmed.contains(". ")
+ || is_numbered_list_item(trimmed)
```

Add helper:
```rust
fn is_numbered_list_item(line: &str) -> bool {
    let mut chars = line.chars().peekable();
    // Consume leading digits
    let mut has_digit = false;
    while chars.peek().map_or(false, |c| c.is_ascii_digit()) {
        chars.next();
        has_digit = true;
    }
    // Must have at least one digit followed by ". "
    has_digit && chars.next() == Some('.') && chars.next() == Some(' ')
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/channels/providers/presentation.rs` around lines 199 - 200, The current numbered-list test using trimmed.chars().next().map_or(false, |c| c.is_ascii_digit()) && trimmed.contains(". ") is too permissive and can match dates or decimals; introduce a helper like is_numbered_list_item(line: &str) that consumes one or more leading ASCII digits and then requires an exact ". " sequence, and replace the existing condition (where trimmed is checked) with a call to is_numbered_list_item(trimmed) so only classic list prefixes such as "1. " or "12. " are accepted.
33-34: Misleading comment: reaction is awaited sequentially, not in parallel.

The comment says "Fire-and-forget reaction decision (runs in parallel)" but `try_reaction` is awaited directly before segmentation begins, making it sequential. This blocks the first segment delivery until the local model reaction decision completes (or times out).

If parallel execution is desired, use `tokio::spawn` or `futures::join!`. If sequential is intentional (to attach the reaction to the first segment), update the comment.

🔧 Option A: Fix comment to reflect actual behavior

```diff
- // Fire-and-forget reaction decision (runs in parallel).
+ // Get reaction decision from local model (blocks until complete or timeout).
  let reaction_emoji = try_reaction(user_message).await;
```

🔧 Option B: Actually run in parallel with segmentation

```diff
+use futures::future::join;
+
 pub async fn deliver_response(
     client_id: &str,
     thread_id: &str,
     request_id: &str,
     full_response: &str,
     user_message: &str,
 ) {
-    // Fire-and-forget reaction decision (runs in parallel).
-    let reaction_emoji = try_reaction(user_message).await;
-
-    // Attempt segmentation.
-    let segments = segment_for_delivery(full_response);
+    // Compute segments synchronously, reaction async in parallel.
+    let segments = segment_for_delivery(full_response);
+    let (reaction_emoji, _) = join(
+        try_reaction(user_message),
+        std::future::ready(()),
+    ).await;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/channels/providers/presentation.rs` around lines 33 - 34, The comment is wrong: try_reaction(user_message) is awaited sequentially which blocks segmentation; either update the comment to reflect sequential behavior or run the reaction decision in parallel by spawning it and awaiting when needed—e.g., call tokio::spawn(async move { try_reaction(user_message).await }) (store the JoinHandle) before starting segmentation, continue producing segments, then await the handle when you need the reaction (e.g., to attach to the first_segment); reference try_reaction, user_message, segmentation, and first_segment when making the change.

app/src/services/chatService.ts (1)
41-49: Field naming may cause confusion: `full_response` contains segment text.

The `ChatSegmentEvent.full_response` field actually contains the segment text, not the full response. The inline comment clarifies this, but the field name inherited from the Rust `WebChannelEvent` struct could mislead developers. Consider whether a type alias or wrapper interface with clearer naming would help.

💡 Alternative: Add a clearer accessor
```typescript
/** A single segment of a multi-bubble response, emitted before `chat_done`. */
export interface ChatSegmentEvent {
  thread_id: string;
  /** The segment text (named full_response for wire compatibility with WebChannelEvent). */
  full_response: string;
  request_id: string;
  segment_index: number;
  segment_total: number;
  reaction_emoji?: string | null;
}

/** Helper to extract segment content with clearer semantics. */
export const getSegmentText = (event: ChatSegmentEvent): string =>
  event.full_response;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/services/chatService.ts` around lines 41 - 49, The field name ChatSegmentEvent.full_response is misleading because it holds a segment's text, so add a clear accessor (e.g., getSegmentText(event: ChatSegmentEvent): string) that returns event.full_response and use it throughout the codebase, or introduce a thin wrapper/alias interface (e.g., ChatSegmentView with segment_text) that maps to the original ChatSegmentEvent for wire compatibility; ensure the original full_response is kept for compatibility but all internal consumers call the new accessor/wrapper to avoid confusion.

app/src/pages/Conversations.tsx (1)
502-505: Verify socket connection check aligns with RPC-based sending.

The guard checks `socketStatus !== 'connected'` before sending, but `chatSend` now uses core RPC (HTTP) rather than socket emit. The socket is still needed for receiving events and for the `client_id`, but consider whether the error message accurately reflects the failure mode.

💡 Consider clarifying the error context
```diff
 if (socketStatus !== 'connected') {
-  setSendError(chatSendError('socket_disconnected', 'Realtime socket is not connected.'));
+  setSendError(chatSendError('socket_disconnected', 'Realtime connection unavailable — needed for receiving responses.'));
   return;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/pages/Conversations.tsx` around lines 502 - 505, The current guard treats any non-'connected' socketStatus as a send failure, but chatSend now uses core RPC (HTTP) and only requires the socket for receive/client_id-related realtime features; update the check around socketStatus in Conversations (the block that calls setSendError and chatSendError) to validate the actual requirement: if client_id is missing or realtime features are required and socketStatus !== 'connected' then set a clearer error via setSendError(chatSendError(...)) that mentions missing client_id or realtime socket being disconnected; otherwise allow RPC send to proceed and only log/warn that realtime delivery/acknowledgement may be unavailable. Ensure references include socketStatus, client_id, chatSend, setSendError, and chatSendError so the fix is applied in the right code path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 27b328f1-5574-426c-9ace-3391c39cd45b
📒 Files selected for processing (10)
- app/src/components/settings/panels/ScreenIntelligencePanel.tsx
- app/src/pages/Conversations.tsx
- app/src/services/chatService.ts
- docs/TODO.md
- src/core/screen_intelligence_cli.rs
- src/core/socketio.rs
- src/openhuman/channels/providers/mod.rs
- src/openhuman/channels/providers/presentation.rs
- src/openhuman/channels/providers/web.rs
- src/openhuman/screen_intelligence/engine.rs
Summary
Fixes #366 — Chat was blocked while Ollama downloaded; local model was used as the primary chat backend.
- `chatSend()` and `chatCancel()` now call `openhuman.channel_web_chat` / `openhuman.channel_web_cancel` via core RPC instead of emitting socket events directly. Zero inference logic in TypeScript.
- The `presentation.rs` module runs on local Ollama (zero cloud cost). Smart segmentation: only splits prose into human-feeling bubbles; never splits code blocks, structured content (lists/tables), or short messages (see the sketch below). Emits `chat_segment` events with natural delays.
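As a rough illustration of that segmentation gating, here is a TypeScript sketch of the described behavior. The real implementation is `segment_for_delivery` in `presentation.rs`; the threshold, regexes, and function name below are assumptions for illustration, not the PR's code.

```typescript
// Illustrative sketch only: the actual heuristic lives in Rust
// (segment_for_delivery in presentation.rs); the cutoff below is assumed.
function segmentForDelivery(fullResponse: string): string[] {
  const MIN_CHARS_TO_SPLIT = 240; // assumed cutoff for "short messages"

  // Never split code blocks or structured content (lists, tables).
  const fence = '`'.repeat(3); // built dynamically to avoid a literal fence here
  const hasCodeBlock = fullResponse.includes(fence);
  const hasStructure = fullResponse
    .split('\n')
    .some((line) => /^([-*]\s|\d+\.\s|\|)/.test(line.trim()));

  if (hasCodeBlock || hasStructure || fullResponse.length < MIN_CHARS_TO_SPLIT) {
    // A single segment means the backend emits only chat_done, no chat_segment.
    return [fullResponse];
  }

  // Prose: split on blank lines into human-feeling bubbles.
  return fullResponse
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}
```

Each returned segment then maps to one `chat_segment` event, delivered with a natural delay before the final `chat_done`.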
Test plan

- Chat works while Ollama is still downloading (messages route through `channel_web_chat`)
- `chat_segment` events deliver multi-bubble responses when the local model is ready
- `cargo test --lib 'presentation::tests'` — 7 unit tests pass
- `yarn typecheck` and `yarn lint` — clean

🤖 Generated with Claude Code