feat(voice): sync overlay orb with chat voice button state (#487) #490

Merged
graycyrus merged 2 commits into tinyhumansai:main from oxoxDev:feat/487-overlay-stt-sync
Apr 10, 2026

Conversation

@oxoxDev
Contributor

@oxoxDev oxoxDev commented Apr 10, 2026

Summary

  • Add new RPC method openhuman.overlay_stt_notify so the chat voice button can signal the overlay orb
  • Chat "Start Talking" button now emits recording/transcription/cancel/error state transitions through the existing broadcast infrastructure
  • Overlay orb activates (blue, "Listening...") when recording from the chat prompt, not just from the global hotkey
  • Dynamic accessibility labels on the overlay orb reflect current mode

Problem

  • The overlay orb reacts to hotkey-based dictation via DICTATION_BUS/TRANSCRIPTION_BUS → Socket.IO events, but the chat prompt's "Start Talking" button uses browser MediaRecorder with local React state only
  • Since the overlay is a separate Tauri window, it cannot share React state with the main window — the orb stays idle when recording via the chat button
  • Closes [Feature] Sync overlay button active state with voice/STT prompt state #487

Solution

  • New Rust RPC handler (overlay_stt_notify in src/openhuman/voice/schemas.rs): accepts { state, text? } and calls the existing publish_dictation_event() / publish_transcription() functions — reusing 100% of existing broadcast infrastructure
  • Frontend wrapper (notifyOverlaySttState in app/src/utils/tauriCommands/voice.ts): fire-and-forget RPC call at each voice state transition
  • 7 call sites in Conversations.tsx: recording_started, transcription_done, 2× cancelled, 3× error
  • No changes to socketio.rs (existing bridge forwards broadcasts) or OverlayApp.tsx event handlers (already handle dictation:toggle + dictation:transcription)
  • Minor: dynamic aria-label on overlay orb button reflects current mode (stt/attention/idle)
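
For illustration, the frontend wrapper described above might look roughly like the sketch below. The RPC method name and the four state values come from this PR; the `CoreRpcCall` type and the factory `makeNotifyOverlaySttState` are hypothetical stand-ins, since the real `callCoreRpc` signature is not shown here.

```typescript
// Hypothetical shape of the app's RPC helper (assumption: the real callCoreRpc may differ).
type CoreRpcCall = (req: { method: string; params: Record<string, unknown> }) => Promise<unknown>;

// State values taken from the PR description.
type OverlaySttState = 'recording_started' | 'transcription_done' | 'cancelled' | 'error';

// Hypothetical factory so the transport can be swapped out in tests.
const makeNotifyOverlaySttState =
  (callCoreRpc: CoreRpcCall) =>
  async (state: OverlaySttState, text?: string): Promise<void> => {
    try {
      await callCoreRpc({ method: 'openhuman.overlay_stt_notify', params: { state, text } });
    } catch (err: unknown) {
      // Fire-and-forget: a failed overlay update should not break the chat voice flow.
      console.debug('[overlay_stt_notify] fire-and-forget error:', err);
    }
  };
```

A call site would then invoke the notifier without awaiting it, for example `void notify('recording_started')` when recording begins.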

Submission Checklist

  • Unit tests — 2 new broadcast tests in dictation_listener.rs + schema stability test updated in schemas.rs; all 59 voice tests pass
  • E2E / integration — Manual verification required: click "Start Talking" → overlay activates; stop → overlay shows transcript then dismisses; hotkey path still works
  • Doc comments — Handler and wrapper functions documented
  • Inline comments — State values documented in param struct

Impact

  • Desktop only — overlay window and core RPC are desktop-only features; no mobile impact
  • No performance concern — fire-and-forget RPC calls with tokio broadcast (sub-microsecond)
  • No breaking changes — additive only; existing hotkey path unchanged

Related

Summary by CodeRabbit

  • New Features

    • Overlay orb now exposes dynamic, context-aware accessibility labels: "Voice input active", "Attention message", or "OpenHuman overlay" depending on mode.
    • Overlay receives real-time voice transcription state updates (recording started, transcription done, cancelled, error) to improve UI responsiveness.
  • Tests

    • Added unit tests validating voice event and transcription broadcast delivery.

…sai#487)

The overlay orb already reacts to hotkey-based dictation via Socket.IO
events, but the chat "Start Talking" button used local React state only.
Add a new RPC method `openhuman.overlay_stt_notify` that the chat button
calls at each voice state transition, which publishes to the existing
DICTATION_BUS / TRANSCRIPTION_BUS broadcast channels — so the overlay
reflects recording/transcribing/idle from both input paths with zero
changes to the Socket.IO bridge or overlay event handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 064d8d8c-d29d-4219-b4fa-62fecd277913

📥 Commits

Reviewing files that changed from the base of the PR and between 0a52859 and d1edfc9.

📒 Files selected for processing (4)
  • app/src/overlay/OverlayApp.tsx
  • app/src/pages/Conversations.tsx
  • app/src/utils/tauriCommands/voice.ts
  • src/openhuman/voice/schemas.rs
✅ Files skipped from review due to trivial changes (2)
  • app/src/overlay/OverlayApp.tsx
  • app/src/pages/Conversations.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/src/utils/tauriCommands/voice.ts
  • src/openhuman/voice/schemas.rs

📝 Walkthrough

Adds frontend notifications and dynamic aria-labels for overlay STT state, a new Tauri command wrapper notifyOverlaySttState, and a backend voice.overlay_stt_notify RPC handler that publishes dictation/transcription events; also adds unit tests for dictation/transcription broadcast delivery.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Overlay UI: app/src/overlay/OverlayApp.tsx | Changed overlay orb aria-label to reflect mode ('stt' → "Voice input active", 'attention' → "Attention message", otherwise "OpenHuman overlay"). |
| Frontend voice flow: app/src/pages/Conversations.tsx, app/src/utils/tauriCommands/voice.ts | Added notifyOverlaySttState tauri command; integrated calls into recording/transcription lifecycle (recording_started, transcription_done, cancelled, error). |
| Backend voice RPC & wiring: src/openhuman/voice/schemas.rs | Added overlay_stt_notify controller schema and handler handle_overlay_stt_notify that deserializes params and publishes dictation/transcription events (pressed, transcription, released) based on state. |
| Backend tests: src/openhuman/voice/dictation_listener.rs | Added two unit tests verifying that published dictation events and transcription results are received by subscribers. |
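
As a small illustration of the aria-label change summarized above, the mode-to-label mapping could be expressed as below. The three label strings and the mode names (stt/attention/idle) are taken from this PR; the `OverlayMode` type and the helper name `overlayAriaLabel` are hypothetical, and the actual JSX in OverlayApp.tsx may inline the expression instead.

```typescript
// Mode names as listed in the PR description (stt/attention/idle).
type OverlayMode = 'stt' | 'attention' | 'idle';

// Hypothetical helper: returns the accessibility label for the overlay orb button.
const overlayAriaLabel = (mode: OverlayMode): string => {
  switch (mode) {
    case 'stt':
      // Covers both the recording phase and the linger phase after transcription.
      return 'Voice input active';
    case 'attention':
      return 'Attention message';
    default:
      return 'OpenHuman overlay';
  }
};
```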

Sequence Diagram

sequenceDiagram
    actor User
    participant UI as Frontend\n(Conversations / Overlay)
    participant Tauri as Tauri Bridge
    participant Backend as voice.overlay_stt_notify
    participant EventBus as dictation_listener

    User->>UI: trigger voice record / stop
    UI->>Tauri: notifyOverlaySttState('recording_started')
    Tauri->>Backend: openhuman.overlay_stt_notify {state:"recording_started"}
    Backend->>EventBus: publish_dictation_event("pressed")
    EventBus->>UI: delivery -> overlay updates aria-label/state

    UI->>Tauri: notifyOverlaySttState('transcription_done', text)
    Tauri->>Backend: openhuman.overlay_stt_notify {state:"transcription_done", text}
    Backend->>EventBus: publish_transcription(text)
    Backend->>EventBus: publish_dictation_event("released")
    EventBus->>UI: delivery -> overlay resets aria-label/state
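
The state-to-event fan-out in the diagram can be modelled as a pure function. This is only a TypeScript sketch of the logic; the real handler is written in Rust and calls publish_dictation_event() / publish_transcription(). The `BusEvent` type and `eventsForState` name are hypothetical, and mapping `cancelled` and `error` to a single "released" event is an assumption, since the diagram only shows the first two states.

```typescript
type OverlaySttState = 'recording_started' | 'transcription_done' | 'cancelled' | 'error';

// Hypothetical description of what gets published on each broadcast bus.
type BusEvent =
  | { bus: 'dictation'; kind: 'pressed' | 'released' }
  | { bus: 'transcription'; text: string };

const eventsForState = (state: OverlaySttState, text?: string): BusEvent[] => {
  switch (state) {
    case 'recording_started':
      return [{ bus: 'dictation', kind: 'pressed' }];
    case 'transcription_done':
      if (text === undefined) {
        // The follow-up commit makes a missing transcript an error rather than a silent no-op.
        throw new Error('text is required for transcription_done');
      }
      // Transcript first, then release, matching the order in the diagram.
      return [
        { bus: 'transcription', text },
        { bus: 'dictation', kind: 'released' },
      ];
    case 'cancelled':
    case 'error':
      // Assumption: both terminal states simply return the orb to idle.
      return [{ bus: 'dictation', kind: 'released' }];
  }
};
```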

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • senamakel

Poem

🐇 I tapped my ear and gave a cheer,

Orb and voice now hop so near.
Pressed and spoken, then released with grace,
The overlay smiles and finds its place.
Hooray—sync'd state across the space!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main objective: syncing the overlay orb's state with the chat voice button. |
| Linked Issues check | ✅ Passed | All acceptance criteria from issue #487 are met: state sync, lifecycle coverage, no stale active UI, A11y alignment, and regression safety via tests. |
| Out of Scope Changes check | ✅ Passed | All changes directly support state synchronization between overlay and STT voice capture with no unrelated modifications. |


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/overlay/OverlayApp.tsx`:
- Line 408: The aria-label currently uses a recording-specific string when mode
=== 'stt' which is inaccurate for final/linger states; update the aria-label
expression in OverlayApp (the JSX element that sets aria-label based on mode) to
use a neutral STT-active label (e.g., "Speech recognition active" or "STT
active") instead of "Recording speech", while keeping the existing labels for
'attention' and default cases.

In `@app/src/utils/tauriCommands/voice.ts`:
- Around line 178-185: The exported helper notifyOverlaySttState is using a
function declaration with a promise .catch chain; convert it to an arrow async
function that awaits callCoreRpc and handles errors with try/catch to comply
with the repo TypeScript style. Specifically, replace the function declaration
notifyOverlaySttState(...) with a const notifyOverlaySttState = async (...) :
Promise<void> => { try { await callCoreRpc({ method:
'openhuman.overlay_stt_notify', params: { state, text } }) } catch (err:
unknown) { console.debug('[overlay_stt_notify] fire-and-forget error:', err) } }
so the implementation uses async/await and try/catch while still calling
callCoreRpc and preserving the same types and exported name.

In `@src/openhuman/voice/schemas.rs`:
- Around line 47-54: The OverlaySttNotifyParams struct accepts arbitrary or
incomplete states; update the handler that deserializes OverlaySttNotifyParams
to validate the state string against the expected set ("recording_started",
"transcription_done", "cancelled", "error") and return a non-OK error (e.g.,
HTTP 400) for unknown values, and additionally enforce that when state ==
"transcription_done" the text Option is Some(non-empty) otherwise reject the
request; implement these checks immediately after deserialization (in the
function handling the overlay STT notify request) and include clear error
messages in the response so partial/invalid payloads are not treated as success.
- Around line 200-213: The formatting for the ControllerSchema entry named
"overlay_stt_notify" is failing rustfmt; open the block that constructs
ControllerSchema for namespace "voice" / function "overlay_stt_notify" (the call
site using required_string, optional_string, and json_output) and run rustfmt
(cargo fmt --all) or manually reformat the tuple/struct literal to match rustfmt
conventions (align commas, trailing commas, indentation for multi-line string
and vec![] elements) so the file src/openhuman/voice/schemas.rs passes cargo fmt
--all -- --check.
- Around line 409-413: The debug log in overlay_stt_notify currently prints raw
transcript text via p.text; change it to avoid logging sensitive contents and
instead log presence and length metadata (e.g., whether p.text.is_some() and
p.text.as_ref().map(|s| s.len())) along with the state (p.state) so the log
shows something like state, text_present=true/false, text_length=N rather than
the transcript itself; update the log call in overlay_stt_notify to use those
computed metadata values from p.text and remove the raw p.text output.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ee41ba3f-1bb4-4802-b5b3-5da4f1097cd0

📥 Commits

Reviewing files that changed from the base of the PR and between aee9c52 and 0a52859.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • app/src/overlay/OverlayApp.tsx
  • app/src/pages/Conversations.tsx
  • app/src/utils/tauriCommands/voice.ts
  • src/openhuman/voice/dictation_listener.rs
  • src/openhuman/voice/schemas.rs

Comment thread app/src/overlay/OverlayApp.tsx Outdated
Comment thread app/src/utils/tauriCommands/voice.ts Outdated
Comment thread src/openhuman/voice/schemas.rs
Comment thread src/openhuman/voice/schemas.rs
Comment thread src/openhuman/voice/schemas.rs
- Run cargo fmt and prettier to fix formatting violations
- Use typed enum OverlaySttState instead of raw String for state param
  (serde rejects invalid states at deserialization, eliminating the
  unknown state branch)
- Require `text` field for transcription_done state (return error if
  missing instead of silently ignoring)
- Replace raw transcript logging with metadata-only (has_text, text_len)
  to avoid logging sensitive user speech content
- Use "Voice input active" aria-label (covers recording + linger phases)
- Convert notifyOverlaySttState to arrow function with async/await per
  repo TS conventions
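
The validation and logging changes listed in this commit can be sketched as follows. This is a TypeScript model of behaviour that the PR implements in Rust with a serde-deserialized enum; `parseOverlaySttNotify`, `ParseResult`, and `logMetadata` are hypothetical names, and only the state values and the `has_text` / `text_len` fields come from the commit message.

```typescript
type OverlaySttState = 'recording_started' | 'transcription_done' | 'cancelled' | 'error';

interface OverlaySttNotifyParams {
  state: OverlaySttState;
  text?: string;
}

type ParseResult = { ok: true; params: OverlaySttNotifyParams } | { ok: false; error: string };

const STATES: readonly string[] = ['recording_started', 'transcription_done', 'cancelled', 'error'];

const isOverlaySttState = (v: unknown): v is OverlaySttState =>
  typeof v === 'string' && STATES.includes(v);

// Rejects unknown states up front, mirroring what a typed enum gives at deserialization.
const parseOverlaySttNotify = (raw: unknown): ParseResult => {
  if (typeof raw !== 'object' || raw === null) {
    return { ok: false, error: 'params must be an object' };
  }
  const { state, text } = raw as { state?: unknown; text?: unknown };
  if (!isOverlaySttState(state)) {
    return { ok: false, error: `unknown state: ${String(state)}` };
  }
  if (text !== undefined && typeof text !== 'string') {
    return { ok: false, error: 'text must be a string' };
  }
  if (state === 'transcription_done' && text === undefined) {
    return { ok: false, error: 'text is required for transcription_done' };
  }
  return { ok: true, params: text === undefined ? { state } : { state, text } };
};

// Logs only metadata about the transcript, never the transcript itself.
const logMetadata = (p: OverlaySttNotifyParams): string =>
  `state=${p.state} has_text=${p.text !== undefined} text_len=${p.text?.length ?? 0}`;
```

Rejecting bad payloads at the parsing boundary means the handler body never needs an "unknown state" branch, which is the point the commit message makes about the typed enum.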

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@graycyrus graycyrus merged commit 0cd0f7a into tinyhumansai:main Apr 10, 2026
9 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature] Sync overlay button active state with voice/STT prompt state

2 participants