fix(voice): recover buffered hotkey events after select! race (#527) by oxoxDev · Pull Request #545 · tinyhumansai/openhuman

oxoxDev · 2026-04-13T20:37:14Z

Summary

Fix tokio::select! race condition that caused voice transcripts to be silently dropped on second+ recordings
Add try_recv() check for buffered hotkey events when recording setup completes, recovering stop intents that lost the race
Add pipeline-level diagnostic logging with UUID correlation IDs across all processing stages
Make publish_transcription return receiver count and log warnings when no subscribers are connected

Problem

After the first successful voice recording, subsequent recordings would silently fail to produce transcriptions. The root cause is a race condition in the hotkey event loop's tokio::select!:

On warm CPAL init (second+ recording), audio_capture::start_recording() completes fast enough that both pending_ready and hotkey_rx branches are simultaneously ready. select! picks pseudo-randomly — when it picks pending_ready, the Released event sits unprocessed in hotkey_rx. The recording is stored as "live" (no deferred stop), then the next loop iteration immediately processes the buffered Released, stopping the recording after near-zero duration. The duration gate silently drops the clip with only a warn! log.

Additionally, multiple pipeline stages (duration gate, silence gate, hallucination filter, empty transcription) all silently discard recordings without structured logging, making diagnosis difficult.

Solution

Core fix (server.rs:295-344): After pending_ready resolves successfully with pending_stop == false, call hotkey_rx.try_recv() to check for a buffered stop event that lost the select! race. If a Released (or Pressed toggle-stop) is found, apply the existing deferred stop mechanism (MIN_RECORDING_AFTER_SETUP = 1500ms) instead of treating the recording as live.

Diagnostic logging (server.rs): Added pipeline_id (UUID v4 prefix) to spawn_process_recording and threaded it through process_recording_bg. Every pipeline stage now logs with [pipeline=<id>] stage=<name> format for end-to-end correlation. Drop points explicitly log DROPPED with the gate that triggered.

Receiver visibility (dictation_listener.rs): publish_transcription now returns usize (receiver count) and logs a warning when the broadcast has zero subscribers, making "delivered but nobody listening" scenarios visible.

Submission Checklist

Unit tests — existing process_recording_bg tests updated for new pipeline_id parameter; publish_transcription tests still pass with new return type
E2E / integration — voice dictation tested manually on macOS debug build; automated E2E for hotkey flows not yet available
Doc comments — publish_transcription documented with return value semantics
Inline comments — select! race recovery logic documented with explanation of the race condition

Impact

Desktop only — voice dictation is desktop-only (cpal + rdev)
No breaking API changes — publish_transcription return type changed from () to usize but all call sites either ignore or now use the value
Performance — negligible; one try_recv() (non-blocking) per recording setup completion
Known limitation — an alternating delivery pattern (transcription arrives every other recording) was observed during testing and appears to be a pre-existing issue unrelated to this fix

Summary by CodeRabbit

New Features
- Added hotkey buffering during recording setup for improved stop signal handling.
Improvements
- Transcription delivery now tracks and reports successful message routing count.
- Enhanced logging with pipeline tracking and detailed stage information for better diagnostics.

publish_transcription now returns the number of active TRANSCRIPTION_BUS subscribers that received the message. When no receivers are connected (e.g. Socket.IO bridge not yet subscribed), the function logs a warning instead of silently discarding the broadcast. This makes it possible to diagnose "transcription produced but never delivered" scenarios from logs alone. Closes tinyhumansai#527 (partial) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

On warm CPAL init (second+ recording), audio_capture::start_recording() completes fast enough that the pending_ready branch and the hotkey_rx branch are both ready simultaneously inside tokio::select!. select! picks one pseudo-randomly — when it picks pending_ready, the Released event sits unprocessed in hotkey_rx. The recording is stored as "live" with no deferred stop, then the next loop iteration processes the buffered Released and stops the recording almost immediately, producing a near-zero-length clip that the duration gate silently drops. Fix: after pending_ready resolves, call hotkey_rx.try_recv() to check for a buffered stop event that lost the race. If found, apply the deferred stop mechanism (MIN_RECORDING_AFTER_SETUP = 1500ms) instead of treating the recording as live. Also adds pipeline_id (UUID prefix) to all process_recording_bg log lines for end-to-end correlation, and labels each pipeline stage (stop_recording, gate_duration, gate_silence, transcribe, deliver) so dropped recordings are diagnosable from logs. Closes tinyhumansai#527 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-04-13T20:37:57Z

📝 Walkthrough

Walkthrough

The changes modify the transcription publishing mechanism to return receiver count alongside added structured pipeline logging, identifier tracking, and buffered hotkey recovery during recording setup completion.

Changes

Cohort / File(s)	Summary
Transcription Publishing `src/openhuman/voice/dictation_listener.rs`	Modified `publish_transcription` to return `usize` (receiver count) instead of `()`. Added receiver count check, logging with delivery status, and error handling for failed sends.
Recording Pipeline Infrastructure `src/openhuman/voice/server.rs`	Introduced per-recording `pipeline_id` parameter (UUID-based tracking). Added structured stage-based logging across recording setup, processing, and completion. Implemented buffered hotkey recovery during setup completion. Updated `process_recording_bg` signature to accept `pipeline_id` and capture transcription receiver count. Updated all test invocations accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

fix(voice): resolve dictation pipeline in embedded Tauri app #466: Directly impacts the new publish_transcription return value introduced here; server-side caller captures the receiver count from this signature change.
Fix stale voice server state between dictation sessions #517: Modifies process_recording_bg signature and recording-stage logging patterns that overlap with this PR's instrumentation enhancements.

Suggested reviewers

senamakel

Poem

🐰 A pipeline traced with glowing ID,
Hotkeys buffered, recovery set free,
Transcription counts now returned with care,
Structured stages logged with flair! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main fix: recovering buffered hotkey events after a select! race condition, which is the core issue addressed in this PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/openhuman/voice/server.rs (1)

646-656: ⚠️ Potential issue | 🟠 Major

Redact transcription text from pipeline logs.

These logs now emit recognized text on both the normal and drop paths. Even truncated snippets can contain user secrets or PII, and this code runs for every recording. Please log metadata only here.

Suggested change

                     info!(
-                        "{LOG_PREFIX} [pipeline={pipeline_id}] stage=transcription_result text='{}' chars={} elapsed_ms={}",
-                        truncate_for_log(text, 80),
+                        "{LOG_PREFIX} [pipeline={pipeline_id}] stage=transcription_result chars={} elapsed_ms={}",
                         text.len(),
                         transcribe_elapsed.as_millis()
                     );
@@
                         warn!(
-                            "{LOG_PREFIX} [pipeline={pipeline_id}] stage=gate_hallucination DROPPED text='{}'",
-                            truncate_for_log(text, 60)
+                            "{LOG_PREFIX} [pipeline={pipeline_id}] stage=gate_hallucination DROPPED chars={}",
+                            text.len()
                         );

As per coding guidelines "Never log secrets, raw JWTs, API keys, or full PII in debug logs; redact or omit sensitive fields".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/voice/server.rs` around lines 646 - 656, The logs currently
emit recognized transcription text (via truncate_for_log(text, ...)) in both the
normal transcription_result and the gate_hallucination warn paths; remove/raw
text from these logs and replace with metadata-only fields (e.g., pipeline_id,
stage, chars=text.len(), elapsed_ms=..., and a redaction marker like
"<REDACTED>"). Update the LOG_PREFIX/transcription_result logging call and the
warn! in the is_hallucinated_output branch so they do not include or interpolate
the transcription string (use truncate_for_log only for debugging counters if
absolutely required — better: omit it) and keep use of is_hallucinated_output,
pipeline_id, text.len(), and transcribe_elapsed to provide non-sensitive
context.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/voice/server.rs`:
- Around line 301-326: The buffered hotkey read via hotkey_rx.try_recv() is
currently consumed but not rebroadcast, so
dictation_listener::publish_dictation_event(...) is never called for buffered
HotkeyEvent::Released/Pressed; before setting pending_stop = true in the
try_recv() match arm, call the same publish path used in the main
hotkey_rx.recv() branch (i.e., forward the buffered HotkeyEvent to
dictation_listener::publish_dictation_event or the helper that publishes
dictation:toggle events), preserving the event variant (HotkeyEvent::Released or
HotkeyEvent::Pressed) so subscribers see the recovered stop/toggle, then mark
pending_stop.

---

Outside diff comments:
In `@src/openhuman/voice/server.rs`:
- Around line 646-656: The logs currently emit recognized transcription text
(via truncate_for_log(text, ...)) in both the normal transcription_result and
the gate_hallucination warn paths; remove/raw text from these logs and replace
with metadata-only fields (e.g., pipeline_id, stage, chars=text.len(),
elapsed_ms=..., and a redaction marker like "<REDACTED>"). Update the
LOG_PREFIX/transcription_result logging call and the warn! in the
is_hallucinated_output branch so they do not include or interpolate the
transcription string (use truncate_for_log only for debugging counters if
absolutely required — better: omit it) and keep use of is_hallucinated_output,
pipeline_id, text.len(), and transcribe_elapsed to provide non-sensitive
context.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a1e611f2-ec73-4d4a-bf20-eaf3b13ddf5a

📥 Commits

Reviewing files that changed from the base of the PR and between 57a8bed and 0e3aec5.

⛔ Files ignored due to path filters (2)

Cargo.lock is excluded by !**/*.lock
app/src-tauri/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

src/openhuman/voice/dictation_listener.rs
src/openhuman/voice/server.rs

coderabbitai · 2026-04-13T20:45:47Z

+                            // Check for a buffered stop event that lost the
+                            // select! race against pending_ready. On warm CPAL
+                            // init both branches may be ready simultaneously;
+                            // select! picks one pseudo-randomly, so a Released
+                            // event can sit unprocessed in hotkey_rx.
+                            let had_pending_stop = pending_stop;
+                            if !pending_stop {
+                                if let Ok(buffered) = hotkey_rx.try_recv() {
+                                    match buffered {
+                                        HotkeyEvent::Released => {
+                                            info!(
+                                                "{LOG_PREFIX} recording handle ready — found buffered Released in hotkey_rx (select! race recovered)"
+                                            );
+                                            pending_stop = true;
+                                        }
+                                        HotkeyEvent::Pressed => {
+                                            // A second Pressed while pending means
+                                            // user wants to stop (tap-style). Treat
+                                            // the same as a stop intent.
+                                            info!(
+                                                "{LOG_PREFIX} recording handle ready — found buffered Pressed in hotkey_rx (treating as stop intent)"
+                                            );
+                                            pending_stop = true;
+                                        }
+                                    }
+                                }


⚠️ Potential issue | 🟠 Major

Rebroadcast the recovered hotkey event before consuming it.

hotkey_rx.try_recv() removes the buffered Released/Pressed, but only the main hotkey_rx.recv() branch publishes dictation_listener::publish_dictation_event(...). In this raced path, dictation:toggle subscribers never see the recovered stop/toggle event, so the frontend can stay out of sync with the actual recording state. Please route buffered through the same publish path before turning it into pending_stop.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/openhuman/voice/server.rs` around lines 301 - 326, The buffered hotkey read via hotkey_rx.try_recv() is currently consumed but not rebroadcast, so dictation_listener::publish_dictation_event(...) is never called for buffered HotkeyEvent::Released/Pressed; before setting pending_stop = true in the try_recv() match arm, call the same publish path used in the main hotkey_rx.recv() branch (i.e., forward the buffered HotkeyEvent to dictation_listener::publish_dictation_event or the helper that publishes dictation:toggle events), preserving the event variant (HotkeyEvent::Released or HotkeyEvent::Pressed) so subscribers see the recovered stop/toggle, then mark pending_stop.

oxoxDev and others added 2 commits April 14, 2026 02:01

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

senamakel approved these changes Apr 13, 2026

View reviewed changes

senamakel merged commit 08d9fd2 into tinyhumansai:main Apr 13, 2026
8 of 9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(voice): recover buffered hotkey events after select! race (#527)#545

fix(voice): recover buffered hotkey events after select! race (#527)#545
senamakel merged 2 commits intotinyhumansai:mainfrom
oxoxDev:fix/527-voice-dictation-drops-transcripts

oxoxDev commented Apr 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 13, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

oxoxDev commented Apr 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Submission Checklist

Impact

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

oxoxDev commented Apr 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 13, 2026 •

edited

Loading