feat: screen intelligence pipeline + CLI + keep_screenshots config by senamakel · Pull Request #339 · tinyhumansai/openhuman

senamakel · 2026-04-06T05:08:15Z

Summary

Screen intelligence capture → vision pipeline: Screenshots are saved to {workspace_dir}/screenshots/, processed by a local Ollama vision model, then deleted (or kept, depending on config)
keep_screenshots config flag (default: false): When enabled, screenshots persist on disk after vision processing instead of being deleted
openhuman screen-intelligence CLI: Standalone CLI for testing the capture + vision pipeline without the full desktop app (mirrors openhuman skills pattern)

CLI subcommands

Command	Description
`screen-intelligence run`	Start a lightweight JSON-RPC server
`screen-intelligence status`	Print engine status as JSON
`screen-intelligence capture [--keep]`	Take a single screenshot with diagnostics
`screen-intelligence start [--ttl <secs>]`	Start a capture + vision session
`screen-intelligence stop`	Stop the active session
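
A minimal sketch of how the subcommands in the table above might be parsed, assuming a hand-rolled argument parser. The `Subcommand` enum and `parse_subcommand` function are hypothetical names for illustration, not the PR's actual code. The 300-second default and 3600-second maximum for `--ttl` come from the `start` help text quoted in the review thread; clamping to the maximum (rather than rejecting larger values) is an assumption.

```rust
/// Hypothetical model of the five `screen-intelligence` subcommands.
#[derive(Debug, PartialEq)]
enum Subcommand {
    Run,
    Status,
    Capture { keep: bool },
    Start { ttl_secs: u64 },
    Stop,
}

const DEFAULT_TTL_SECS: u64 = 300; // from the `start` help text
const MAX_TTL_SECS: u64 = 3600; // from the `start` help text

/// Parse the arguments that follow `openhuman screen-intelligence`.
fn parse_subcommand(args: &[&str]) -> Result<Subcommand, String> {
    let (name, rest) = args
        .split_first()
        .ok_or_else(|| "missing subcommand".to_string())?;
    match *name {
        "run" => Ok(Subcommand::Run),
        "status" => Ok(Subcommand::Status),
        "stop" => Ok(Subcommand::Stop),
        "capture" => Ok(Subcommand::Capture {
            keep: rest.contains(&"--keep"),
        }),
        "start" => {
            let mut ttl_secs = DEFAULT_TTL_SECS;
            let mut iter = rest.iter();
            while let Some(arg) = iter.next() {
                if *arg == "--ttl" {
                    let value = iter.next().ok_or("--ttl requires a value")?;
                    ttl_secs = value
                        .parse::<u64>()
                        .map_err(|e| format!("invalid --ttl value {value:?}: {e}"))?;
                }
            }
            // Assumption: values above the maximum are clamped, not rejected.
            Ok(Subcommand::Start {
                ttl_secs: ttl_secs.min(MAX_TTL_SECS),
            })
        }
        other => Err(format!("unknown subcommand: {other}")),
    }
}
```

For example, `parse_subcommand(&["capture", "--keep"])` yields `Capture { keep: true }`, and `start` with no flags falls back to the 300-second default.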

Config flag

[screen_intelligence]
keep_screenshots = false  # default: delete after vision processing

Files changed

Rust config: accessibility.rs, ops.rs, schemas.rs — added keep_screenshots field
Rust engine: engine.rs — save_screenshot_to_disk() + vision worker integration
Rust CLI: new screen_intelligence_cli.rs, wired into cli.rs + mod.rs
Frontend: tauriCommands.ts, ScreenIntelligencePanel.tsx — UI toggle + type updates
Tests: all 5 test fixtures updated with keep_screenshots: false

Test plan

cargo check passes
tsc --noEmit passes
openhuman screen-intelligence --help prints usage
openhuman screen-intelligence status prints JSON with permissions, session, config
openhuman screen-intelligence capture --keep saves screenshot to ~/.openhuman/workspace/screenshots/
openhuman screen-intelligence start --ttl 60 -v runs capture + vision loop
Settings panel toggle for "Keep Screenshots" persists to config
Screenshots are deleted after processing when keep_screenshots = false

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added "Keep Screenshots" toggle in Screen Intelligence settings to control whether screenshots are retained after processing
- Added screen intelligence CLI commands for engine management (run, status, capture, start, stop)
Tests
- Updated test fixtures to reflect new keep_screenshots configuration field

- Introduced a new `keep_screenshots` option in the Screen Intelligence settings, allowing users to save captured screenshots to the workspace instead of deleting them after processing.
- Updated relevant components and tests to reflect this new feature, ensuring consistent behavior across the application.
- Enhanced the user interface to include a checkbox for the `keep_screenshots` setting, improving user experience and configurability.

…igence

Adds a dedicated CLI (mirroring `openhuman skills`) to run the screen intelligence capture + vision pipeline without the full desktop app. Subcommands: run, status, capture, start, stop. Makes save_screenshot_to_disk public so the CLI capture --keep flag works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-06T05:08:28Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f03a8025-e22b-49af-8b03-cc650f2e234d

📥 Commits

Reviewing files that changed from the base of the PR and between 4472048 and 2dada67.

📒 Files selected for processing (1)

app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx

📝 Walkthrough

This pull request introduces a new keep_screenshots boolean configuration option throughout the screen intelligence system. Changes include frontend UI controls, TypeScript type definitions, Rust configuration schema and settings operations, a new screen-intelligence CLI subcommand, and engine logic to conditionally save and delete screenshots based on the configuration setting.

Changes

Cohort / File(s)	Summary
Frontend Component & Test Fixtures `app/src/components/settings/panels/ScreenIntelligencePanel.tsx`, `app/src/components/intelligence/__tests__/ScreenIntelligenceDebugPanel.test.tsx`, `app/src/components/settings/panels/__tests__/AccessibilityPanel.test.tsx`, `app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx`	Added new `keepScreenshots` UI state to the Screen Intelligence Policy panel with a checkbox control, initialization from config, and payload inclusion on save. Updated test fixtures with `keep_screenshots: false` field.
Type & Interface Definitions `app/src/utils/tauriCommands.ts`	Extended `AccessibilityConfig` with required `keep_screenshots: boolean` field and `ScreenIntelligenceSettingsUpdate` with optional `keep_screenshots?: boolean | null` field for partial updates.
Configuration Schema `src/openhuman/config/schema/accessibility.rs`	Added new boolean field `keep_screenshots` to `ScreenIntelligenceConfig` with `#[serde(default)]` and explicit `false` default, controlling whether screenshots are retained post-vision processing.
Configuration Operations `src/openhuman/config/ops.rs`, `src/openhuman/config/schemas.rs`	Extended `ScreenIntelligenceSettingsPatch` with optional `keep_screenshots: Option<bool>` field and updated `apply_screen_intelligence_settings` to apply the patch conditionally. Added RPC schema integration for settings updates.
Service Tests `app/src/services/__tests__/coreRpcClient.test.ts`, `app/src/store/__tests__/accessibilitySlice.test.ts`	Updated `sampleAccessibilityStatus()` test fixtures to include `keep_screenshots: false` field.
Screen Intelligence CLI `src/core/cli.rs`, `src/core/mod.rs`, `src/core/screen_intelligence_cli.rs`	Added new `screen-intelligence` CLI subcommand dispatcher and standalone module implementing `run`, `status`, `capture`, `start`, `stop` subcommands with local HTTP server (Axum/Tokio), JSON-RPC endpoint, and session/capture management.
Engine Screenshot Management `src/openhuman/screen_intelligence/engine.rs`	Added `save_screenshot_to_disk()` function to persist base64-encoded captures, updated `run_vision_worker` to save frames before vision analysis and conditionally delete saved screenshots when `keep_screenshots` is `false`.
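
The partial-update behavior described in the Configuration Operations row can be sketched as follows. This is a simplified illustration, not the PR's code: the struct names match those in the table, but the field set is reduced to two booleans, the `enabled` default is an assumption, and `apply_patch` is a hypothetical stand-in for `apply_screen_intelligence_settings`.

```rust
/// Simplified stand-in for the persisted config section.
/// `keep_screenshots` defaults to `false`, as stated in the PR;
/// the `enabled` default here is an assumption.
#[derive(Debug, Clone, Default, PartialEq)]
struct ScreenIntelligenceConfig {
    enabled: bool,
    keep_screenshots: bool,
}

/// Partial update: `None` means "leave the current value unchanged".
#[derive(Debug, Default)]
struct ScreenIntelligenceSettingsPatch {
    enabled: Option<bool>,
    keep_screenshots: Option<bool>,
}

/// Hypothetical helper: apply only the fields that the patch actually sets.
fn apply_patch(config: &mut ScreenIntelligenceConfig, patch: &ScreenIntelligenceSettingsPatch) {
    if let Some(enabled) = patch.enabled {
        config.enabled = enabled;
    }
    if let Some(keep) = patch.keep_screenshots {
        config.keep_screenshots = keep;
    }
}
```

Modelling the patch field as `Option<bool>` lets a client update one setting without having to resend, or accidentally reset, the others.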

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant UI as ScreenIntelligencePanel
    participant Config as Config System
    participant Engine as Screen Intelligence Engine
    participant Disk as Disk Storage

    User->>UI: Toggle keep_screenshots checkbox
    UI->>Config: Save setting via openhumanUpdateScreenIntelligenceSettings
    Config->>Config: Apply patch to config.screen_intelligence.keep_screenshots
    
    rect rgba(200, 150, 255, 0.5)
    Note over Engine,Disk: On each frame capture
    Engine->>Disk: save_screenshot_to_disk(frame)
    Disk-->>Engine: Return file path
    Engine->>Engine: analyze_frame_with_vision(frame)
    
    alt keep_screenshots == true
        Note over Engine,Disk: Retain screenshot
        Engine->>Disk: Screenshot persisted
    else keep_screenshots == false
        Engine->>Disk: Delete saved screenshot file
        Disk-->>Engine: Deletion complete
    end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Fix local reset flow and Accessibility permission handling #324 — Both modify src/openhuman/screen_intelligence/engine.rs to alter screen-intelligence behavior and screenshot handling.
fix: stabilize screen intelligence screenshot capture and settings flow #275 — Both modify screen intelligence capture and settings code, particularly engine-level screenshot capture and settings propagation.
refactor: extract accessibility middleware module #184 — Both modify the screen-intelligence/accessibility capture flow including engine-level screenshot save/delete and configuration field management.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 41.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely summarizes the three main features added: screen intelligence pipeline, CLI subcommands, and keep_screenshots config.

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/openhuman/screen_intelligence/engine.rs (1)

906-929: ⚠️ Potential issue | 🟠 Major

Potential screenshot leak when session disappears mid-loop.

Line 909 can write a screenshot before Line 924 checks session existence. If session is already gone, the break at Line 928 skips the deletion branch (Line 935+), so files may remain even when keep_screenshots=false.

Suggested fix (cleanup before early break)

             {
                 let mut state = self.inner.lock().await;
                 if let Some(session) = state.session.as_mut() {
                     session.vision_state = "processing".to_string();
                 } else {
+                    if !keep_screenshots {
+                        if let Some(path) = saved_path.as_ref() {
+                            let _ = std::fs::remove_file(path);
+                        }
+                    }
                     tracing::debug!("[screen_intelligence] vision worker: no session, exiting");
                     break;
                 }
             }

Also applies to: 934-944
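
An alternative to repeating the cleanup before every early exit is to tie the file's lifetime to a guard value. The sketch below is one possible shape for that idea, not something proposed in the review itself; `ScreenshotGuard` is a hypothetical type, and it uses only the standard library.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical RAII guard: deletes the screenshot when dropped unless `keep` is set.
struct ScreenshotGuard {
    path: PathBuf,
    keep: bool,
}

impl ScreenshotGuard {
    fn new(path: impl Into<PathBuf>, keep: bool) -> Self {
        Self {
            path: path.into(),
            keep,
        }
    }

    /// Path of the saved screenshot, e.g. to pass to the vision step.
    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScreenshotGuard {
    fn drop(&mut self) {
        if !self.keep {
            // Best-effort cleanup: a failed delete is ignored rather than propagated.
            let _ = fs::remove_file(&self.path);
        }
    }
}
```

With a guard created right after the save, any exit from the loop body, including the `break` taken when the session is gone, drops the guard and removes the file. Note that `Drop` cannot await, so this version performs a blocking `std::fs::remove_file` call.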

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/screen_intelligence/engine.rs` around lines 906 - 929, A
screenshot can be saved via save_screenshot_to_disk before checking whether the
session exists (self.inner.lock().await -> session), so if session is gone we
break out and skip the later deletion branch causing a leak; modify the loop to,
immediately after saving (i.e. where saved_path is set), check whether
state.session is Some and if not then: if saved_path.is_some() and
keep_screenshots is false delete the saved file (using workspace_dir and the
saved_path) before breaking, and make the same change for the later mirror block
(the block around session.vision_state update and subsequent deletion logic) so
any saved screenshot is cleaned up when the session disappears.

🧹 Nitpick comments (1)

app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx (1)
188-194: Assert keep_screenshots in the save payload.

Line 188 currently doesn’t verify the new flag, so this test can still pass if keep_screenshots is accidentally dropped during save.
Suggested test assertion update
     expect(openhumanUpdateScreenIntelligenceSettings).toHaveBeenCalledWith({
       enabled: true,
       policy_mode: 'all_except_blacklist',
       baseline_fps: 1,
+      keep_screenshots: false,
       allowlist: ['Code'],
       denylist: ['1Password'],
     });
Based on learnings: "Applies to app/src/**/*.test.{ts,tsx} : Prefer testing behavior over implementation details; use existing helpers from app/src/test/ (test-utils.tsx, shared mock backend) before adding new harness code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx`
around lines 188 - 194, Update the test assertion to include the new
keep_screenshots flag in the payload so the save action is validated for that
field; specifically modify the expect for
openhumanUpdateScreenIntelligenceSettings to include keep_screenshots: true (or
the appropriate boolean) alongside enabled, policy_mode, baseline_fps,
allowlist, and denylist in the object so the test fails if keep_screenshots is
omitted or changed.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/settings/panels/ScreenIntelligencePanel.tsx`:
- Around line 285-289: The new JSX block in the ScreenIntelligencePanel
component (the <label> block inside ScreenIntelligencePanel.tsx) is not
formatted to project Prettier rules; run Prettier over
app/src/components/settings/panels/ScreenIntelligencePanel.tsx (or the whole
app/src/**/*.{ts,tsx}) to fix formatting, then re-run lint/type checks (ESLint
and tsc --noEmit) and commit the formatted changes so `prettier --check` passes.

In `@src/core/cli.rs`:
- Around line 36-38: The new top-level command "screen-intelligence" was added
via run_screen_intelligence_command but isn’t listed in the global help output;
update the top-level help/usage text (the function or constant that prints
command listings for openhuman) to include an entry for "screen-intelligence"
with a short one-line description so it appears when users run `openhuman
--help`; locate the same help-generation place where other commands are listed
and add the "screen-intelligence" entry next to those commands so help stays
consistent with the match arm that calls
crate::core::screen_intelligence_cli::run_screen_intelligence_command.

In `@src/core/screen_intelligence_cli.rs`:
- Around line 307-415: The stop subcommand (run_stop_session) spins up a fresh
local engine via bootstrap_engine and thus only stops sessions in that process;
change it to call the global/persistent engine control API instead of creating a
new in-process engine. Replace the bootstrap_engine/engine.stop_session flow in
run_stop_session with the client/IPC path used to control the long-lived service
(e.g., use the EngineClient or connect_to_engine_daemon function used
elsewhere), so run_stop_session issues stop_session over the daemon IPC/RPC
channel (keep init_quiet_logging(opts.verbose) for logging and still call
engine_client.stop_session(Some("cli_stop".to_string())) and print
session.stop_reason as before).
- Around line 164-169: Run rustfmt (cargo fmt) to fix formatting mismatches in
this file: reformat the block that defines bind_addr, listener, and the
log::info! call (the variables bind_addr and listener and the log::info!
invocation around the JSON-RPC readiness message) so it matches project style;
after formatting, run cargo check to ensure no other style-related edits are
required and commit the formatted changes.

In `@src/openhuman/screen_intelligence/engine.rs`:
- Around line 863-870: The modified block around building app_slug and filename
is not rustfmt-compliant; run `cargo fmt` (and then `cargo check`) and commit
the reformatted changes so the `app_name` handling and the `filename` formatting
match project style, ensuring the closure using `.chars().map(...)` and the
`format!` call in the function that constructs `app_slug`/`filename` are
reformatted by rustfmt.
- Around line 859-874: The blocking std::fs calls in
run_vision_worker/save_screenshot_to_disk (notably std::fs::create_dir_all,
std::fs::write and later std::fs::remove_file) must be converted to non-blocking
async equivalents or run on a blocking thread; either replace them with
tokio::fs::create_dir_all / tokio::fs::write / tokio::fs::remove_file and await
them, or wrap the existing std::fs calls in tokio::task::spawn_blocking and
await the JoinHandle; update the code that constructs screenshots_dir,
file_path, and the write/remove operations in the functions handling screenshot
persistence (e.g., run_vision_worker and save_screenshot_to_disk) to use one of
these approaches so the Tokio runtime threads are not blocked.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bf5a01cf-9078-4caa-bb13-380e19e2c2ff

📥 Commits

Reviewing files that changed from the base of the PR and between 09b627c and 4472048.

📒 Files selected for processing (14)

app/src/components/intelligence/__tests__/ScreenIntelligenceDebugPanel.test.tsx
app/src/components/settings/panels/ScreenIntelligencePanel.tsx
app/src/components/settings/panels/__tests__/AccessibilityPanel.test.tsx
app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx
app/src/services/__tests__/coreRpcClient.test.ts
app/src/store/__tests__/accessibilitySlice.test.ts
app/src/utils/tauriCommands.ts
src/core/cli.rs
src/core/mod.rs
src/core/screen_intelligence_cli.rs
src/openhuman/config/ops.rs
src/openhuman/config/schema/accessibility.rs
src/openhuman/config/schemas.rs
src/openhuman/screen_intelligence/engine.rs

coderabbitai · 2026-04-06T05:15:54Z

+          <label className="flex items-center justify-between rounded-xl border border-stone-200 bg-stone-50 px-3 py-2">
+            <div>
+              <span className="text-sm text-stone-700">Keep Screenshots</span>
+              <p className="text-xs text-stone-400">Save captured screenshots to the workspace instead of deleting after processing</p>
+            </div>


⚠️ Potential issue | 🟡 Minor

Run Prettier on the new JSX block before merge.

The pipeline reports prettier --check failure, and this newly added block is in the failing file.

As per coding guidelines: app/src/**/*.{ts,tsx}: Run Prettier, ESLint, and tsc --noEmit before merging changes to app/ code.

coderabbitai · 2026-04-06T05:15:54Z

+        "screen-intelligence" => {
+            crate::core::screen_intelligence_cli::run_screen_intelligence_command(&args[1..])
+        }


⚠️ Potential issue | 🟡 Minor

Expose the new command in general help output.

Line 36 adds a new top-level command, but openhuman --help doesn’t list it yet, which hurts discoverability.

Suggested help text update

 fn print_general_help(grouped: &BTreeMap<String, Vec<ControllerSchema>>) {
     println!("OpenHuman core CLI\n");
     println!("Usage:");
     println!(" openhuman run [--host <addr>] [--port <u16>] [--jsonrpc-only] [--verbose]");
     println!(" openhuman repl [--verbose] [--eval '<cmd>'] [--batch]");
     println!(" openhuman call --method <name> [--params '<json>']");
     println!(" openhuman skills <subcommand> [options] (skill development runtime)");
+    println!(" openhuman screen-intelligence <subcommand> [options] (capture + vision debug)");
     println!(" openhuman <namespace> <function> [--param value ...]\n");

coderabbitai · 2026-04-06T05:15:54Z

+        let bind_addr = format!("127.0.0.1:{}", opts.port);
+        let listener = tokio::net::TcpListener::bind(&bind_addr).await?;
+
+        log::info!(
+            "[screen-intelligence-cli] ready — http://{bind_addr}/rpc (JSON-RPC 2.0)"
+        );


⚠️ Potential issue | 🟡 Minor

Apply cargo fmt for this file before merge.

CI is reporting formatting mismatches in these changed blocks.

As per coding guidelines: src/**/*.rs: Run cargo fmt and cargo check before merging changes to Rust code.

Also applies to: 355-358

🧰 Tools

🪛 GitHub Actions: Type Check

[error] 164-164: cargo fmt --check failed due to formatting mismatch. Expected log::info! macro to be on a single line: log::info!("[screen-intelligence-cli] ready — http://{bind_addr}/rpc (JSON-RPC 2.0)");

coderabbitai · 2026-04-06T05:15:54Z

+/// `openhuman screen-intelligence start` — start a capture + vision session.
+fn run_start_session(args: &[String]) -> Result<()> {
+    if args.iter().any(|a| is_help(a)) {
+        println!("Usage: openhuman screen-intelligence start [--ttl <secs>] [-v]");
+        println!();
+        println!("Start a screen intelligence capture session with vision analysis.");
+        println!("The session runs until TTL expires or Ctrl+C is pressed.");
+        println!();
+        println!("  --ttl <secs>     Session duration (default: 300, max: 3600)");
+        println!("  -v, --verbose    Enable debug logging");
+        return Ok(());
+    }
+
+    let (opts, _) = parse_opts(args)?;
+    crate::core::logging::init_for_cli_run(opts.verbose);
+
+    let rt = tokio::runtime::Builder::new_multi_thread()
+        .enable_all()
+        .build()?;
+
+    rt.block_on(async {
+        let engine = bootstrap_engine(opts.verbose).await?;
+
+        let params = crate::openhuman::screen_intelligence::StartSessionParams {
+            consent: true,
+            ttl_secs: Some(opts.ttl_secs),
+            screen_monitoring: Some(true),
+            device_control: Some(false),
+            predictive_input: Some(false),
+        };
+
+        match engine.start_session(params).await {
+            Ok(session) => {
+                eprintln!("  Session started!");
+                eprintln!("  TTL:           {}s", session.ttl_secs);
+                eprintln!("  Vision:        {}", session.vision_enabled);
+                eprintln!("  Panic hotkey:  {}", session.panic_hotkey);
+                eprintln!();
+                eprintln!("  Capturing screenshots and running vision analysis...");
+                eprintln!("  Press Ctrl+C to stop.");
+                eprintln!();
+
+                // Print periodic status updates until the session ends.
+                let mut tick = tokio::time::interval(std::time::Duration::from_secs(5));
+                loop {
+                    tick.tick().await;
+                    let status = engine.status().await;
+                    if !status.session.active {
+                        eprintln!(
+                            "\n  Session ended: {}",
+                            status.session.stop_reason.unwrap_or_else(|| "unknown".into())
+                        );
+                        break;
+                    }
+                    eprintln!(
+                        "  [{}] captures={} vision={} queue={} last_app={:?}",
+                        chrono::Utc::now().format("%H:%M:%S"),
+                        status.session.capture_count,
+                        status.session.vision_state,
+                        status.session.vision_queue_depth,
+                        status.session.last_context.as_deref().unwrap_or("-"),
+                    );
+                    if let Some(summary) = &status.session.last_vision_summary {
+                        let truncated = if summary.len() > 100 {
+                            format!("{}…", &summary[..100])
+                        } else {
+                            summary.clone()
+                        };
+                        eprintln!("           notes: {truncated}");
+                    }
+                }
+            }
+            Err(e) => {
+                eprintln!("  Failed to start session: {e}");
+                std::process::exit(1);
+            }
+        }
+        Ok(())
+    })
+}
+
+/// `openhuman screen-intelligence stop` — stop an active session.
+fn run_stop_session(args: &[String]) -> Result<()> {
+    if args.iter().any(|a| is_help(a)) {
+        println!("Usage: openhuman screen-intelligence stop [-v]");
+        println!();
+        println!("Stop the active screen intelligence session.");
+        return Ok(());
+    }
+
+    let (opts, _) = parse_opts(args)?;
+    init_quiet_logging(opts.verbose);
+
+    let rt = tokio::runtime::Builder::new_multi_thread()
+        .enable_all()
+        .build()?;
+
+    rt.block_on(async {
+        let engine = bootstrap_engine(opts.verbose).await?;
+        let session = engine.stop_session(Some("cli_stop".to_string())).await;
+        eprintln!(
+            "  Session stopped: {}",
+            session
+                .stop_reason
+                .unwrap_or_else(|| "no active session".into())
+        );
+        Ok(())
+    })
+}


⚠️ Potential issue | 🟠 Major

stop subcommand is process-local and won’t stop sessions started elsewhere.

start keeps the session in the current process, while stop (Line 405+) boots a fresh engine in a new process and stops that local state only. In practice, it won’t stop a session started by a different CLI process.
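
One way to make `stop` reach a session owned by another process is to send a request to that process's JSON-RPC endpoint instead of bootstrapping a new engine. The sketch below assumes the `run` server is listening on `127.0.0.1` at a known port and serves `/rpc`, as in the readiness log line quoted earlier. The method name `screen_intelligence.stop_session`, the empty `params` object, and both helper functions are hypothetical; the PR does not list the server's RPC methods.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Build a raw HTTP/1.1 POST carrying a JSON-RPC 2.0 call with empty params.
/// `method` is inserted verbatim, so it must not contain characters that need JSON escaping.
fn build_rpc_request(host: &str, port: u16, method: &str, id: u64) -> String {
    let body = format!(r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}","params":{{}}}}"#);
    format!(
        "POST /rpc HTTP/1.1\r\nHost: {host}:{port}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

/// Send the stop request to an already-running server and return the raw HTTP response.
/// The method name below is an assumption, not taken from the PR.
fn send_stop(port: u16) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(("127.0.0.1", port))?;
    let request = build_rpc_request("127.0.0.1", port, "screen_intelligence.stop_session", 1);
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    // `Connection: close` asks the server to close the socket, which ends the read.
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

This only helps when the session was started inside a process that exposes the RPC server; a session started by `screen-intelligence start` in its own process would need some other shared control channel.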

🧰 Tools

🪛 GitHub Actions: Type Check

[error] 354-354: cargo fmt --check failed due to formatting mismatch. stop_reason unwrap_or_else expression formatting differs (line wrapping/indentation).

coderabbitai · 2026-04-06T05:15:54Z

+        let screenshots_dir = workspace_dir.join("screenshots");
+        std::fs::create_dir_all(&screenshots_dir)
+            .map_err(|e| format!("failed to create screenshots dir: {e}"))?;
+
+        let app_slug = frame
+            .app_name
+            .as_deref()
+            .unwrap_or("unknown")
+            .chars()
+            .map(|c| if c.is_alphanumeric() || c == '-' { c } else { '_' })
+            .collect::<String>();
+        let filename = format!("{}_{}.png", frame.captured_at_ms, app_slug);
+        let file_path = screenshots_dir.join(&filename);
+
+        std::fs::write(&file_path, &raw_bytes)
+            .map_err(|e| format!("failed to write screenshot {filename}: {e}"))?;


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Locate blocking filesystem calls in engine:"
rg -n 'std::fs::(create_dir_all|write|remove_file)' src/openhuman/screen_intelligence/engine.rs
echo
echo "Show call context around vision worker and screenshot save:"
rg -n -A8 -B8 'save_screenshot_to_disk|run_vision_worker' src/openhuman/screen_intelligence/engine.rs

Repository: tinyhumansai/openhuman

Length of output: 3926

Use tokio::fs or spawn_blocking for blocking I/O in the vision worker.

Blocking std::fs calls (create_dir_all, write, remove_file) in run_vision_worker and save_screenshot_to_disk will block Tokio worker threads under load. Convert these to use tokio::fs or wrap in tokio::task::spawn_blocking to prevent stalling other async tasks.

Also applies to: lines 934-943

🧰 Tools

🪛 GitHub Actions: Type Check

[error] 865-865: cargo fmt --check failed due to formatting mismatch. Adjusted map closure formatting to multi-line if/else.

coderabbitai · 2026-04-06T05:15:55Z

+        let app_slug = frame
+            .app_name
+            .as_deref()
+            .unwrap_or("unknown")
+            .chars()
+            .map(|c| if c.is_alphanumeric() || c == '-' { c } else { '_' })
+            .collect::<String>();
+        let filename = format!("{}_{}.png", frame.captured_at_ms, app_slug);


⚠️ Potential issue | 🟡 Minor

cargo fmt needs to be applied to this changed block.

CI reports a formatting mismatch in this region.

As per coding guidelines: src/**/*.rs: Run cargo fmt and cargo check before merging changes to Rust code.
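
For illustration, this is roughly the shape rustfmt is expected to produce for the closure, matching the CI note about a multi-line if/else. The standalone helper functions and the `u64` timestamp type are assumptions made so the sketch compiles on its own; in the PR the logic is inline and reads from `frame`.

```rust
/// Hypothetical helper: sanitize an application name for use in a filename.
/// Alphanumerics and '-' are kept; every other character becomes '_'.
fn app_slug(app_name: Option<&str>) -> String {
    app_name
        .unwrap_or("unknown")
        .chars()
        .map(|c| {
            if c.is_alphanumeric() || c == '-' {
                c
            } else {
                '_'
            }
        })
        .collect::<String>()
}

/// Hypothetical helper: build the screenshot filename from a capture
/// timestamp (assumed to be u64 milliseconds) and the sanitized app name.
fn screenshot_filename(captured_at_ms: u64, app_name: Option<&str>) -> String {
    format!("{}_{}.png", captured_at_ms, app_slug(app_name))
}
```

With these definitions, `app_slug(Some("Google Chrome"))` yields `Google_Chrome`, and a missing app name falls back to `unknown`.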

🧰 Tools

🪛 GitHub Actions: Type Check

[error] 865-865: cargo fmt --check failed due to formatting mismatch. Adjusted map closure formatting to multi-line if/else.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/openhuman/screen_intelligence/engine.rs` around lines 863 - 870, The modified block around building app_slug and filename is not rustfmt-compliant; run `cargo fmt` (and then `cargo check`) and commit the reformatted changes so the `app_name` handling and the `filename` formatting match project style, ensuring the closure using `.chars().map(...)` and the `format!` call in the function that constructs `app_slug`/`filename` are reformatted by rustfmt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

senamakel and others added 3 commits April 5, 2026 17:24

Merge remote-tracking branch 'upstream/main' into fixes/screen-intelligence

c71533b

coderabbitai Bot reviewed Apr 6, 2026


fix: add keep_screenshots to ScreenIntelligencePanel test assertion

2dada67

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

senamakel merged commit bb1f2c3 into tinyhumansai:main Apr 6, 2026
5 of 9 checks passed

This was referenced Apr 6, 2026

Feat/343 screen intelligence e2e tests #359

Merged

Fix: decouple chat from local Ollama, route inference through Rust #369

Merged

feat(screen-intelligence): standalone server with OCR + vision pipeline #382

Merged


feat: screen intelligence pipeline + CLI + keep_screenshots config #339

senamakel merged 4 commits into tinyhumansai:main from senamakel:fixes/screen-intelligence
