
feat: screen intelligence pipeline + CLI + keep_screenshots config #339

Merged
senamakel merged 4 commits into tinyhumansai:main from senamakel:fixes/screen-intelligence
Apr 6, 2026

Conversation

@senamakel
Member

@senamakel senamakel commented Apr 6, 2026

Summary

  • Screen intelligence capture → vision pipeline: Screenshots are saved to {workspace_dir}/screenshots/, processed by a local Ollama vision model, then cleaned up (or kept, depending on config)
  • keep_screenshots config flag (default: false): When enabled, screenshots persist on disk after vision processing instead of being deleted
  • openhuman screen-intelligence CLI: Standalone CLI for testing the capture + vision pipeline without the full desktop app (mirrors openhuman skills pattern)

CLI subcommands

| Command | Description |
|---|---|
| screen-intelligence run | Start a lightweight JSON-RPC server |
| screen-intelligence status | Print engine status as JSON |
| screen-intelligence capture [--keep] | Take a single screenshot with diagnostics |
| screen-intelligence start [--ttl <secs>] | Start a capture + vision session |
| screen-intelligence stop | Stop the active session |
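
The subcommand table implies a dispatcher that maps the first argument to a handler and picks up the `--keep` and `--ttl` options. A minimal std-only sketch of that parsing step follows. `Subcommand` and `parse_subcommand` are hypothetical names introduced here for illustration; the PR's real dispatcher lives in screen_intelligence_cli.rs and may be structured differently.

```rust
/// Hypothetical enum mirroring the five subcommands listed in the PR description.
#[derive(Debug, PartialEq)]
enum Subcommand {
    Run,
    Status,
    Capture { keep: bool },
    Start { ttl_secs: Option<u64> },
    Stop,
}

/// Parses `args` (everything after `screen-intelligence`) into a `Subcommand`.
/// Illustrative only: this is not the PR's `parse_opts`.
fn parse_subcommand(args: &[&str]) -> Result<Subcommand, String> {
    let (name, rest) = args.split_first().ok_or("missing subcommand")?;
    match *name {
        "run" => Ok(Subcommand::Run),
        "status" => Ok(Subcommand::Status),
        "capture" => Ok(Subcommand::Capture {
            keep: rest.contains(&"--keep"),
        }),
        "start" => {
            // `--ttl` is optional; when present it must be followed by an integer.
            let ttl_secs = match rest.iter().position(|a| *a == "--ttl") {
                Some(i) => {
                    let raw = rest.get(i + 1).ok_or("--ttl requires a value")?;
                    Some(
                        raw.parse::<u64>()
                            .map_err(|e| format!("invalid --ttl: {e}"))?,
                    )
                }
                None => None,
            };
            Ok(Subcommand::Start { ttl_secs })
        }
        "stop" => Ok(Subcommand::Stop),
        other => Err(format!("unknown subcommand: {other}")),
    }
}
```

Calling `parse_subcommand(&["start", "--ttl", "60"])` yields `Subcommand::Start { ttl_secs: Some(60) }`; an unknown name returns an error instead of panicking, which keeps `--help` style fallbacks easy to add.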

Config flag

[screen_intelligence]
keep_screenshots = false  # default: delete after vision processing
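
The effect of the flag on a single frame can be sketched as follows. This is a minimal std-only illustration, not the PR's engine code: `process_frame` and `analyze` are hypothetical names, and the real vision step calls a local Ollama model from an async worker.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the vision step (the PR uses a local Ollama model).
fn analyze(bytes: &[u8]) -> String {
    format!("{} bytes analyzed", bytes.len())
}

/// Saves a frame under `{workspace_dir}/screenshots/`, analyzes it, and then
/// deletes the file unless `keep_screenshots` is true.
fn process_frame(
    workspace_dir: &Path,
    file_name: &str,
    png: &[u8],
    keep_screenshots: bool,
) -> io::Result<(PathBuf, String)> {
    let dir = workspace_dir.join("screenshots");
    fs::create_dir_all(&dir)?;

    let path = dir.join(file_name);
    fs::write(&path, png)?;

    let summary = analyze(png);

    // Default behaviour (keep_screenshots = false): clean up after processing.
    if !keep_screenshots {
        fs::remove_file(&path)?;
    }
    Ok((path, summary))
}
```

With `keep_screenshots = false` the returned path no longer exists on disk once the call returns; with `true` it stays in the screenshots directory.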

Files changed

  • Rust config: accessibility.rs, ops.rs, schemas.rs — added keep_screenshots field
  • Rust engine: engine.rs (save_screenshot_to_disk() + vision worker integration)
  • Rust CLI: new screen_intelligence_cli.rs, wired into cli.rs + mod.rs
  • Frontend: tauriCommands.ts, ScreenIntelligencePanel.tsx — UI toggle + type updates
  • Tests: all 5 test fixtures updated with keep_screenshots: false

Test plan

  • cargo check passes
  • tsc --noEmit passes
  • openhuman screen-intelligence --help prints usage
  • openhuman screen-intelligence status prints JSON with permissions, session, config
  • openhuman screen-intelligence capture --keep saves screenshot to ~/.openhuman/workspace/screenshots/
  • openhuman screen-intelligence start --ttl 60 -v runs capture + vision loop
  • Settings panel toggle for "Keep Screenshots" persists to config
  • Screenshots are deleted after processing when keep_screenshots = false

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added "Keep Screenshots" toggle in Screen Intelligence settings to control whether screenshots are retained after processing
    • Added screen intelligence CLI commands for engine management (run, status, capture, start, stop)
  • Tests

    • Updated test fixtures to reflect new keep_screenshots configuration field

senamakel and others added 3 commits April 5, 2026 17:24
- Introduced a new `keep_screenshots` option in the Screen Intelligence settings, allowing users to save captured screenshots to the workspace instead of deleting them after processing.
- Updated relevant components and tests to reflect this new feature, ensuring consistent behavior across the application.
- Enhanced the user interface to include a checkbox for the `keep_screenshots` setting, improving user experience and configurability.
Adds a dedicated CLI (mirroring `openhuman skills`) to run the screen
intelligence capture + vision pipeline without the full desktop app.

Subcommands: run, status, capture, start, stop.
Makes save_screenshot_to_disk public so the CLI capture --keep flag works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 6, 2026

Warning

Rate limit exceeded

@senamakel has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 12 minutes and 27 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 12 minutes and 27 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f03a8025-e22b-49af-8b03-cc650f2e234d

📥 Commits

Reviewing files that changed from the base of the PR and between 4472048 and 2dada67.

📒 Files selected for processing (1)
  • app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx
📝 Walkthrough

Walkthrough

This pull request introduces a new keep_screenshots boolean configuration option throughout the screen intelligence system. Changes include frontend UI controls, TypeScript type definitions, Rust configuration schema and settings operations, a new screen-intelligence CLI subcommand, and engine logic to conditionally save and delete screenshots based on the configuration setting.

Changes

Frontend Component & Test Fixtures
  Files: app/src/components/settings/panels/ScreenIntelligencePanel.tsx, app/src/components/intelligence/__tests__/ScreenIntelligenceDebugPanel.test.tsx, app/src/components/settings/panels/__tests__/AccessibilityPanel.test.tsx, app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx
  Summary: Added new keepScreenshots UI state to the Screen Intelligence Policy panel with a checkbox control, initialization from config, and payload inclusion on save. Updated test fixtures with keep_screenshots: false field.

Type & Interface Definitions
  Files: app/src/utils/tauriCommands.ts
  Summary: Extended AccessibilityConfig with required keep_screenshots: boolean field and ScreenIntelligenceSettingsUpdate with optional keep_screenshots?: boolean | null field for partial updates.

Configuration Schema
  Files: src/openhuman/config/schema/accessibility.rs
  Summary: Added new boolean field keep_screenshots to ScreenIntelligenceConfig with #[serde(default)] and explicit false default, controlling whether screenshots are retained post-vision processing.

Configuration Operations
  Files: src/openhuman/config/ops.rs, src/openhuman/config/schemas.rs
  Summary: Extended ScreenIntelligenceSettingsPatch with optional keep_screenshots: Option<bool> field and updated apply_screen_intelligence_settings to apply the patch conditionally. Added RPC schema integration for settings updates.

Service Tests
  Files: app/src/services/__tests__/coreRpcClient.test.ts, app/src/store/__tests__/accessibilitySlice.test.ts
  Summary: Updated sampleAccessibilityStatus() test fixtures to include keep_screenshots: false field.

Screen Intelligence CLI
  Files: src/core/cli.rs, src/core/mod.rs, src/core/screen_intelligence_cli.rs
  Summary: Added new screen-intelligence CLI subcommand dispatcher and standalone module implementing run, status, capture, start, stop subcommands with local HTTP server (Axum/Tokio), JSON-RPC endpoint, and session/capture management.

Engine Screenshot Management
  Files: src/openhuman/screen_intelligence/engine.rs
  Summary: Added save_screenshot_to_disk() function to persist base64-encoded captures, updated run_vision_worker to save frames before vision analysis and conditionally delete saved screenshots when keep_screenshots is false.
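
The "apply the patch conditionally" behaviour described for the configuration operations can be sketched as below. The struct names come from the walkthrough, but the field sets are trimmed to two fields for illustration, and `apply_patch` is a hypothetical stand-in for `apply_screen_intelligence_settings`, whose real signature is not shown in this PR page.

```rust
/// Trimmed-down stand-in for the real config struct (only two fields shown).
#[derive(Debug, Default, PartialEq)]
struct ScreenIntelligenceConfig {
    enabled: bool,
    keep_screenshots: bool, // Default::default() gives false, matching the PR's default
}

/// Partial update: `None` means "leave the current value untouched".
#[derive(Debug, Default)]
struct ScreenIntelligenceSettingsPatch {
    enabled: Option<bool>,
    keep_screenshots: Option<bool>,
}

/// Hypothetical helper: copies over only the fields present in the patch.
fn apply_patch(config: &mut ScreenIntelligenceConfig, patch: &ScreenIntelligenceSettingsPatch) {
    if let Some(v) = patch.enabled {
        config.enabled = v;
    }
    if let Some(v) = patch.keep_screenshots {
        config.keep_screenshots = v;
    }
}
```

Using `Option<bool>` rather than a plain `bool` is what lets older clients omit the new field without silently resetting it to `false`.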

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant UI as ScreenIntelligencePanel
    participant Config as Config System
    participant Engine as Screen Intelligence Engine
    participant Disk as Disk Storage

    User->>UI: Toggle keep_screenshots checkbox
    UI->>Config: Save setting via openhumanUpdateScreenIntelligenceSettings
    Config->>Config: Apply patch to config.screen_intelligence.keep_screenshots
    
    rect rgba(200, 150, 255, 0.5)
    Note over Engine,Disk: On each frame capture
    Engine->>Disk: save_screenshot_to_disk(frame)
    Disk-->>Engine: Return file path
    Engine->>Engine: analyze_frame_with_vision(frame)
    
    alt keep_screenshots == true
        Note over Engine,Disk: Retain screenshot
        Engine->>Disk: Screenshot persisted
    else keep_screenshots == false
        Engine->>Disk: Delete saved screenshot file
        Disk-->>Engine: Deletion complete
    end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Poem

🐰 A screenshot saved, a choice so clear,
Keep or discard, the control is near!
Through config and CLI, the setting flows,
From UI to engine, the preference goes.
Now screenshots dance where they're meant to stay! 📸

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 41.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the three main features added: screen intelligence pipeline, CLI subcommands, and keep_screenshots config. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/screen_intelligence/engine.rs (1)

906-929: ⚠️ Potential issue | 🟠 Major

Potential screenshot leak when session disappears mid-loop.

Line 909 can write a screenshot before Line 924 checks session existence. If session is already gone, the break at Line 928 skips the deletion branch (Line 935+), so files may remain even when keep_screenshots=false.

Suggested fix (cleanup before early break)
             {
                 let mut state = self.inner.lock().await;
                 if let Some(session) = state.session.as_mut() {
                     session.vision_state = "processing".to_string();
                 } else {
+                    if !keep_screenshots {
+                        if let Some(path) = saved_path.as_ref() {
+                            let _ = std::fs::remove_file(path);
+                        }
+                    }
                     tracing::debug!("[screen_intelligence] vision worker: no session, exiting");
                     break;
                 }
             }

Also applies to: 934-944
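
An alternative to adding explicit cleanup before every early `break` is to tie the file's lifetime to a guard value, so that any exit path removes it. The sketch below illustrates that RAII pattern with std only; it is not the suggested fix above and not code from the PR. `ScreenshotGuard` and `iteration` are hypothetical names, and the `Drop` impl uses blocking `std::fs`, so in an async worker it would need the same care as any other blocking call.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical guard: removes the file on drop unless `keep` is true.
struct ScreenshotGuard {
    path: PathBuf,
    keep: bool,
}

impl ScreenshotGuard {
    fn new(path: PathBuf, keep: bool) -> Self {
        Self { path, keep }
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScreenshotGuard {
    fn drop(&mut self) {
        if !self.keep {
            // Best-effort cleanup; a missing file is not treated as an error.
            let _ = fs::remove_file(&self.path);
        }
    }
}

/// Simulates one worker iteration that exits early when the session is gone.
/// Returns true if the iteration ran to completion.
fn iteration(saved_path: PathBuf, keep_screenshots: bool, session_alive: bool) -> bool {
    let guard = ScreenshotGuard::new(saved_path, keep_screenshots);
    if !session_alive {
        // Early exit: the guard is dropped here, so the file is still cleaned up.
        return false;
    }
    // Vision analysis would read from `guard.path()` at this point.
    let _ = guard.path();
    true
}
```

The trade-off is that deletion errors are swallowed inside `drop`, whereas the explicit cleanup in the suggested fix keeps the control flow visible at the call site.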

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/screen_intelligence/engine.rs` around lines 906 - 929, A
screenshot can be saved via save_screenshot_to_disk before checking whether the
session exists (self.inner.lock().await -> session), so if session is gone we
break out and skip the later deletion branch causing a leak; modify the loop to,
immediately after saving (i.e. where saved_path is set), check whether
state.session is Some and if not then: if saved_path.is_some() and
keep_screenshots is false delete the saved file (using workspace_dir and the
saved_path) before breaking, and make the same change for the later mirror block
(the block around session.vision_state update and subsequent deletion logic) so
any saved screenshot is cleaned up when the session disappears.
🧹 Nitpick comments (1)
app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx (1)

188-194: Assert keep_screenshots in the save payload.

Line 188 currently doesn’t verify the new flag, so this test can still pass if keep_screenshots is accidentally dropped during save.

Suggested test assertion update
     expect(openhumanUpdateScreenIntelligenceSettings).toHaveBeenCalledWith({
       enabled: true,
       policy_mode: 'all_except_blacklist',
       baseline_fps: 1,
+      keep_screenshots: false,
       allowlist: ['Code'],
       denylist: ['1Password'],
     });

Based on learnings: "Applies to app/src/**/*.test.{ts,tsx} : Prefer testing behavior over implementation details; use existing helpers from app/src/test/ (test-utils.tsx, shared mock backend) before adding new harness code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx`
around lines 188 - 194, Update the test assertion to include the new
keep_screenshots flag in the payload so the save action is validated for that
field; specifically modify the expect for
openhumanUpdateScreenIntelligenceSettings to include keep_screenshots: true (or
the appropriate boolean) alongside enabled, policy_mode, baseline_fps,
allowlist, and denylist in the object so the test fails if keep_screenshots is
omitted or changed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/settings/panels/ScreenIntelligencePanel.tsx`:
- Around line 285-289: The new JSX block in the ScreenIntelligencePanel
component (the <label> block inside ScreenIntelligencePanel.tsx) is not
formatted to project Prettier rules; run Prettier over
app/src/components/settings/panels/ScreenIntelligencePanel.tsx (or the whole
app/src/**/*.{ts,tsx}) to fix formatting, then re-run lint/type checks (ESLint
and tsc --noEmit) and commit the formatted changes so `prettier --check` passes.

In `@src/core/cli.rs`:
- Around line 36-38: The new top-level command "screen-intelligence" was added
via run_screen_intelligence_command but isn’t listed in the global help output;
update the top-level help/usage text (the function or constant that prints
command listings for openhuman) to include an entry for "screen-intelligence"
with a short one-line description so it appears when users run `openhuman
--help`; locate the same help-generation place where other commands are listed
and add the "screen-intelligence" entry next to those commands so help stays
consistent with the match arm that calls
crate::core::screen_intelligence_cli::run_screen_intelligence_command.

In `@src/core/screen_intelligence_cli.rs`:
- Around line 307-415: The stop subcommand (run_stop_session) spins up a fresh
local engine via bootstrap_engine and thus only stops sessions in that process;
change it to call the global/persistent engine control API instead of creating a
new in-process engine. Replace the bootstrap_engine/engine.stop_session flow in
run_stop_session with the client/IPC path used to control the long-lived service
(e.g., use the EngineClient or connect_to_engine_daemon function used
elsewhere), so run_stop_session issues stop_session over the daemon IPC/RPC
channel (keep init_quiet_logging(opts.verbose) for logging and still call
engine_client.stop_session(Some("cli_stop".to_string())) and print
session.stop_reason as before).
- Around line 164-169: Run rustfmt (cargo fmt) to fix formatting mismatches in
this file: reformat the block that defines bind_addr, listener, and the
log::info! call (the variables bind_addr and listener and the log::info!
invocation around the JSON-RPC readiness message) so it matches project style;
after formatting, run cargo check to ensure no other style-related edits are
required and commit the formatted changes.

In `@src/openhuman/screen_intelligence/engine.rs`:
- Around line 863-870: The modified block around building app_slug and filename
is not rustfmt-compliant; run `cargo fmt` (and then `cargo check`) and commit
the reformatted changes so the `app_name` handling and the `filename` formatting
match project style, ensuring the closure using `.chars().map(...)` and the
`format!` call in the function that constructs `app_slug`/`filename` are
reformatted by rustfmt.
- Around line 859-874: The blocking std::fs calls in
run_vision_worker/save_screenshot_to_disk (notably std::fs::create_dir_all,
std::fs::write and later std::fs::remove_file) must be converted to non-blocking
async equivalents or run on a blocking thread; either replace them with
tokio::fs::create_dir_all / tokio::fs::write / tokio::fs::remove_file and await
them, or wrap the existing std::fs calls in tokio::task::spawn_blocking and
await the JoinHandle; update the code that constructs screenshots_dir,
file_path, and the write/remove operations in the functions handling screenshot
persistence (e.g., run_vision_worker and save_screenshot_to_disk) to use one of
these approaches so the Tokio runtime threads are not blocked.

---

Outside diff comments:
In `@src/openhuman/screen_intelligence/engine.rs`:
- Around line 906-929: A screenshot can be saved via save_screenshot_to_disk
before checking whether the session exists (self.inner.lock().await -> session),
so if session is gone we break out and skip the later deletion branch causing a
leak; modify the loop to, immediately after saving (i.e. where saved_path is
set), check whether state.session is Some and if not then: if
saved_path.is_some() and keep_screenshots is false delete the saved file (using
workspace_dir and the saved_path) before breaking, and make the same change for
the later mirror block (the block around session.vision_state update and
subsequent deletion logic) so any saved screenshot is cleaned up when the
session disappears.

---

Nitpick comments:
In
`@app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx`:
- Around line 188-194: Update the test assertion to include the new
keep_screenshots flag in the payload so the save action is validated for that
field; specifically modify the expect for
openhumanUpdateScreenIntelligenceSettings to include keep_screenshots: true (or
the appropriate boolean) alongside enabled, policy_mode, baseline_fps,
allowlist, and denylist in the object so the test fails if keep_screenshots is
omitted or changed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bf5a01cf-9078-4caa-bb13-380e19e2c2ff

📥 Commits

Reviewing files that changed from the base of the PR and between 09b627c and 4472048.

📒 Files selected for processing (14)
  • app/src/components/intelligence/__tests__/ScreenIntelligenceDebugPanel.test.tsx
  • app/src/components/settings/panels/ScreenIntelligencePanel.tsx
  • app/src/components/settings/panels/__tests__/AccessibilityPanel.test.tsx
  • app/src/components/settings/panels/__tests__/ScreenIntelligencePanel.test.tsx
  • app/src/services/__tests__/coreRpcClient.test.ts
  • app/src/store/__tests__/accessibilitySlice.test.ts
  • app/src/utils/tauriCommands.ts
  • src/core/cli.rs
  • src/core/mod.rs
  • src/core/screen_intelligence_cli.rs
  • src/openhuman/config/ops.rs
  • src/openhuman/config/schema/accessibility.rs
  • src/openhuman/config/schemas.rs
  • src/openhuman/screen_intelligence/engine.rs

Comment on lines +285 to +289
<label className="flex items-center justify-between rounded-xl border border-stone-200 bg-stone-50 px-3 py-2">
<div>
<span className="text-sm text-stone-700">Keep Screenshots</span>
<p className="text-xs text-stone-400">Save captured screenshots to the workspace instead of deleting after processing</p>
</div>

⚠️ Potential issue | 🟡 Minor

Run Prettier on the new JSX block before merge.

The pipeline reports prettier --check failure, and this newly added block is in the failing file.

As per coding guidelines: app/src/**/*.{ts,tsx}: Run Prettier, ESLint, and tsc --noEmit before merging changes to app/ code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/settings/panels/ScreenIntelligencePanel.tsx` around lines
285 - 289, The new JSX block in the ScreenIntelligencePanel component (the
<label> block inside ScreenIntelligencePanel.tsx) is not formatted to project
Prettier rules; run Prettier over
app/src/components/settings/panels/ScreenIntelligencePanel.tsx (or the whole
app/src/**/*.{ts,tsx}) to fix formatting, then re-run lint/type checks (ESLint
and tsc --noEmit) and commit the formatted changes so `prettier --check` passes.

Comment thread src/core/cli.rs
Comment on lines +36 to +38
"screen-intelligence" => {
    crate::core::screen_intelligence_cli::run_screen_intelligence_command(&args[1..])
}

⚠️ Potential issue | 🟡 Minor

Expose the new command in general help output.

Line 36 adds a new top-level command, but openhuman --help doesn’t list it yet, which hurts discoverability.

Suggested help text update
 fn print_general_help(grouped: &BTreeMap<String, Vec<ControllerSchema>>) {
     println!("OpenHuman core CLI\n");
     println!("Usage:");
     println!("  openhuman run [--host <addr>] [--port <u16>] [--jsonrpc-only] [--verbose]");
     println!("  openhuman repl [--verbose] [--eval '<cmd>'] [--batch]");
     println!("  openhuman call --method <name> [--params '<json>']");
     println!("  openhuman skills <subcommand> [options]   (skill development runtime)");
+    println!("  openhuman screen-intelligence <subcommand> [options]   (capture + vision debug)");
     println!("  openhuman <namespace> <function> [--param value ...]\n");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/cli.rs` around lines 36 - 38, The new top-level command
"screen-intelligence" was added via run_screen_intelligence_command but isn’t
listed in the global help output; update the top-level help/usage text (the
function or constant that prints command listings for openhuman) to include an
entry for "screen-intelligence" with a short one-line description so it appears
when users run `openhuman --help`; locate the same help-generation place where
other commands are listed and add the "screen-intelligence" entry next to those
commands so help stays consistent with the match arm that calls
crate::core::screen_intelligence_cli::run_screen_intelligence_command.

Comment on lines +164 to +169
let bind_addr = format!("127.0.0.1:{}", opts.port);
let listener = tokio::net::TcpListener::bind(&bind_addr).await?;

log::info!(
    "[screen-intelligence-cli] ready — http://{bind_addr}/rpc (JSON-RPC 2.0)"
);

⚠️ Potential issue | 🟡 Minor

Apply cargo fmt for this file before merge.

CI is reporting formatting mismatches in these changed blocks.

As per coding guidelines: src/**/*.rs: Run cargo fmt and cargo check before merging changes to Rust code.

Also applies to: 355-358

🧰 Tools
🪛 GitHub Actions: Type Check

[error] 164-164: cargo fmt --check failed due to formatting mismatch. Expected log::info! macro to be on a single line: log::info!("[screen-intelligence-cli] ready — http://{bind_addr}/rpc (JSON-RPC 2.0)");

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/screen_intelligence_cli.rs` around lines 164 - 169, Run rustfmt
(cargo fmt) to fix formatting mismatches in this file: reformat the block that
defines bind_addr, listener, and the log::info! call (the variables bind_addr
and listener and the log::info! invocation around the JSON-RPC readiness
message) so it matches project style; after formatting, run cargo check to
ensure no other style-related edits are required and commit the formatted
changes.

Comment on lines +307 to +415
/// `openhuman screen-intelligence start` — start a capture + vision session.
fn run_start_session(args: &[String]) -> Result<()> {
    if args.iter().any(|a| is_help(a)) {
        println!("Usage: openhuman screen-intelligence start [--ttl <secs>] [-v]");
        println!();
        println!("Start a screen intelligence capture session with vision analysis.");
        println!("The session runs until TTL expires or Ctrl+C is pressed.");
        println!();
        println!(" --ttl <secs> Session duration (default: 300, max: 3600)");
        println!(" -v, --verbose Enable debug logging");
        return Ok(());
    }

    let (opts, _) = parse_opts(args)?;
    crate::core::logging::init_for_cli_run(opts.verbose);

    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()?;

    rt.block_on(async {
        let engine = bootstrap_engine(opts.verbose).await?;

        let params = crate::openhuman::screen_intelligence::StartSessionParams {
            consent: true,
            ttl_secs: Some(opts.ttl_secs),
            screen_monitoring: Some(true),
            device_control: Some(false),
            predictive_input: Some(false),
        };

        match engine.start_session(params).await {
            Ok(session) => {
                eprintln!(" Session started!");
                eprintln!(" TTL: {}s", session.ttl_secs);
                eprintln!(" Vision: {}", session.vision_enabled);
                eprintln!(" Panic hotkey: {}", session.panic_hotkey);
                eprintln!();
                eprintln!(" Capturing screenshots and running vision analysis...");
                eprintln!(" Press Ctrl+C to stop.");
                eprintln!();

                // Print periodic status updates until the session ends.
                let mut tick = tokio::time::interval(std::time::Duration::from_secs(5));
                loop {
                    tick.tick().await;
                    let status = engine.status().await;
                    if !status.session.active {
                        eprintln!(
                            "\n Session ended: {}",
                            status.session.stop_reason.unwrap_or_else(|| "unknown".into())
                        );
                        break;
                    }
                    eprintln!(
                        " [{}] captures={} vision={} queue={} last_app={:?}",
                        chrono::Utc::now().format("%H:%M:%S"),
                        status.session.capture_count,
                        status.session.vision_state,
                        status.session.vision_queue_depth,
                        status.session.last_context.as_deref().unwrap_or("-"),
                    );
                    if let Some(summary) = &status.session.last_vision_summary {
                        let truncated = if summary.len() > 100 {
                            format!("{}…", &summary[..100])
                        } else {
                            summary.clone()
                        };
                        eprintln!(" notes: {truncated}");
                    }
                }
            }
            Err(e) => {
                eprintln!(" Failed to start session: {e}");
                std::process::exit(1);
            }
        }
        Ok(())
    })
}

/// `openhuman screen-intelligence stop` — stop an active session.
fn run_stop_session(args: &[String]) -> Result<()> {
    if args.iter().any(|a| is_help(a)) {
        println!("Usage: openhuman screen-intelligence stop [-v]");
        println!();
        println!("Stop the active screen intelligence session.");
        return Ok(());
    }

    let (opts, _) = parse_opts(args)?;
    init_quiet_logging(opts.verbose);

    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()?;

    rt.block_on(async {
        let engine = bootstrap_engine(opts.verbose).await?;
        let session = engine.stop_session(Some("cli_stop".to_string())).await;
        eprintln!(
            " Session stopped: {}",
            session
                .stop_reason
                .unwrap_or_else(|| "no active session".into())
        );
        Ok(())
    })
}

⚠️ Potential issue | 🟠 Major

stop subcommand is process-local and won’t stop sessions started elsewhere.

start keeps the session in the current process, while stop (Line 405+) boots a fresh engine in a new process and stops that local state only. In practice, it won’t stop a session started by a different CLI process.

🧰 Tools
🪛 GitHub Actions: Type Check

[error] 354-354: cargo fmt --check failed due to formatting mismatch. stop_reason unwrap_or_else expression formatting differs (line wrapping/indentation).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/screen_intelligence_cli.rs` around lines 307 - 415, The stop
subcommand (run_stop_session) spins up a fresh local engine via bootstrap_engine
and thus only stops sessions in that process; change it to call the
global/persistent engine control API instead of creating a new in-process
engine. Replace the bootstrap_engine/engine.stop_session flow in
run_stop_session with the client/IPC path used to control the long-lived service
(e.g., use the EngineClient or connect_to_engine_daemon function used
elsewhere), so run_stop_session issues stop_session over the daemon IPC/RPC
channel (keep init_quiet_logging(opts.verbose) for logging and still call
engine_client.stop_session(Some("cli_stop".to_string())) and print
session.stop_reason as before).
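
One way a cross-process `stop` could work is to post a JSON-RPC 2.0 request to the `/rpc` endpoint that `screen-intelligence run` binds on 127.0.0.1. The std-only sketch below only illustrates that idea under loud assumptions: the method name `screen_intelligence.stop_session`, the `reason` parameter, and both function names are placeholders, since this PR page does not show the server's actual method names or any `EngineClient` API.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Placeholder method name; the real RPC method is not shown in the PR.
const STOP_METHOD: &str = "screen_intelligence.stop_session";

/// Builds a raw HTTP/1.1 POST carrying a JSON-RPC 2.0 stop request.
fn build_stop_request(port: u16, id: u64) -> String {
    // The params shape is also an assumption made for this sketch.
    let body = format!(
        r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}","params":{{"reason":"cli_stop"}}}}"#,
        id = id,
        method = STOP_METHOD
    );
    format!(
        "POST /rpc HTTP/1.1\r\nHost: 127.0.0.1:{port}\r\nContent-Type: application/json\r\nContent-Length: {len}\r\nConnection: close\r\n\r\n{body}",
        port = port,
        len = body.len(),
        body = body
    )
}

/// Sends the request to a server already listening on 127.0.0.1:`port` and
/// returns the raw HTTP response. Fails if no server is running there.
fn send_stop(port: u16) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(("127.0.0.1", port))?;
    stream.write_all(build_stop_request(port, 1).as_bytes())?;
    let mut response = String::new();
    // `Connection: close` lets us read until the server closes the socket.
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

This only helps when the session was started inside a long-lived `run` server; a session owned by a foreground `start` process would still need its own control channel or a signal.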

Comment on lines +859 to +874
let screenshots_dir = workspace_dir.join("screenshots");
std::fs::create_dir_all(&screenshots_dir)
    .map_err(|e| format!("failed to create screenshots dir: {e}"))?;

let app_slug = frame
    .app_name
    .as_deref()
    .unwrap_or("unknown")
    .chars()
    .map(|c| if c.is_alphanumeric() || c == '-' { c } else { '_' })
    .collect::<String>();
let filename = format!("{}_{}.png", frame.captured_at_ms, app_slug);
let file_path = screenshots_dir.join(&filename);

std::fs::write(&file_path, &raw_bytes)
    .map_err(|e| format!("failed to write screenshot {filename}: {e}"))?;

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate blocking filesystem calls in engine:"
rg -n 'std::fs::(create_dir_all|write|remove_file)' src/openhuman/screen_intelligence/engine.rs

echo
echo "Show call context around vision worker and screenshot save:"
rg -n -A8 -B8 'save_screenshot_to_disk|run_vision_worker' src/openhuman/screen_intelligence/engine.rs

Repository: tinyhumansai/openhuman

Length of output: 3926


Use tokio::fs or spawn_blocking for blocking I/O in the vision worker.

Blocking std::fs calls (create_dir_all, write, remove_file) in run_vision_worker and save_screenshot_to_disk will block Tokio worker threads under load. Convert these to use tokio::fs or wrap in tokio::task::spawn_blocking to prevent stalling other async tasks.

Also applies to: lines 934-943

🧰 Tools
🪛 GitHub Actions: Type Check

[error] 865-865: cargo fmt --check failed due to formatting mismatch. Adjusted map closure formatting to multi-line if/else.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/screen_intelligence/engine.rs` around lines 859 - 874, The
blocking std::fs calls in run_vision_worker/save_screenshot_to_disk (notably
std::fs::create_dir_all, std::fs::write and later std::fs::remove_file) must be
converted to non-blocking async equivalents or run on a blocking thread; either
replace them with tokio::fs::create_dir_all / tokio::fs::write /
tokio::fs::remove_file and await them, or wrap the existing std::fs calls in
tokio::task::spawn_blocking and await the JoinHandle; update the code that
constructs screenshots_dir, file_path, and the write/remove operations in the
functions handling screenshot persistence (e.g., run_vision_worker and
save_screenshot_to_disk) to use one of these approaches so the Tokio runtime
threads are not blocked.

Comment on lines +863 to +870
let app_slug = frame
    .app_name
    .as_deref()
    .unwrap_or("unknown")
    .chars()
    .map(|c| if c.is_alphanumeric() || c == '-' { c } else { '_' })
    .collect::<String>();
let filename = format!("{}_{}.png", frame.captured_at_ms, app_slug);

⚠️ Potential issue | 🟡 Minor

cargo fmt needs to be applied to this changed block.

CI reports a formatting mismatch in this region.

As per coding guidelines: src/**/*.rs: Run cargo fmt and cargo check before merging changes to Rust code.

🧰 Tools
🪛 GitHub Actions: Type Check

[error] 865-865: cargo fmt --check failed due to formatting mismatch. Adjusted map closure formatting to multi-line if/else.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/screen_intelligence/engine.rs` around lines 863 - 870, The
modified block around building app_slug and filename is not rustfmt-compliant;
run `cargo fmt` (and then `cargo check`) and commit the reformatted changes so
the `app_name` handling and the `filename` formatting match project style,
ensuring the closure using `.chars().map(...)` and the `format!` call in the
function that constructs `app_slug`/`filename` are reformatted by rustfmt.
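
For reference, the slug logic from the flagged block can be pulled into a standalone function with the closure written in the multi-line if/else form that the CI message describes. This is a sketch for illustration: `screenshot_filename` is a hypothetical helper name and the `u64` timestamp type is an assumption, since the PR only shows `frame.captured_at_ms` without its type.

```rust
/// Builds `{captured_at_ms}_{app_slug}.png`, replacing every character that is
/// neither alphanumeric nor '-' with '_'. Falls back to "unknown" when the
/// app name is absent.
fn screenshot_filename(captured_at_ms: u64, app_name: Option<&str>) -> String {
    let app_slug = app_name
        .unwrap_or("unknown")
        .chars()
        .map(|c| {
            if c.is_alphanumeric() || c == '-' {
                c
            } else {
                '_'
            }
        })
        .collect::<String>();
    format!("{}_{}.png", captured_at_ms, app_slug)
}
```

Because path separators and dots are mapped to `_`, an app name such as `a/b.c-d` cannot escape the screenshots directory or change the file extension.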

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@senamakel senamakel merged commit bb1f2c3 into tinyhumansai:main Apr 6, 2026
5 of 9 checks passed


1 participant