feat(planner): read-only sandbox gate on composio meta-tools#904
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 6 minutes and 49 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughWalkthroughThis PR implements per-agent sandbox mode enforcement through a task-local context system. It removes an unused dependency, expands the planner's Composio tool configuration, and adds infrastructure to read per-agent sandbox modes from the registry and enforce read-only restrictions for Composio actions at runtime. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Bus as Bus Handler
participant Registry as AgentDefinition<br/>Registry
participant Context as Sandbox<br/>Context
participant ToolLoop as Tool-Call<br/>Loop
participant Composio as Composio<br/>Tools
Client->>Bus: run_turn(target_agent_id)
Bus->>Registry: resolve agent definition
Registry-->>Bus: SandboxMode
Bus->>Context: with_current_sandbox_mode(mode)
Context-->>Bus: Enter scope
Bus->>ToolLoop: execute tool calls
ToolLoop->>Composio: composio_execute(action)
Composio->>Context: current_sandbox_mode()
Context-->>Composio: ReadOnly?
alt ReadOnly Mode
Composio->>Composio: resolve action to ToolScope
alt Write/Admin Scope
Composio-->>ToolLoop: reject with error
else Read Scope
Composio->>Composio: proceed to execution
Composio-->>ToolLoop: result
end
else Other Modes
Composio->>Composio: proceed to execution
Composio-->>ToolLoop: result
end
ToolLoop-->>Bus: completion
Context-->>Bus: Exit scope
Bus-->>Client: turn result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/composio/tools.rs`:
- Around line 462-488: ComposioActionTool::execute bypasses the agent-level
read-only sandbox; add the same sandbox gate as in tools.rs by calling
current_sandbox_mode() at the start of ComposioActionTool::execute, and if
Some(SandboxMode::ReadOnly) resolve the tool scope (reuse resolve_action_scope
or the same logic) and refuse ToolScope::Write | ToolScope::Admin with a
ToolResult::error message matching the existing composio_execute block;
alternatively, if you prefer spawn-time enforcement, filter out
Write/Admin-scoped Composio actions where parent agent current_sandbox_mode() ==
Some(SandboxMode::ReadOnly) in subagent_runner/ops.rs before registering
per-action tools.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 089ed428-457b-4de6-9e3b-554e2f49c2ed
📒 Files selected for processing (7)
app/src-tauri/Cargo.tomlsrc/openhuman/agent/agents/planner/agent.tomlsrc/openhuman/agent/bus.rssrc/openhuman/agent/harness/mod.rssrc/openhuman/agent/harness/sandbox_context.rssrc/openhuman/agent/harness/subagent_runner/ops.rssrc/openhuman/composio/tools.rs
💤 Files with no reviewable changes (1)
- app/src-tauri/Cargo.toml
Addresses CodeRabbit feedback on tinyhumansai#904. The dispatcher path (`ComposioExecuteTool::execute`) gates Write/Admin slugs when the calling agent is `sandbox_mode = "read_only"`, but `ComposioActionTool::execute` — the per-action surface spawned dynamically under `integrations_agent` — called `client.execute_tool` directly, bypassing the gate. Safe today because `integrations_agent` is `sandbox_mode = "none"` and no read-only agent spawns per-action tools, but the invariant "read-only blocks all Composio writes" becomes implicit rather than enforced. This mirrors the gate on the per-action path so both routes to the backend honour the same rule. Touches `src/openhuman/composio/tools.rs` + `src/openhuman/composio/action_tool.rs`: - `resolve_action_scope` → `pub(super)` so sibling modules reuse the same scope resolution (curated catalog + `classify_unknown` fallback). - `ComposioActionTool::execute` → new gate block at the top that reads `current_sandbox_mode()`, resolves the action's scope, and returns a `ToolResult::error` + `tracing::info!` audit line when the mode is `ReadOnly` and the scope is `Write` / `Admin`. `SandboxMode::None` / `Sandboxed` and the unset task-local value pass through unchanged. Three new tests mirror the existing gate tests on `ComposioExecuteTool`: `ReadOnly + Write slug → blocked`, `ReadOnly + Admin slug → blocked`, `unset → gate is a no-op`.
…scope Adds the plumbing tinyhumansai#685 needs to gate tool behavior by the calling agent's `sandbox_mode` without touching the `Tool` trait signature. Touches `src/openhuman/agent/harness/mod.rs` + `src/openhuman/agent/harness/sandbox_context.rs` (new): - `tokio::task_local! CURRENT_AGENT_SANDBOX_MODE: SandboxMode` → scoped per agent turn. Sibling of the existing `PARENT_CONTEXT` / `FORK_CONTEXT` in `harness/fork_context.rs` but carries the *currently-executing* agent's mode rather than the parent's. - `current_sandbox_mode()` → reader. Returns `None` outside any scope (direct CLI / JSON-RPC / unit-test tool invocation) so callers can treat the addition as strictly additive. - `with_current_sandbox_mode(mode, future)` → setter wrapping the `tokio::task_local!` scope. Intended call site is the tool loop immediately around each `tool.execute(args)`. - re-exports from `harness/mod.rs` next to the existing fork-context re-exports. Four unit tests cover the four paths: unset → `None`; scoped → mode observed inside; no leak after scope ends; nested scope overrides outer. No behavior change yet — the task-local is not read by any production code in this commit; wiring lands in the next one.
Wires the task-local added in the previous commit so every tool invoked inside an agent turn observes the calling agent's `sandbox_mode`. No tool reads the mode yet — this commit only sets it; enforcement lands next. Touches `src/openhuman/agent/bus.rs` + `src/openhuman/agent/harness/subagent_runner/ops.rs`: - `bus::register_agent_handlers` → resolves `target_agent_id` to an `AgentDefinition` via `AgentDefinitionRegistry::global()` and wraps the `run_tool_call_loop(...)` call in `with_current_sandbox_mode(sandbox_mode, …)`. Falls back to `SandboxMode::None` when no id is set or the registry isn't bootstrapped (legacy "generic unfiltered turn" + tests). - `subagent_runner::run_subagent` → wraps the fork-mode / typed-mode dispatch in `with_current_sandbox_mode(definition.sandbox_mode, …)` so sub-agents carry their own sandbox mode rather than inheriting the parent's. Task-local scope semantics (`tokio::task_local!`) means the mode propagates across every `.await` boundary inside the wrapped future, including `tool.execute` at `tool_loop.rs:545` and `subagent_runner/ops.rs:1025` — no per-call-site wrapping needed.
Core behaviour change for tinyhumansai#685 — layers an agent-level strict read-only check on top of the existing per-user scope preference so an agent declaring `sandbox_mode = "read_only"` in its `agent.toml` can never mutate user state via the composio surface, regardless of what the user's scope pref allows. Touches `src/openhuman/composio/tools.rs`: - new `resolve_action_scope(slug)` → prefers the toolkit's curated catalog (`find_curated`), falls back to the existing `classify_unknown` heuristic, defaults unparseable slugs to `Write` so unknown slugs fail-closed under the gate. - `ComposioExecuteTool::execute` → reads `current_sandbox_mode()` (task-local from the previous commit); when it's `SandboxMode::ReadOnly` and the resolved scope is `Write` or `Admin`, returns a `ToolResult::error` explaining the refusal and emits a `tracing::info!` audit line. `SandboxMode::None` / `Sandboxed` and the `None` task-local value (direct CLI / JSON-RPC / tests) pass through unchanged — strictly additive. - ordering → sandbox gate runs *before* `evaluate_tool_visibility` so a read-only agent hits the cheapest, most definitive rejection first without waking up the user-scope lookup / curated-catalog path. Five unit tests cover: `ReadOnly + Write` slug → blocked, `ReadOnly + Admin` slug → blocked, `ReadOnly + Read` slug → passes through to downstream gates, task-local unset → gate is a no-op (back-compat), and `Sandboxed` → not read-only, no gate trigger.
Now that the sandbox gate (previous commit) refuses Write/Admin actions when the calling agent is `sandbox_mode = "read_only"`, the planner can safely carry the composio meta-tools on its whitelist — the agent-level gate guarantees the planner cannot send mail / create pages / mutate state via the composio surface, matching the issue's "strict read-only mode" requirement (tinyhumansai#685). Touches `src/openhuman/agent/agents/planner/agent.toml`: - `[tools] named` → add `composio_list_toolkits`, `composio_list_connections`, `composio_list_tools`, and `composio_execute` alongside the existing `file_read`/`memory_recall`/`web_search_tool` trio. - explanatory comment → documents the gate relationship so a future reader sees why `composio_execute` is on a read-only agent and understands the enforcement lives in `composio/tools.rs` rather than here. `agent.toml` is the only planner-facing change required; the gate in `composio_execute::execute` keys on `SandboxMode::ReadOnly` and applies uniformly to every `ReadOnly`-sandbox agent, so the planner is the first consumer and the other read-only agents (welcome, critic, morning_briefing, trigger_triage) inherit the same behaviour automatically if they ever pick up composio.
Addresses CodeRabbit feedback on tinyhumansai#904. The dispatcher path (`ComposioExecuteTool::execute`) gates Write/Admin slugs when the calling agent is `sandbox_mode = "read_only"`, but `ComposioActionTool::execute` — the per-action surface spawned dynamically under `integrations_agent` — called `client.execute_tool` directly, bypassing the gate. Safe today because `integrations_agent` is `sandbox_mode = "none"` and no read-only agent spawns per-action tools, but the invariant "read-only blocks all Composio writes" becomes implicit rather than enforced. This mirrors the gate on the per-action path so both routes to the backend honour the same rule. Touches `src/openhuman/composio/tools.rs` + `src/openhuman/composio/action_tool.rs`: - `resolve_action_scope` → `pub(super)` so sibling modules reuse the same scope resolution (curated catalog + `classify_unknown` fallback). - `ComposioActionTool::execute` → new gate block at the top that reads `current_sandbox_mode()`, resolves the action's scope, and returns a `ToolResult::error` + `tracing::info!` audit line when the mode is `ReadOnly` and the scope is `Write` / `Admin`. `SandboxMode::None` / `Sandboxed` and the unset task-local value pass through unchanged. Three new tests mirror the existing gate tests on `ComposioExecuteTool`: `ReadOnly + Write slug → blocked`, `ReadOnly + Admin slug → blocked`, `unset → gate is a no-op`.
Make the planner's read-only posture explicit at the agent-definition
layer so a future TOML edit that drops `sandbox_mode` cannot silently
turn the planner into a write-capable agent.
`src/openhuman/agent/agents/loader.rs`:
- Add `planner_is_read_only_with_composio_meta_tools` test alongside
the existing `critic_is_read_only` / `code_executor_is_sandboxed_*`
pattern. Asserts:
- `def.sandbox_mode == SandboxMode::ReadOnly` — the runtime hook the
`ComposioExecuteTool::execute` and `ComposioActionTool::execute`
gates key on. Without this, both gates' `current_sandbox_mode()`
check would return `None` and Write/Admin actions would slip
through.
- `def.tools` is `ToolScope::Named` and contains all four composio
meta-tools (`composio_list_toolkits`, `composio_list_connections`,
`composio_list_tools`, `composio_execute`) so the planner can
discover + execute Read-scoped actions while the gate filters out
everything mutating.
- Test docstring spells out the failure mode (planner becomes
write-capable) so a maintainer reading the test in isolation knows
why each assertion is load-bearing.
968696d to
13e842a
Compare
|
Rebased clean. Planner is sandboxed — |
|
@coderabbitai could you verify that the planner agent is the only one that runs in composeio tools in sandbox mode and point out the code for me that does that? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 29507
Agents with
|
| Agent | sandbox_mode |
How composio_execute is included |
Gate active? |
|---|---|---|---|
planner |
read_only |
named = [...] explicit |
✅ Write/Admin blocked |
integrations_agent |
none |
named = [...] explicit |
❌ ungated (by design — this is the actual execution agent) |
morning_briefing |
read_only |
wildcard = {} (picks up all tools) |
✅ Write/Admin blocked |
morning_briefing (src/openhuman/agent/agents/morning_briefing/agent.toml) uses wildcard = {} with sandbox_mode = "read_only", so it also runs composio_execute under the sandbox gate.
The enforcement chain (three pieces of code)
1. Per-agent declaration — sandbox_mode is set in each agent.toml:
src/openhuman/agent/agents/planner/agent.tomlLine 7:sandbox_mode = "read_only"src/openhuman/agent/agents/morning_briefing/agent.tomlLine 6:sandbox_mode = "read_only"
2. Task-local wiring — the declared mode is installed around every tool-call loop in two places:
-
src/openhuman/agent/harness/subagent_runner/ops.rsLine 74 (subagent path):let outcome = with_current_sandbox_mode(definition.sandbox_mode, async { // fork/typed dispatch runs here })
-
src/openhuman/agent/bus.rsLines 182–188 (bus/turn path):let sandbox_mode = target_agent_id ... .map(|def| def.sandbox_mode) ...; let text = with_current_sandbox_mode(sandbox_mode, async { run_tool_call_loop(...) })
3. The gate itself — src/openhuman/composio/tools.rs in ComposioExecuteTool::execute: reads current_sandbox_mode() from the task-local; if it is ReadOnly, it resolves the action slug to a ToolScope (curated catalog → classify_unknown fallback, defaulting unknown slugs to Write) and returns ToolResult::error for Write or Admin scopes before any backend dispatch occurs.
Summary
The read-only Composio gate applies to any agent whose agent.toml declares sandbox_mode = "read_only" — currently planner (explicit tools) and morning_briefing (via wildcard). integrations_agent deliberately runs without a sandbox since it is the designated write-execution agent downstream. The gate is not planner-specific; it's a general mechanism keyed off the task-local populated from whatever agent definition is active at runtime.
🧠 Learnings used
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 437
File: src/openhuman/subconscious/executor.rs:33-50
Timestamp: 2026-04-08T16:46:58.073Z
Learning: In `src/openhuman/subconscious/executor.rs`, `needs_tools(title: &str)` is a UX routing convenience only — it selects the execution path (local model vs. agentic-v1 full vs. analysis-only) based on task title heuristics. Write-action prevention is enforced at a deeper layer: (1) the analysis-only prompt sent to agentic-v1 explicitly instructs it not to execute write actions for read-only tasks, and (2) tool execution is controlled by the agent runtime. Do not flag `needs_tools()` as an insufficient write-authorization boundary.
|
Matches what i see. one thing coderabbit missed |
Summary
composio_list_toolkits/list_connections/list_tools/composio_execute) in its whitelist, butcomposio_executerefuses anyWrite/Admin-scoped action slug when the calling agent'ssandbox_modeisReadOnly.CURRENT_AGENT_SANDBOX_MODEtask-local (sibling of the existingPARENT_CONTEXT/FORK_CONTEXTinharness/fork_context.rs) scoped around everytool.executeso the gate has zero impact on theTooltrait signature and zero blast radius outside the agent runtime.ToolScopecatalog (find_curated+classify_unknownfallback) — no new classification rules, no new per-action metadata.chore(tauri)fix for an unrelated pre-existing dup-key inapp/src-tauri/Cargo.tomlthat was makingcargo fmt/cargo metadata/cargo checkabort on main. The fix is self-contained and can be dropped from this PR if maintainers would rather split it.Problem
#685 asks for the planner agent to use Composio tools in strict read-only mode. Two concrete things were missing:
evaluate_tool_visibilityincomposio_executegates by the user's scope preference, which is the user's per-toolkit choice. There was no way to say "this specific agent may never mutate state even if the user allows writes."sandbox_modewas dormant metadata —AgentDefinition::sandbox_modehas been on agent TOMLs for a while (planner, critic, welcome, morning_briefing, trigger_triage all declareread_only) but nothing in the runtime ever read it. The planner's own whitelist was also capped atfile_read / memory_recall / web_search_toolwith no composio access at all.Solution
Commits land in dependency order, each single-concern:
chore(tauri): drop duplicate base64 dep key— one-line prereq.base64 = "0.22"was declared twice inapp/src-tauri/Cargo.toml; removed the second occurrence socargo *parses.feat(agent): add CURRENT_AGENT_SANDBOX_MODE task-local for tool-exec scope— newsandbox_contextmodule withtokio::task_local!carrier,current_sandbox_mode()reader,with_current_sandbox_mode()scope setter, and four unit tests covering all four state transitions (unset / scoped / leak / nested override).feat(agent): scope sandbox_mode task-local around tool execution— wires the task-local at the two production call sites:agent::bus::register_agent_handlersresolvestarget_agent_id→AgentDefinitionRegistry::global().get(id)and wrapsrun_tool_call_loop, andsubagent_runner::run_subagentwraps its fork-mode / typed-mode dispatch in the sub-agent's owndefinition.sandbox_mode. Task-local scope semantics propagate across every.awaitboundary inside the wrapped future, sotool.executeat every depth observes the mode without per-call-site plumbing.feat(composio): reject Write/Admin actions under ReadOnly sandbox— newresolve_action_scopehelper (curated catalog +classify_unknownfallback, defaults unparseable slugs toWriteso the gate fails closed), plus the gate itself inComposioExecuteTool::executeruns beforeevaluate_tool_visibilityso a read-only agent takes the cheapest rejection path.ToolResult::errorwith a concrete "agent is read-only, action iswrite" message the model can read and recover from.tracing::info!audit line. Five unit tests coverReadOnly + Write / Admin → blocked,ReadOnly + Read → passes to downstream gates,task-local unset → no-op (back-compat), andSandboxed → not read-only, no trigger.feat(planner): allow composio meta-tools under read-only sandbox— extendsplanner/agent.toml's[tools] namedlist with the four composio meta-tools and adds an inline comment pointing at where enforcement actually lives.Submission Checklist
openhuman::agent::harness::sandbox_context::tests::*) and 5 on the gate (openhuman::composio::tools::tests::sandbox_*). All pass locally.cargo test --manifest-path Cargo.toml -p openhuman --libsuite: 5229 passed, 0 failed. Planner-definition parsing test (agents::loader) also green after theagent.tomlchange.current_sandbox_mode()returnsNoneoutside any scope (direct CLI / JSON-RPC / tests), so the gate is strictly additive; no existing path gets stricter without aReadOnlyagent actually invoking it.///headers on every new public item (CURRENT_AGENT_SANDBOX_MODE,current_sandbox_mode,with_current_sandbox_mode,resolve_action_scope) plus a file-level//!rationale onsandbox_context.rs. Inline comments at every gate site + inplanner/agent.tomlpointing future readers at the enforcement location.tracing::info!audit line in the rejection path following the existing[composio][scopes]/[composio][sandbox]prefix convention.Impact
ReadOnly-sandbox agents that invokecomposio_executewith aWrite- orAdmin-scoped slug now get aToolResult::errorinstead of executing. Today the only such agent withcomposio_executein its whitelist is the planner (added in this PR). OtherReadOnlyagents (welcome / critic / morning_briefing / trigger_triage) don't have composio access yet, so they see zero change — but they'd inherit the same enforcement for free if anyone adds composio to their TOML later.SandboxMode::NoneandSandboxMode::Sandboxedpass through unchanged. The task-local beingNone(unit tests, CLI, direct JSON-RPC dispatch) is explicitly treated as "no gate" — tests hit this path regularly and stay green.ComposioActionTool(per-action tools spawned dynamically underintegrations_agent): unaffected.integrations_agentissandbox_mode = "none", so the task-local carriesNoneand the per-action path is not gated — correct by design sinceintegrations_agentis specifically the write-capable dispatcher.task_local::try_with+ one scope classification percomposio_executecall on the gated path. Zero cost on all other paths.Related
Closes #685
release.yml/ CI changes.file_write,memory_store,cron_create, …) is intentionally out of scope; happy to take that as a follow-up if you want to generalise.Summary by CodeRabbit
New Features
Chores