Skip to content

feat(agent): pure orchestrator pattern with per-skill delegation tools#496

Merged
senamakel merged 4 commits intotinyhumansai:mainfrom
sanil-23:feat/478-agent-taxonomy-prompt
Apr 10, 2026
Merged

feat(agent): pure orchestrator pattern with per-skill delegation tools#496
senamakel merged 4 commits intotinyhumansai:mainfrom
sanil-23:feat/478-agent-taxonomy-prompt

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented Apr 10, 2026

Summary

Refactors the main agent from a direct tool-calling model to a pure orchestrator that delegates all work through dynamically generated tools.

Token savings per LLM call

Source Before After Saved
TOOLS.md in system prompt ~4,500 tokens (12 Notion tools with params + examples) 0 (skipped) ~4,500
MEMORY.md in system prompt ~2,500 tokens (memory protocol, integration quirks, knowledge base) 0 (skipped) ~2,500
AGENTS.md in system prompt ~800 tokens (bloated routing tables, old agent descriptions) ~300 tokens (lean orchestrator prompt) ~500
HEARTBEAT.md in system prompt ~50 tokens 0 (skipped) ~50
Function-calling tool schemas ~4,000 tokens (52 tool specs: all system + 32 skill tools) ~800 tokens (8 orchestrator tools) ~3,200
Total per orchestrator LLM call ~10,750 tokens

This applies to every turn the orchestrator makes (typically 1-3 per user message). Sub-agent calls are unaffected — they already use narrow prompts.

Architecture: Orchestrator -> Sub-agent delegation

The orchestrator LLM now sees a flat list of well-named tools instead of a single spawn_subagent mega-tool:

Tool Delegates to Source
notion skills_agent (filter: notion) Dynamic — per installed skill
gmail skills_agent (filter: gmail) Dynamic — per installed skill
research researcher subagent Static archetype
run_code code_executor subagent Static archetype
review_code critic subagent Static archetype
plan planner subagent Static archetype
spawn_subagent Any (fallback) Advanced use (fork mode)

Each tool's execute() internally calls run_subagent with the correct definition + skill_filter.

Prompt architecture diagram

See docs/agent-prompt-architecture.excalidraw for the full visual breakdown of which .md files are used by which agent.

Orchestrator system prompt (from workspace files):

  • AGENTS.md — routing logic
  • SOUL.md — personality/voice
  • IDENTITY.md — mission/values
  • USER.md — user adaptation
  • BOOTSTRAP.md — first interaction flow
  • TOOLS.md — skipped (subagents get tool specs from registry)
  • MEMORY.md — skipped (memory auto-injected)
  • HEARTBEAT.md — skipped (handled by cron system)

Sub-agents get only: their archetype .md prompt + filtered ToolSpec schemas + auto-forwarded memory context.

Changes

  • orchestrator_tools.rs (new) — generates SkillDelegationTool per installed skill + ArchetypeDelegationTool per archetype at agent build time
  • builder.rs — orchestrator uses generated tools; visible_tool_specs filters function-calling schema sent to provider
  • turn.rs[agent] tagged logs for tool responses, state transitions, delegation decisions; sends visible_tool_specs not full specs
  • types.rs — added visible_tool_specs and last_memory_context fields
  • definition.rsModelSpec::Hint resolves to {hint}-v1 (fixes hint:agentic -> agentic-v1)
  • archetypes.rs — orchestrator/planner use reasoning-v1; code_executor/tool_maker use coding-v1; all others use agentic-v1; archivist uses local-v1
  • subagent_runner.rs — auto-injects parent's memory context into subagent prompts
  • fork_context.rs — added memory_context to ParentExecutionContext
  • prompt.rs — orchestrator skips TOOLS/MEMORY/HEARTBEAT; workspace files auto-sync via builtin-hash mechanism
  • AGENTS.md — lean orchestrator prompt (no routing tables, tools are self-describing)
  • spawn_subagent.rs — fixed validate_skill_filter to use skill_id from engine tuple (was splitting on __ in raw tool name, always failed)
  • ops.rsMAIN_AGENT_TOOL_ALLOWLIST legacy constant, orchestrator tools are now dynamic

Bug fixes

  • ModelSpec::Hint("agentic") now resolves to agentic-v1 instead of hint:agentic (backend rejected the old format)
  • validate_skill_filter now correctly matches skill IDs (was always returning empty available list)
  • Memory context forwarded from orchestrator to subagents via ParentExecutionContext.memory_context
  • Workspace .md files auto-sync when compiled-in defaults change (hash-based migration)

Test plan

  • cargo check — compiles clean (14 pre-existing warnings)
  • cargo test --lib — 2303 passed, 0 new failures (11 pre-existing)
  • Router tests pass — model_spec_resolve_hint_appends_v1, resolve_model_default
  • Prompt tests pass — identity_section_creates_missing_workspace_files, identity_section_with_aieos_includes_workspace_files
  • Manual RPC test: orchestrator correctly routes gmail -> skills_agent(gmail)
  • Manual RPC test: simple Q&A answered directly (no delegation)
  • Manual RPC test: research tool delegates to researcher subagent
  • Skills verified running via openhuman.skills_call_tool (gmail get-profile returned user data)
  • Full E2E with desktop app (yarn core:stage && yarn tauri dev) — skills loaded + orchestrator routing end-to-end

Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Orchestrator architecture with delegated sub-agent execution and dynamic skill-based tools
    • Curated tool visibility/filtering for the main agent and a fallback spawn-subagent tool
    • Per-turn memory context forwarded into sub-agent execution
    • Workspace file sync using builtin hashes to preserve user edits
  • Documentation

    • Rewrote agent role docs for orchestrator model and added an Excalidraw architecture diagram
  • Bug Fixes

    • Normalized model hint resolution format

sanil-23 and others added 2 commits April 10, 2026 22:04
tinyhumansai#478)

Refactors the main agent from a direct tool-calling model to a pure
orchestrator that delegates all work through dynamically generated tools.

Architecture changes:
- Orchestrator only sees generated tools (notion, gmail, research,
  run_code, review_code, plan, spawn_subagent) — skill tools are
  architecturally unreachable from the main agent
- Each installed skill auto-generates a delegation tool at build time
  (SkillDelegationTool) that routes to skills_agent with the correct
  skill_filter
- Static archetype tools (research, run_code, etc.) delegate to their
  respective sub-agents
- visible_tool_specs filters the function-calling schema sent to the
  provider, enforcing the orchestrator boundary at the API level

Prompt changes:
- Rewrote AGENTS.md as a lean orchestrator prompt — no more routing
  tables or agent_id instructions
- Orchestrator skips TOOLS.md, MEMORY.md, HEARTBEAT.md (~6k tokens
  saved per turn) — subagents get tool specs from the registry
- Workspace .md files auto-sync via builtin-hash mechanism so prompt
  updates ship automatically to existing installs

Bug fixes:
- ModelSpec::Hint now resolves to {hint}-v1 (e.g. agentic-v1) instead
  of hint:agentic which the backend rejected
- validate_skill_filter now uses skill_id from the engine tuple instead
  of splitting on __ in the raw tool name (which always failed)
- Memory context forwarded to subagents via ParentExecutionContext

Observability:
- Added [agent] tagged logs for tool responses, agent state transitions,
  and delegation decisions throughout turn.rs

See docs/agent-prompt-architecture.excalidraw for the visual diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 10, 2026

📝 Walkthrough

Walkthrough

Introduces an orchestrator agent architecture: an allowlist to restrict tools visible to the main agent, new orchestrator tools (skill/archetype delegators + spawn_subagent), forwarding per-turn memory context into subagents, prompt rendering changes for orchestrator mode, workspace builtin-file sync, and model-hint resolution updates.

Changes

Cohort / File(s) Summary
Documentation & Architecture
docs/agent-prompt-architecture.excalidraw, src/openhuman/agent/prompts/AGENTS.md
Adds Excalidraw architecture diagram and rewrites AGENTS.md to define an Orchestrator role that delegates to subagents/tools and documents prompt contracts.
Agent Core Types & Builder
src/openhuman/agent/agent/types.rs, src/openhuman/agent/agent/builder.rs
Adds visible_tool_specs, visible_tool_names, and last_memory_context to Agent; new AgentBuilder::visible_tool_names to configure visible tool allowlist.
Agent Execution & Turn Logic
src/openhuman/agent/agent/turn.rs
Uses visible_tool_specs for function-calling, persists last_memory_context, wires memory and visible-tool info into subagent contexts, blocks calls to non-visible tools, and adds detailed logging.
Prompt Assembly & Workspace Sync
src/openhuman/agent/prompt.rs
Adds visible_tool_names to PromptContext; orchestrator mode omits TOOLS/HEARTBEAT/MEMORY files from prompt, filters tools/skills, and replaces inject-if-missing with sync_workspace_file using builtin-hash sidecars.
Orchestrator Tools & Spawn
src/openhuman/tools/orchestrator_tools.rs, src/openhuman/tools/spawn_subagent.rs, src/openhuman/tools/ops.rs, src/openhuman/tools/mod.rs
New collect_orchestrator_tools() builds skill-delegation and archetype-delegation tools plus SpawnSubagentTool fallback; adds helper exports and revised skill-filter validation.
Subagent Context & Runner
src/openhuman/agent/harness/fork_context.rs, src/openhuman/agent/harness/subagent_runner.rs
Adds memory_context to ParentExecutionContext; subagent runner merges parent memory context with task context when composing user messages.
Model / Archetype Resolution
src/openhuman/agent/harness/definition.rs, src/openhuman/agent/harness/executor.rs, src/openhuman/agent/harness/archetypes.rs
Changes hint resolution from hint:{hint}{hint}-v1; updates tests and changes Critic archetype default hint to agentic.
Tests & Fixtures
src/...
Updated tests/fixtures to include visible_tool_names and builtin-hash files so workspace sync is deterministic.

Sequence Diagram

sequenceDiagram
    actor User
    participant Orchestrator as Orchestrator Agent
    participant LLM as LLM / Provider
    participant ToolRegistry as Tool Registry
    participant DelegatingTool as Delegating Tool
    participant SubAgent as Sub-Agent

    User->>Orchestrator: user message + context
    Orchestrator->>ToolRegistry: collect_orchestrator_tools()
    ToolRegistry-->>Orchestrator: orchestrator-visible tools
    Orchestrator->>LLM: call with orchestrator system prompt (omit TOOLS/HEARTBEAT/MEMORY) + visible tools
    LLM-->>Orchestrator: returns tool call (delegating tool)
    Orchestrator->>DelegatingTool: execute(tool args)
    DelegatingTool->>Orchestrator: dispatch_subagent(event + options)
    DelegatingTool->>SubAgent: spawn with ParentExecutionContext (includes memory_context + filtered tools)
    SubAgent->>LLM: run subagent (archetype prompt + forwarded context)
    SubAgent-->>DelegatingTool: subagent result
    DelegatingTool-->>Orchestrator: tool result
    Orchestrator->>User: aggregated response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • senamakel

Poem

🐰 I hop with prompts tucked in my paw,

Orchestrator whispers, “Delegate, dear.”
Subagents spring where edges draw,
Memory carried, tools kept near,
A rabbit cheers — the flow is clear!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(agent): pure orchestrator pattern with per-skill delegation tools' accurately and concisely describes the primary architectural change: refactoring the agent to use a pure orchestrator pattern with per-skill delegation tools instead of direct tool-calling.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/agent/agent/turn.rs (1)

623-626: ⚠️ Potential issue | 🔴 Critical

Fork mode should replay visible_tool_specs, not the full registry.

Line 625 captures self.tool_specs, but the parent turn now sends self.visible_tool_specs. In orchestrator mode, a forked child will regain hidden tools and stop being an exact replay of the parent turn.

🪞 Suggested fix
         harness::ForkContext {
             system_prompt: Arc::new(system_prompt),
-            tool_specs: Arc::clone(&self.tool_specs),
+            tool_specs: Arc::clone(&self.visible_tool_specs),
             message_prefix: Arc::new(messages),
             cache_boundary: None,
             fork_task_prompt,
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/turn.rs` around lines 623 - 626, The ForkContext
construction is capturing the full tool registry via self.tool_specs instead of
the parent-turn visible subset; replace the reference to self.tool_specs with
the parent-provided visible set (self.visible_tool_specs) so the fork replays
exactly the parent's visible tools — e.g., in the harness::ForkContext
initializer use Arc::clone(&self.visible_tool_specs) (or
Arc::new(self.visible_tool_specs) as appropriate) instead of
Arc::clone(&self.tool_specs) while leaving system_prompt and message_prefix
unchanged.
🧹 Nitpick comments (2)
src/openhuman/agent/harness/subagent_runner.rs (1)

245-256: Skip blank context parts before rendering the [Context] block.

Right now, whitespace-only values can still trigger a noisy context wrapper. Filter empties before join.

♻️ Suggested refinement
-    let mut context_parts: Vec<&str> = Vec::new();
-    if let Some(ref mem_ctx) = parent.memory_context {
-        context_parts.push(mem_ctx);
-    }
-    if let Some(ref ctx) = options.context {
-        context_parts.push(ctx);
-    }
+    let mut context_parts: Vec<&str> = Vec::new();
+    if let Some(ref mem_ctx) = parent.memory_context {
+        if !mem_ctx.trim().is_empty() {
+            context_parts.push(mem_ctx);
+        }
+    }
+    if let Some(ref ctx) = options.context {
+        if !ctx.trim().is_empty() {
+            context_parts.push(ctx);
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/subagent_runner.rs` around lines 245 - 256, The
current construction of context_parts (used to build user_message) can include
whitespace-only strings from parent.memory_context or options.context which
still trigger the [Context] block; update the logic that populates context_parts
(the block referencing parent.memory_context and options.context) to trim and
skip any entries where .trim().is_empty() so only non-empty context pieces are
pushed, then leave the rest of the user_message construction (including
task_prompt) unchanged.
src/openhuman/tools/ops.rs (1)

259-263: Add a focused unit test for main_agent_tools filtering.

Given this is a compatibility path, a tiny regression test would prevent silent visibility drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/tools/ops.rs` around lines 259 - 263, Add a focused unit test
for the main_agent_tools function to guard the filtering behavior: create test
helpers that implement the Tool trait (e.g., dummy tools with distinct name()
values), call main_agent_tools with a Vec of allowed and disallowed dummy tools,
and assert the returned Vec only contains tools whose name() is listed in
MAIN_AGENT_TOOL_ALLOWLIST and preserves expected count/order. Use the same
module path as ops.rs tests and include edge cases like empty input and
all-disallowed input to catch regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/agent/agent/builder.rs`:
- Around line 444-455: The orchestrator-generated tools (from
tools::orchestrator_tools::collect_orchestrator_tools()) can duplicate entries
already registered by all_tools_with_runtime() (notably "spawn_subagent"), so
before calling tools.extend(orchestrator_tools) filter out any orchestrator tool
whose name already exists in the current registry; obtain the existing names
from the current tools collection (or build a HashSet of tool.name() for tools)
and only extend with orchestrator tools whose name is not in that set, keeping
the visible HashSet consistent and preventing duplicate "spawn_subagent" specs
that confuse build() and dispatch.

In `@src/openhuman/agent/agent/turn.rs`:
- Around line 221-223: The code currently only hides tools from the outbound
schema (using self.tool_dispatcher.should_send_tool_specs() and
self.visible_tool_specs) but still resolves/executes against the full self.tools
set later, allowing hidden tools to be invoked; update the dispatch/resolution
path (the code that resolves parser-driven calls around the dispatch/resolve
block that currently references self.tools, lines ~474-491) to enforce the
visible-tool allowlist by filtering self.tools to only entries whose names
appear in self.visible_tool_specs (or by checking membership in that allowlist
before any resolution/execution) so that hidden tools cannot be dispatched even
if requested.
- Around line 373-381: The loop that logs tool results currently includes
raw/sliced output (using truncate_with_ellipsis(&r.output, 300)) which can leak
secrets; replace that with metadata-only logging and a sanitized summary
instead: remove direct use of r.output in the log and call a
sanitizer/summarizer (e.g., create or use a function like sanitize_tool_output
or summarize_tool_output) to produce a redacted one-line summary, then log
r.name, r.success, r.output.chars().count() or a content hash, and the sanitized
summary; apply the same change for the other occurrence referenced around lines
520-530 so no raw tool payload is written to logs.

In `@src/openhuman/agent/harness/definition.rs`:
- Around line 198-207: The ModelSpec::resolve implementation currently maps
ModelSpec::Hint to "{hint}-v1", which breaks the router/consumer hint contract;
change the Hint arm in the resolve method (enum ModelSpec, function resolve) to
return the consumer-compatible form "hint:{hint}" (e.g., format!("hint:{}",
hint)) so hint-based route lookups and hint stripping continue to work and do
not bypass model_routes; update the corresponding duplicate at the other
occurrence (around the other resolve usage) to the same "hint:{hint}" behavior.

In `@src/openhuman/agent/harness/executor.rs`:
- Line 567: The code returns a hardcoded string using
archetype.default_model_hint() (format!("{}-v1")) which bypasses the hint-based
router resolution; update the default branch in resolve_model to call the
existing hint-resolution path instead of constructing "{hint}-v1" directly—use
the same router/hint resolution logic the function uses for explicit hints
(refer to resolve_model and archetype.default_model_hint()) so deployments keyed
on hint:* go through the route-table lookup rather than returning a fabricated
model name.

In `@src/openhuman/agent/prompt.rs`:
- Around line 199-227: The orchestrator branch currently omits TOOLS.md,
HEARTBEAT.md, and MEMORY.md from files passed to inject_workspace_file which
also performs sync via sync_workspace_file, so those on-disk files stop being
updated; modify the logic in the block around is_orchestrator (and where
inject_workspace_file and sync_workspace_file are used) to decouple syncing from
prompt injection — keep the shorter files list for prompt rendering when
is_orchestrator is true but still call sync_workspace_file (or a small helper
like sync_workspace_files) for "TOOLS.md", "HEARTBEAT.md", and "MEMORY.md"
(using ctx.workspace_dir and the same parameters as inject_workspace_file) so
the files remain created/updated even when they are not injected into the
prompt.

In `@src/openhuman/tools/ops.rs`:
- Around line 251-253: The const declaration MAIN_AGENT_TOOL_ALLOWLIST is
failing CI formatting; run the Rust formatter (cargo fmt or rustfmt) on the file
or project and reformat the declaration for MAIN_AGENT_TOOL_ALLOWLIST so it
matches the project's rustfmt style (proper spacing/line breaks and trailing
commas as required), then commit the formatted change.

---

Outside diff comments:
In `@src/openhuman/agent/agent/turn.rs`:
- Around line 623-626: The ForkContext construction is capturing the full tool
registry via self.tool_specs instead of the parent-turn visible subset; replace
the reference to self.tool_specs with the parent-provided visible set
(self.visible_tool_specs) so the fork replays exactly the parent's visible tools
— e.g., in the harness::ForkContext initializer use
Arc::clone(&self.visible_tool_specs) (or Arc::new(self.visible_tool_specs) as
appropriate) instead of Arc::clone(&self.tool_specs) while leaving system_prompt
and message_prefix unchanged.

---

Nitpick comments:
In `@src/openhuman/agent/harness/subagent_runner.rs`:
- Around line 245-256: The current construction of context_parts (used to build
user_message) can include whitespace-only strings from parent.memory_context or
options.context which still trigger the [Context] block; update the logic that
populates context_parts (the block referencing parent.memory_context and
options.context) to trim and skip any entries where .trim().is_empty() so only
non-empty context pieces are pushed, then leave the rest of the user_message
construction (including task_prompt) unchanged.

In `@src/openhuman/tools/ops.rs`:
- Around line 259-263: Add a focused unit test for the main_agent_tools function
to guard the filtering behavior: create test helpers that implement the Tool
trait (e.g., dummy tools with distinct name() values), call main_agent_tools
with a Vec of allowed and disallowed dummy tools, and assert the returned Vec
only contains tools whose name() is listed in MAIN_AGENT_TOOL_ALLOWLIST and
preserves expected count/order. Use the same module path as ops.rs tests and
include edge cases like empty input and all-disallowed input to catch
regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b7c6a96c-b54e-4344-a682-50288965229a

📥 Commits

Reviewing files that changed from the base of the PR and between b8ded44 and 52729e0.

📒 Files selected for processing (15)
  • docs/agent-prompt-architecture.excalidraw
  • src/openhuman/agent/agent/builder.rs
  • src/openhuman/agent/agent/turn.rs
  • src/openhuman/agent/agent/types.rs
  • src/openhuman/agent/harness/archetypes.rs
  • src/openhuman/agent/harness/definition.rs
  • src/openhuman/agent/harness/executor.rs
  • src/openhuman/agent/harness/fork_context.rs
  • src/openhuman/agent/harness/subagent_runner.rs
  • src/openhuman/agent/prompt.rs
  • src/openhuman/agent/prompts/AGENTS.md
  • src/openhuman/tools/mod.rs
  • src/openhuman/tools/ops.rs
  • src/openhuman/tools/orchestrator_tools.rs
  • src/openhuman/tools/spawn_subagent.rs

Comment thread src/openhuman/agent/agent/builder.rs
Comment thread src/openhuman/agent/agent/turn.rs
Comment thread src/openhuman/agent/agent/turn.rs
Comment on lines +198 to +207
/// Hints are resolved to `{hint}-v1` (e.g. `"agentic"` → `"agentic-v1"`)
/// which matches the backend's standard model naming convention. When
/// a `RouterProvider` is present its route table takes priority over
/// this default; when no router is configured (empty `model_routes`)
/// the resolved name goes directly to the backend.
pub fn resolve(&self, parent_model: &str) -> String {
match self {
Self::Inherit => parent_model.to_string(),
Self::Exact(name) => name.clone(),
Self::Hint(hint) => format!("hint:{hint}"),
Self::Hint(hint) => format!("{hint}-v1"),
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

ModelSpec::Hint no longer matches router/consumer hint contract.

At Line 207, resolving hints as "{hint}-v1" breaks consumers that still expect hint:{hint} (e.g., route lookup and hint stripping paths). This makes the doc at Line 200-202 inaccurate and can bypass configured model routes.

🔧 Proposed compatibility fix
-            Self::Hint(hint) => format!("{hint}-v1"),
+            Self::Hint(hint) => format!("hint:{hint}"),

Also applies to: 470-472

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/definition.rs` around lines 198 - 207, The
ModelSpec::resolve implementation currently maps ModelSpec::Hint to "{hint}-v1",
which breaks the router/consumer hint contract; change the Hint arm in the
resolve method (enum ModelSpec, function resolve) to return the
consumer-compatible form "hint:{hint}" (e.g., format!("hint:{}", hint)) so
hint-based route lookups and hint stripping continue to work and do not bypass
model_routes; update the corresponding duplicate at the other occurrence (around
the other resolve usage) to the same "hint:{hint}" behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional — this is a bug fix, not a regression. The hint:agentic format was being sent directly to the backend when model_routes is empty (most deployments), and the backend rejects it with Model 'hint:agentic' is not available. The RouterProvider::resolve() that strips the hint: prefix is only active when model_routes is populated. Since the backend model registry uses agentic-v1, reasoning-v1, coding-v1 naming, resolving directly to {hint}-v1 works in both cases (with and without router).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanil-23, thanks for the detailed explanation — that makes complete sense. My concern was unfounded. The old hint:{hint} format was only safe when RouterProvider with a populated model_routes was in the path; without it, the raw string hit the backend and was rejected. Resolving directly to {hint}-v1 is correct for both the routed and non-routed cases, and the updated doc comment accurately captures that behaviour. Apologies for the noise.


✏️ Learnings added
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 496
File: src/openhuman/agent/harness/definition.rs:198-207
Timestamp: 2026-04-10T18:46:46.175Z
Learning: In `src/openhuman/agent/harness/definition.rs`, `ModelSpec::Hint(hint)` intentionally resolves to `"{hint}-v1"` (e.g., `"agentic-v1"`, `"reasoning-v1"`, `"coding-v1"`). The old `"hint:{hint}"` format was a bug: it was sent verbatim to the backend when `model_routes` is empty (the common deployment), causing backend rejections (`Model 'hint:...' is not available`). The `hint:` prefix stripping in `RouterProvider::resolve()` is only active when `model_routes` is populated. Do not flag the `{hint}-v1` resolution as a router contract violation.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-09T22:52:14.248Z
Learning: New or changed behavior must ship with matching documentation. Add concise rustdoc / code comments where the flow is not obvious, and update `AGENTS.md`, architecture docs, or feature docs when repository rules or user-visible behavior change.

Learnt from: graycyrus
Repo: tinyhumansai/openhuman PR: 283
File: app/test/e2e/specs/skill-execution-flow.spec.ts:67-114
Timestamp: 2026-04-03T06:28:58.632Z
Learning: In `app/test/e2e/specs/skill-execution-flow.spec.ts`, the model→agent→tool→conversation E2E path (model-triggered tool_calls) is intentionally deferred to a future PR. It is tracked via an `it.skip` placeholder at the end of the spec. The blocker is that the mock backend does not yet support returning structured `tool_calls` in chat completion responses. The current spec only covers the deterministic skill runtime RPC surface: `skills_start` → `skills_list_tools` → `skills_call_tool` → `skills_stop`.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 437
File: src/openhuman/subconscious/executor.rs:33-50
Timestamp: 2026-04-08T16:46:58.073Z
Learning: In `src/openhuman/subconscious/executor.rs`, `needs_tools(title: &str)` is a UX routing convenience only — it selects the execution path (local model vs. agentic-v1 full vs. analysis-only) based on task title heuristics. Write-action prevention is enforced at a deeper layer: (1) the analysis-only prompt sent to agentic-v1 explicitly instructs it not to execute write actions for read-only tasks, and (2) tool execution is controlled by the agent runtime. Do not flag `needs_tools()` as an insufficient write-authorization boundary.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When using the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, don’t require individual execution-provider feature flags (e.g., `directml`, `coreml`, `cuda`) alongside `load-dynamic` to get EP registration code. The `ort` crate already compiles EP registration via `#[cfg(any(feature = "load-dynamic", feature = "<ep_name>"))]` guards, and adding per-EP feature flags can pull in static-linking dependencies that conflict with the dynamic loading approach. At runtime, EP availability is determined by what the dynamically loaded ONNX Runtime library (`onnxruntime.dll`/`.so`/`.dylib`) supports; ort docs indicate providers like `directml`/`xnnpack`/`coreml` are available in builds when the platform supports them.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When integrating the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, do NOT also require/enable individual provider EP Cargo features like `directml`, `coreml`, or `cuda`. In `ort` v2.x, EP registration for providers (e.g., DirectML, CoreML, CUDA, etc.) is already compiled in under source-level `#[cfg(any(feature = "load-dynamic", feature = "<provider>"))]` guards, such as in `ep/directml.rs`. Adding provider feature flags alongside `load-dynamic` can pull in static-linking dependencies that conflict with the dynamic-loading approach. Provider availability should be treated as runtime-determined by what the loaded `onnxruntime` library (`onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`) actually supports.

}
}
format!("hint:{}", archetype.default_model_hint())
format!("{}-v1", archetype.default_model_hint())
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

resolve_model default path bypasses hint-based router resolution.

At Line 567, returning "{hint}-v1" directly can skip route-table hint resolution in deployments keyed on hint:*, and the updated tests lock in that incompatible behavior.

🔧 Proposed fix
-    format!("{}-v1", archetype.default_model_hint())
+    format!("hint:{}", archetype.default_model_hint())

Also applies to: 623-632

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/executor.rs` at line 567, The code returns a
hardcoded string using archetype.default_model_hint() (format!("{}-v1")) which
bypasses the hint-based router resolution; update the default branch in
resolve_model to call the existing hint-resolution path instead of constructing
"{hint}-v1" directly—use the same router/hint resolution logic the function uses
for explicit hints (refer to resolve_model and archetype.default_model_hint())
so deployments keyed on hint:* go through the route-table lookup rather than
returning a fabricated model name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same rationale as the ModelSpec::Hint comment — the executor resolve_model path mirrors the same fix. When model_routes is empty (no RouterProvider), the hint: prefix goes directly to the backend which rejects it. {hint}-v1 works in both configurations.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanil-23, thanks for the clarification — that makes sense. When no RouterProvider is in place the hint: prefix is passed verbatim to the backend and gets rejected, whereas {hint}-v1 is a valid model identifier in both configurations. My concern was misplaced; the fix is correct as implemented.


✏️ Learnings added
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 496
File: src/openhuman/agent/harness/executor.rs:567-567
Timestamp: 2026-04-10T18:46:53.247Z
Learning: In `src/openhuman/agent/harness/executor.rs` (and `src/openhuman/agent/harness/definition.rs`), the correct format for hint-based model resolution is `"{hint}-v1"` (e.g., `"coding-v1"`, `"reasoning-v1"`), NOT `"hint:{hint}"`. When no RouterProvider is configured, `hint:` is forwarded verbatim to the backend and rejected. `{hint}-v1` works in both the RouterProvider and direct-backend configurations. Do not flag `{hint}-v1` as bypassing hint resolution.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 437
File: src/openhuman/subconscious/executor.rs:33-50
Timestamp: 2026-04-08T16:46:58.073Z
Learning: In `src/openhuman/subconscious/executor.rs`, `needs_tools(title: &str)` is a UX routing convenience only — it selects the execution path (local model vs. agentic-v1 full vs. analysis-only) based on task title heuristics. Write-action prevention is enforced at a deeper layer: (1) the analysis-only prompt sent to agentic-v1 explicitly instructs it not to execute write actions for read-only tasks, and (2) tool execution is controlled by the agent runtime. Do not flag `needs_tools()` as an insufficient write-authorization boundary.

Learnt from: graycyrus
Repo: tinyhumansai/openhuman PR: 283
File: app/test/e2e/specs/skill-execution-flow.spec.ts:67-114
Timestamp: 2026-04-03T06:28:58.632Z
Learning: In `app/test/e2e/specs/skill-execution-flow.spec.ts`, the model→agent→tool→conversation E2E path (model-triggered tool_calls) is intentionally deferred to a future PR. It is tracked via an `it.skip` placeholder at the end of the spec. The blocker is that the mock backend does not yet support returning structured `tool_calls` in chat completion responses. The current spec only covers the deterministic skill runtime RPC surface: `skills_start` → `skills_list_tools` → `skills_call_tool` → `skills_stop`.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When using the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, don’t require individual execution-provider feature flags (e.g., `directml`, `coreml`, `cuda`) alongside `load-dynamic` to get EP registration code. The `ort` crate already compiles EP registration via `#[cfg(any(feature = "load-dynamic", feature = "<ep_name>"))]` guards, and adding per-EP feature flags can pull in static-linking dependencies that conflict with the dynamic loading approach. At runtime, EP availability is determined by what the dynamically loaded ONNX Runtime library (`onnxruntime.dll`/`.so`/`.dylib`) supports; ort docs indicate providers like `directml`/`xnnpack`/`coreml` are available in builds when the platform supports them.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When integrating the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, do NOT also require/enable individual provider EP Cargo features like `directml`, `coreml`, or `cuda`. In `ort` v2.x, EP registration for providers (e.g., DirectML, CoreML, CUDA, etc.) is already compiled in under source-level `#[cfg(any(feature = "load-dynamic", feature = "<provider>"))]` guards, such as in `ep/directml.rs`. Adding provider feature flags alongside `load-dynamic` can pull in static-linking dependencies that conflict with the dynamic-loading approach. Provider availability should be treated as runtime-determined by what the loaded `onnxruntime` library (`onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`) actually supports.

Comment thread src/openhuman/agent/prompt.rs Outdated
Comment thread src/openhuman/tools/ops.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/openhuman/agent/agent/turn.rs (2)

371-379: ⚠️ Potential issue | 🟠 Major

Stop logging raw tool output previews.

Line 372 and Line 525 log direct tool payload content (truncated, but still sensitive). This can leak secrets/PII into normal logs.

🧹 Log metadata + sanitized summary only
                 for r in &results {
-                    let preview = truncate_with_ellipsis(&r.output, 300);
+                    let summary = hooks::sanitize_tool_output(&r.output, &r.name, r.success);
                     log::info!(
-                        "[agent] tool response name={} success={} output_chars={}\n{}",
+                        "[agent] tool response name={} success={} output_chars={} summary={}",
                         r.name,
                         r.success,
                         r.output.chars().count(),
-                        preview
+                        summary
                     );
                 }
-        log::debug!(
-            "[agent] tool output for {}: {}",
-            call.name,
-            truncate_with_ellipsis(&result, 500)
-        );
+        let output_summary = hooks::sanitize_tool_output(&result, &call.name, success);
+        log::debug!(
+            "[agent] tool output summary name={} success={} output_chars={} summary={}",
+            call.name,
+            success,
+            result.chars().count(),
+            output_summary
+        );
@@
-        let output_summary = hooks::sanitize_tool_output(&result, &call.name, success);
+        // reuse `output_summary` computed above

Also applies to: 524-528

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/turn.rs` around lines 371 - 379, The current log in
the loop over results prints raw/truncated tool output
(truncate_with_ellipsis(&r.output)) which can leak secrets; remove the raw
preview from the log and instead log only metadata (r.name, r.success,
r.output.chars().count()) plus a sanitized or redacted summary. Replace the use
of truncate_with_ellipsis(&r.output) in the log::info call with either a call to
a sanitization/summarization helper (e.g., sanitize_tool_output(&r.output) or
summarize_tool_output(&r.output)) that strips/obfuscates PII/secrets, or a fixed
placeholder like "<redacted>" if no sanitizer exists; update both occurrences
around the loop over results and any other log::info that currently uses
truncate_with_ellipsis so logs never contain raw tool payloads.

472-492: ⚠️ Potential issue | 🔴 Critical

Enforce allowlist during dispatch, not only in provider schema.

Line 472 still resolves against self.tools without checking visibility. A hidden tool name can still execute if emitted by the parser path, bypassing the orchestrator boundary.

🔒 Minimal guard
-        let (raw_result, success) =
-            if let Some(tool) = self.tools.iter().find(|t| t.name() == call.name) {
+        let (raw_result, success) = if !self.visible_tool_names.is_empty()
+            && !self.visible_tool_names.contains(&call.name)
+        {
+            (
+                format!("Tool '{}' is not available to this agent", call.name),
+                false,
+            )
+        } else if let Some(tool) = self.tools.iter().find(|t| t.name() == call.name) {
                 let exec = tool.execute(call.arguments.clone());
                 let outcome = if let Some(fork_ctx) = fork_context_for_call {
                     harness::with_fork_context(fork_ctx, exec).await
                 } else {
                     exec.await
                 };
                 match outcome {
                     Ok(r) => {
                         if !r.is_error {
                             (r.output(), true)
                         } else {
                             (format!("Error: {}", r.output()), false)
                         }
                     }
                     Err(e) => (format!("Error executing {}: {e}", call.name), false),
                 }
             } else {
                 (format!("Unknown tool: {}", call.name), false)
             };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/turn.rs` around lines 472 - 492, The dispatch
currently resolves against self.tools and executes any matching tool by name
(using tool.execute and call.name) without verifying visibility; add an
allowlist/visibility check on the discovered tool before running
harness::with_fork_context or exec.await (e.g., require
tool.is_visible()/tool.allowed_to_run() or equivalent) and if the check fails
return a denied/unknown-tool response (instead of executing) so hidden tools
cannot be invoked via the parser path; apply this check in the same matching
block that finds the tool (before calling tool.execute) and adjust the error
branch to reflect permission denied when visibility fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/openhuman/agent/agent/turn.rs`:
- Around line 371-379: The current log in the loop over results prints
raw/truncated tool output (truncate_with_ellipsis(&r.output)) which can leak
secrets; remove the raw preview from the log and instead log only metadata
(r.name, r.success, r.output.chars().count()) plus a sanitized or redacted
summary. Replace the use of truncate_with_ellipsis(&r.output) in the log::info
call with either a call to a sanitization/summarization helper (e.g.,
sanitize_tool_output(&r.output) or summarize_tool_output(&r.output)) that
strips/obfuscates PII/secrets, or a fixed placeholder like "<redacted>" if no
sanitizer exists; update both occurrences around the loop over results and any
other log::info that currently uses truncate_with_ellipsis so logs never contain
raw tool payloads.
- Around line 472-492: The dispatch currently resolves against self.tools and
executes any matching tool by name (using tool.execute and call.name) without
verifying visibility; add an allowlist/visibility check on the discovered tool
before running harness::with_fork_context or exec.await (e.g., require
tool.is_visible()/tool.allowed_to_run() or equivalent) and if the check fails
return a denied/unknown-tool response (instead of executing) so hidden tools
cannot be invoked via the parser path; apply this check in the same matching
block that finds the tool (before calling tool.execute) and adjust the error
branch to reflect permission denied when visibility fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2be3465c-ae6f-4362-908e-0004b97ef843

📥 Commits

Reviewing files that changed from the base of the PR and between 52729e0 and 95f914b.

📒 Files selected for processing (3)
  • src/openhuman/agent/agent/turn.rs
  • src/openhuman/agent/prompt.rs
  • src/openhuman/tools/ops.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/tools/ops.rs
  • src/openhuman/agent/prompt.rs

senamakel
senamakel previously approved these changes Apr 10, 2026
… sync

- Enforce visible-tool allowlist at dispatch time (not just schema)
- Fork mode uses visible_tool_specs (not full registry)
- De-duplicate spawn_subagent when extending orchestrator tools
- Raw tool output moved to debug level, info level logs metadata only
- Decouple workspace file sync from prompt rendering so skipped files
  still get synced to disk

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sanil-23
Copy link
Copy Markdown
Contributor Author

Re: fork mode tool_specs (outside-diff comment)

Fixed — build_fork_context now uses self.visible_tool_specs instead of the full registry, so fork mode replays exactly the parent's visible tool set. See commit 0989023.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/openhuman/agent/agent/builder.rs (1)

75-78: The rustdoc describes a None call path this setter doesn't expose.

visible_tool_names() only accepts a HashSet<String>, so callers cannot "Pass None" here. Reword this to "omit this setter" or "pass an empty set" so the public API docs match the actual behavior.

Based on learnings, "New or changed behavior must ship with matching documentation. Add concise rustdoc / code comments where the flow is not obvious, and update AGENTS.md, architecture docs, or feature docs when repository rules or user-visible behavior change."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/builder.rs` around lines 75 - 78, The docstring for
the builder method visible_tool_names is incorrect about accepting None; update
the rustdoc on the visible_tool_names(found in builder.rs) to state that callers
should either omit calling the setter to leave all tools visible or pass an
empty HashSet to hide all tools (instead of "Pass `None`"), and add a brief
clarifying comment in the visible_tool_names implementation explaining the
chosen convention; also update the consumer-facing docs (e.g., AGENTS.md or
architecture/feature docs) to reflect this behavior so public docs and code are
consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/agent/agent/builder.rs`:
- Around line 449-461: The current name-based de-dup allows a raw skill tool to
remain registered while the orchestrator delegator with the same name is
filtered out, yet `visible` still contains that name causing `build()` to
publish the wrong implementation; update the block that collects
`orchestrator_tools`/computes `visible`/`existing_names` (symbols:
collect_orchestrator_tools, visible, existing_names, tools.extend, build,
spawn_subagent) to detect name collisions and fail fast: compute the
intersection of orchestrator tool names and existing_names and if non-empty
return an Err or panic with a clear message (listing the colliding names) so the
PR must resolve ambiguous names before proceeding; alternatively remove collided
names from `visible` so they can't be published, but prefer the fail-fast
behavior until names are made unambiguous.

In `@src/openhuman/agent/prompt.rs`:
- Around line 383-399: The code currently uses
std::fs::read_to_string(&hash_path).unwrap_or_default(), which treats any read
error (including NotFound) as an empty hash and causes existing workspace files
(path) to be overwritten; change to explicitly match the read result: if
Ok(stored_hash) continue with the existing logic; if Err(e) and e.kind() ==
std::io::ErrorKind::NotFound then do NOT overwrite an existing path — if
path.exists() only write the sidecar/hash file with current_hash (do not write
default_content), and only write default_content when the workspace file itself
does not exist; for other read errors (Err but not NotFound) log a warning and
abort the write to avoid clobbering; update the logic around stored_hash,
current_hash, hash_path, and path to reflect this explicit match-based handling.

---

Nitpick comments:
In `@src/openhuman/agent/agent/builder.rs`:
- Around line 75-78: The docstring for the builder method visible_tool_names is
incorrect about accepting None; update the rustdoc on the
visible_tool_names(found in builder.rs) to state that callers should either omit
calling the setter to leave all tools visible or pass an empty HashSet to hide
all tools (instead of "Pass `None`"), and add a brief clarifying comment in the
visible_tool_names implementation explaining the chosen convention; also update
the consumer-facing docs (e.g., AGENTS.md or architecture/feature docs) to
reflect this behavior so public docs and code are consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fed364f4-d583-4c72-b845-b1c4c4304afc

📥 Commits

Reviewing files that changed from the base of the PR and between 95f914b and 0989023.

📒 Files selected for processing (3)
  • src/openhuman/agent/agent/builder.rs
  • src/openhuman/agent/agent/turn.rs
  • src/openhuman/agent/prompt.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/agent/agent/turn.rs

Comment thread src/openhuman/agent/agent/builder.rs
Comment thread src/openhuman/agent/prompt.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants