
feat(agent): sub-agents, reasoning→agentic routing, layered context pipeline#474

Merged
senamakel merged 14 commits into tinyhumansai:main from senamakel:feat/sub-agents
Apr 10, 2026

Conversation

@senamakel
Member

@senamakel commented Apr 10, 2026

Summary

This branch lands four connected pieces of agent-runtime work:

  1. Sub-agents as tools — spawn_subagent is wired into the parent loop end-to-end. A first-class AgentDefinition + AgentDefinitionRegistry replaces the stub, built-ins are derived from the eight AgentArchetypes, custom TOML definitions load from <workspace>/agents/*.toml, and a new subagent_runner executes both typed and fork-cache modes. Tool filtering supports category (system vs skill), skill prefix, and named allowlists — all composable.

  2. Reasoning → agentic routing + KV-cache stability — the main user-facing agent is now pinned to reasoning-v1 (decision/routing tier). When it spawns a sub-agent to execute tool calls, the sub-agent rides on hint:agentic → agentic-v1. Sub-agent output returns as a tool_result and the parent continues on reasoning. To preserve the backend's automatic prefix cache, Agent::turn no longer rebuilds the system prompt on subsequent turns and no longer runs per-message model classification; sub-agent system prompts no longer embed chrono::Local::now().

  3. Layered context pipeline — new agent/context_pipeline/ module implementing an ordered reduction chain:

    • Tool-result budget (cache-safe, inline) — caps oversized tool results at 16 KiB before they enter history.
    • Microcompact — clears the payload of older ToolResults envelopes with [Old tool result content cleared] while preserving the AssistantToolCalls ⇔ ToolResults API invariant.
    • Autocompact signal — the pipeline surfaces PipelineOutcome::AutocompactionRequested with a utilisation percentage when microcompact can't free enough.
    • Session memory (stage 5) — separate from compaction. After each turn, if token/tool-call/turn deltas cross their thresholds, the agent spawns a background archivist sub-agent that uses the update_memory_md tool to distil durable facts into the workspace MEMORY.md. Fire-and-forget on the agentic tier.
  4. agent.rs split — the single 1988-line agent.rs is now an agent/ folder with six focused files: types.rs (struct defs), builder.rs (builder API + from_config factory), turn.rs (turn lifecycle + tool dispatch + pipeline wiring), runtime.rs (accessors + run helpers + static helpers), tests.rs (integration tests), and mod.rs (glue + top-level run).
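To illustrate item 1, a custom definition under <workspace>/agents/ might look like the sketch below. The field names and values are assumptions for illustration — the actual AgentDefinition TOML schema is not shown in this PR description:

```toml
# Hypothetical <workspace>/agents/notion_specialist.toml — schema is illustrative.
id = "notion_specialist"
description = "Reads and updates Notion pages on behalf of the parent agent."
model_hint = "agentic"      # resolved via ModelSpec::Hint -> RouterProvider
mode = "typed"              # "typed" or "fork"

[tool_filter]
category = "skill"          # category filter: system vs skill
skill_prefix = "notion"     # skill-prefix filter, composable with the allowlist
allow = ["notion__search", "notion__update_page"]
```

The three filter axes (category, skill prefix, named allowlist) compose, per the description above.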

Test plan

  • cargo test --lib -p openhuman — 2314 passed / 0 failed (includes 28 new context_pipeline tests and a KV-cache stability regression test under agent::agent::tests::system_prompt_and_model_are_byte_stable_across_turns)
  • cargo test --test json_rpc_e2e — 14 passed / 0 failed
  • cargo fmt --check clean
  • cargo clippy clean on touched files (agent/agent/**, agent/context_pipeline/**, agent/harness/subagent_runner.rs, config/schema/{types,agent}.rs)
  • Manual smoke: cargo run --bin openhuman -- chat "research the ECS design pattern and summarise in 3 bullets" and confirm [context_pipeline], [session_memory], and [spawn_subagent] log lines appear as expected.

Notable decisions

  • Sub-agents route via ModelSpec::Hint(...) → RouterProvider → concrete backend tier — no hard-coded model names leak into archetypes.
  • The context pipeline is a pure state machine (no LLM calls); autocompaction is signalled to the caller rather than dispatched, keeping unit tests provider-free.
  • Session-memory extraction is optimistically marked complete at spawn time; a failed extraction is retried after the next threshold crossing (idempotent librarian task, no channel back needed for v1).
  • Fork-cache reuse is implemented via byte-stable prefix replay (OpenAI automatic prefix caching), not cache_control blocks — the field was plumbed but never emitted downstream, so the user-visible win is the same either way.
  • The agent.rs split is behaviour-preserving: no public API changes, all existing tests pass unchanged. Field visibility uses pub(super) so sibling files can construct/read without widening visibility beyond the agent::agent:: module.
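The microcompact behaviour described above — clearing older tool-result payloads in place while preserving the AssistantToolCalls ⇔ ToolResults pairing — can be sketched as follows. The message shapes and keep_recent policy here are assumptions, not the actual types from context_pipeline:

```rust
// Minimal sketch of the microcompact stage. Real types live in
// agent/context_pipeline/microcompact.rs and will differ in detail.

#[derive(Debug, Clone)]
enum Msg {
    User(String),
    AssistantToolCalls(Vec<String>), // tool-call ids
    ToolResult { id: String, payload: String },
}

const CLEARED: &str = "[Old tool result content cleared]";

/// Clear payloads of all but the most recent `keep_recent` tool results.
/// ToolResult entries stay in place, so every AssistantToolCalls message
/// still has a matching result (the API invariant noted above).
fn microcompact(history: &mut [Msg], keep_recent: usize) -> usize {
    let result_idxs: Vec<usize> = history
        .iter()
        .enumerate()
        .filter(|(_, m)| matches!(m, Msg::ToolResult { .. }))
        .map(|(i, _)| i)
        .collect();
    let cutoff = result_idxs.len().saturating_sub(keep_recent);
    let mut cleared = 0;
    for &i in &result_idxs[..cutoff] {
        if let Msg::ToolResult { payload, .. } = &mut history[i] {
            if payload.as_str() != CLEARED {
                *payload = CLEARED.to_string();
                cleared += 1;
            }
        }
    }
    cleared
}

fn main() {
    let mut history = vec![
        Msg::User("hi".into()),
        Msg::AssistantToolCalls(vec!["a".into()]),
        Msg::ToolResult { id: "a".into(), payload: "big old output".into() },
        Msg::AssistantToolCalls(vec!["b".into()]),
        Msg::ToolResult { id: "b".into(), payload: "recent output".into() },
    ];
    let cleared = microcompact(&mut history, 1);
    assert_eq!(cleared, 1); // only the older result was cleared
}
```

Because entries are overwritten rather than removed, the history length (and every call-id pairing) is untouched, which is what keeps the provider API happy.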

Summary by CodeRabbit

  • New Features

    • Sub-agent spawning with a global agent-definition registry and controller endpoints to list/get/reload definitions.
    • Layered context pipeline: automatic tool-result compaction, session-memory extraction, and per-tool-result byte budgeting.
  • Improvements

    • More stable system-prompt reuse and KV-cache-friendly behavior across turns.
    • Better fork-mode support and clearer tool categorization (system vs skill).
    • Spawn-subagent tool now fully wired to run typed or forked sub-agents.
  • Configuration

    • New tool-result budget setting; fork-mode enabled by default.
  • Defaults

    • Default model changed to a reasoning-oriented option.

…tions

- Refactored the `Agent` struct to use `Arc` for shared ownership of the provider, tools, and tool specifications, enabling efficient memory management and concurrent access.
- Introduced a `NullMemoryLoader` to optimize memory usage for sub-agents, allowing them to operate without incurring the cost of memory recall.
- Added new methods in the `Agent` implementation to facilitate sharing of the provider, tools, and tool specifications with sub-agents, enhancing their operational efficiency.
- Implemented a new `SystemPromptBuilder` method for constructing prompts specifically for sub-agents, ensuring they receive tailored context while minimizing unnecessary information.
- Established a framework for loading custom agent definitions from TOML files, allowing for dynamic agent configuration and specialization.
- Introduced a `ForkContext` to support efficient sub-agent execution in fork mode, leveraging shared resources for improved performance and reduced token usage.
- Introduced a global `AgentDefinitionRegistry` to manage built-in and custom agent definitions from TOML files, ensuring idempotent initialization.
- Added new RPC handlers for listing, fetching, and reloading agent definitions, improving the flexibility of agent management.
- Refactored the `Agent` struct to streamline sub-agent execution, including enhancements to the task execution flow and context handling.
- Updated the orchestrator configuration to support fork mode for sub-agents, optimizing resource usage and performance.
- Improved error handling and logging for agent definition loading and initialization processes, enhancing system reliability.
- Implemented a new asynchronous test to validate the full path of a parent agent issuing a `spawn_subagent` tool call.
- The test ensures that the sub-agent's output is correctly folded into the parent's response, verifying the interaction between the parent agent and the sub-agent.
- Enhanced the `AgentDefinitionRegistry` to support global initialization of built-in agents, ensuring consistent behavior across tests.
- Updated the handling of tool calls and memory configuration to facilitate the new test scenario, improving overall test coverage for agent interactions.
- Introduced a new `category_filter` in `AgentDefinition` to restrict tool visibility based on their category (System or Skill).
- Updated the `from_archetype` function to apply the category filter for the `SkillsAgent` archetype.
- Modified `SubagentRunOptions` to include a `category_filter_override` for dynamic filtering during sub-agent execution.
- Enhanced the `filter_tool_indices` function to incorporate category filtering logic, ensuring tools are correctly filtered based on their defined categories.
- Updated relevant tests to validate the new category filtering functionality, improving overall test coverage for agent interactions.
- Introduced a new `context_pipeline` module to manage a layered context reduction strategy, enhancing memory efficiency during agent interactions.
- Added stages for tool-result budgeting, history trimming, microcompaction, autocompaction, and session memory extraction, each with specific triggers and cache implications.
- Updated the `Agent` struct to include a `context_pipeline` field, ensuring state persistence across turns.
- Enhanced the `AgentBuilder` to initialize the context pipeline by default.
- Implemented tests to validate the functionality and stability of the context pipeline, ensuring consistent behavior across agent sessions.
- Refactored relevant components to integrate the new context management features, improving overall agent performance and memory handling.
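The per-tool-result byte budget mentioned above (16 KiB cap before results enter history) might look like this sketch. The function shape and trailer text are assumptions; only the cap size comes from the PR description:

```rust
// Hypothetical sketch of the tool-result byte-budget stage.

const BUDGET_BYTES: usize = 16 * 1024; // 16 KiB cap, per the PR description

/// Truncate a tool result to at most `budget` bytes on a UTF-8 boundary,
/// appending a trailer so the model knows content was cut.
fn apply_budget(result: &str, budget: usize) -> (String, bool) {
    if result.len() <= budget {
        return (result.to_string(), false);
    }
    let trailer = "\n[tool result truncated]";
    let mut cut = budget.saturating_sub(trailer.len());
    while cut > 0 && !result.is_char_boundary(cut) {
        cut -= 1; // back off to a valid UTF-8 boundary
    }
    let mut out = String::with_capacity(cut + trailer.len());
    out.push_str(&result[..cut]);
    out.push_str(trailer);
    (out, true)
}

fn main() {
    let (small, truncated) = apply_budget("ok", BUDGET_BYTES);
    assert_eq!((small.as_str(), truncated), ("ok", false));

    let big = "x".repeat(20 * 1024);
    let (capped, truncated) = apply_budget(&big, BUDGET_BYTES);
    assert!(truncated && capped.len() <= BUDGET_BYTES);
}
```

Truncating before the result enters history is what makes this stage cache-safe: earlier messages are never rewritten, so the byte-stable prefix is preserved.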
…mpaction

- Renamed variable for clarity in tool execution result handling.
- Implemented a new stage in the context pipeline to apply a byte budget to tool results, ensuring efficient memory usage.
- Added logging for budget application, including details on original and final byte sizes.
- Integrated microcompaction stages before tool calls to manage history and reduce memory footprint, with appropriate logging for outcomes.
- Updated the agent's session memory management to track turn counts, facilitating better resource handling across iterations.
…l call management

- Simplified method calls in the `Agent` struct for clarity and efficiency.
- Enhanced session memory extraction logic to spawn a background archivist sub-agent when thresholds are met.
- Improved context pipeline handling for tool call recording and usage tracking.
- Updated documentation and comments for better understanding of session memory extraction process.
- Refactored microcompact function for cleaner code structure and readability.
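The session-memory trigger described above — extraction fires when any token/tool-call/turn delta crosses its threshold, and is optimistically marked complete at spawn time — can be sketched like this. Field names and threshold values are illustrative assumptions:

```rust
// Hypothetical sketch of the session-memory extraction policy.

#[derive(Default)]
struct SessionMemoryState {
    tokens_since_extract: u64,
    tool_calls_since_extract: u32,
    turns_since_extract: u32,
}

struct Thresholds {
    tokens: u64,
    tool_calls: u32,
    turns: u32,
}

impl SessionMemoryState {
    /// Any single delta crossing its threshold triggers extraction.
    fn should_extract(&self, t: &Thresholds) -> bool {
        self.tokens_since_extract >= t.tokens
            || self.tool_calls_since_extract >= t.tool_calls
            || self.turns_since_extract >= t.turns
    }

    /// Optimistically reset at spawn time (fire-and-forget): if the background
    /// archivist fails, deltas re-accumulate and extraction retries at the
    /// next threshold crossing.
    fn mark_extraction_complete(&mut self) {
        *self = SessionMemoryState::default();
    }
}

fn main() {
    let t = Thresholds { tokens: 8_000, tool_calls: 20, turns: 10 };
    let mut s = SessionMemoryState { tokens_since_extract: 9_000, ..Default::default() };
    assert!(s.should_extract(&t));
    s.mark_extraction_complete();
    assert!(!s.should_extract(&t));
}
```

Resetting state at spawn time is what removes the need for a completion channel back from the background task, as the "Notable decisions" section explains.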
@coderabbitai
Contributor

coderabbitai Bot commented Apr 10, 2026

Warning

Rate limit exceeded

@senamakel has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 14 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 14 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0ddaa3d6-e0b7-4573-9e51-61a2cfa18d52

📥 Commits

Reviewing files that changed from the base of the PR and between fc0a068 and f760291.

📒 Files selected for processing (1)
  • src/openhuman/skills/qjs_skill_instance/event_loop/rpc_handlers.rs
📝 Walkthrough

Walkthrough

Adds a declarative sub-agent system, global AgentDefinition registry and TOML loader, an async subagent runner (typed + fork modes) with task-local contexts, a layered context-reduction pipeline (microcompact, tool-result budgeting, session-memory), spawn_subagent tool wiring, and a refactor splitting Agent into modular builder/runtime/turn components.

Changes

Cohort / File(s) Summary
Context pipeline
src/openhuman/agent/context_pipeline/mod.rs, src/openhuman/agent/context_pipeline/pipeline.rs, src/openhuman/agent/context_pipeline/microcompact.rs, src/openhuman/agent/context_pipeline/session_memory.rs, src/openhuman/agent/context_pipeline/tool_result_budget.rs
New pipeline orchestrator, microcompact placeholder compaction, session-memory extraction policy/state, and per-tool-result byte-budget truncation with outcomes and tests.
Agent definitions & loader
src/openhuman/agent/harness/definition.rs, src/openhuman/agent/harness/definition_loader.rs, src/openhuman/agent/harness/builtin_definitions.rs
Introduces AgentDefinition types, AgentDefinitionRegistry global (OnceLock), built-in archetype mapping, and workspace + fallback TOML loader (merge/override semantics).
Sub-agent harness & contexts
src/openhuman/agent/harness/subagent_runner.rs, src/openhuman/agent/harness/fork_context.rs, src/openhuman/agent/harness/mod.rs
Adds async run_subagent, typed vs fork execution logic, prompt/tool filtering, and task-local PARENT_CONTEXT/FORK_CONTEXT APIs; exports harness surface.
Spawn subagent tool & tools surface
src/openhuman/tools/spawn_subagent.rs, src/openhuman/tools/ops.rs, src/openhuman/tools/traits.rs, src/openhuman/tools/mod.rs, src/openhuman/tools/skill_bridge.rs
spawn_subagent becomes full tool (agent_id, skill/category filters, mode), emits lifecycle events, calls runner; adds ToolCategory and registers spawn_subagent; skill bridge reports Skill category.
Agent core refactor (new modules)
src/openhuman/agent/agent/mod.rs, src/openhuman/agent/agent/types.rs, src/openhuman/agent/agent/builder.rs, src/openhuman/agent/agent/runtime.rs, src/openhuman/agent/agent/turn.rs, src/openhuman/agent/agent/tests.rs
Splits Agent into builder/runtime/turn/types, adds AgentBuilder/Agent::from_config, full turn orchestration (tool loop, tool execution, fork-parent integration, history trimming, background extraction), and extensive tests.
Harness executor
src/openhuman/agent/harness/executor.rs
Executor now dispatches real sub-agent runs via run_subagent, applies archetype overrides, and maps/handles outcomes and timeouts.
Startup & integration
src/core/jsonrpc.rs, src/openhuman/channels/runtime/startup.rs, src/openhuman/agent/mod.rs
Registry initialization added to startup paths (guarded/log-warn on failure); context_pipeline exported from agent module.
Prompt & memory helpers
src/openhuman/agent/prompt.rs, src/openhuman/agent/memory_loader.rs
Adds SystemPromptBuilder::for_subagent and ArchetypePromptSection; introduces NullMemoryLoader test stub.
Configuration changes
src/openhuman/config/schema/agent.rs, src/openhuman/config/schema/orchestrator.rs, src/openhuman/config/schema/types.rs
Adds tool_result_budget_bytes to AgentConfig (defaulted), adds fork_mode_enabled to OrchestratorConfig (default true), and changes DEFAULT_MODEL to MODEL_REASONING_V1.
Events & RPC schemas
src/openhuman/event_bus/events.rs, src/openhuman/agent/schemas.rs
New domain events SubagentSpawned/Completed/Failed under "agent"; RPC controllers added: agent.list_definitions, agent.get_definition, agent.reload_definitions.
Docs / prompts
src/openhuman/agent/prompts/*
Multiple prompt docs updated to broaden audience and wording (documentation-only edits).

Sequence Diagram(s)

sequenceDiagram
    participant Agent
    participant ContextPipeline
    participant Microcompact
    participant Provider
    participant SessionMemory

    Agent->>ContextPipeline: tick_turn() / record_usage()
    Agent->>ContextPipeline: run_before_call(&mut history)
    alt CompactionNeeded
        ContextPipeline->>Microcompact: microcompact(history, keep_recent)
        Microcompact-->>ContextPipeline: MicrocompactStats
        ContextPipeline-->>Agent: PipelineOutcome::Microcompacted
    else AutocompactionRequested
        ContextPipeline-->>Agent: PipelineOutcome::AutocompactionRequested
    else ContextExhausted
        ContextPipeline-->>Agent: PipelineOutcome::ContextExhausted
    else NoOp
        ContextPipeline-->>Agent: PipelineOutcome::NoOp
    end
    Agent->>Provider: chat(messages)
    Provider-->>Agent: response + usage
    Agent->>ContextPipeline: record_tool_calls(n)
    Agent->>SessionMemory: should_extract_session_memory()?
    alt Extraction needed
        SessionMemory-->>Agent: true
        Agent->>Agent: spawn archivist sub-agent (background)
    end
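The run_before_call branch in the diagram above can be sketched as a match over the pipeline outcome. Variant payloads are simplified from the names in the PR description:

```rust
// Simplified sketch of PipelineOutcome dispatch; the real enum carries
// richer stats (MicrocompactStats, utilisation percentage, etc.).

#[derive(Debug)]
enum PipelineOutcome {
    NoOp,
    Microcompacted { envelopes_cleared: usize },
    AutocompactionRequested { utilisation_pct: u8 },
    ContextExhausted,
}

fn handle(outcome: PipelineOutcome) -> &'static str {
    match outcome {
        PipelineOutcome::NoOp => "proceed to provider call",
        PipelineOutcome::Microcompacted { .. } => "history shrunk in place; proceed",
        // Signalled to the caller, never dispatched by the pipeline itself —
        // this is what keeps unit tests provider-free.
        PipelineOutcome::AutocompactionRequested { .. } => {
            "caller decides whether to run autocompaction"
        }
        PipelineOutcome::ContextExhausted => "abort the turn",
    }
}

fn main() {
    assert_eq!(handle(PipelineOutcome::NoOp), "proceed to provider call");
    assert_eq!(
        handle(PipelineOutcome::AutocompactionRequested { utilisation_pct: 92 }),
        "caller decides whether to run autocompaction"
    );
}
```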
sequenceDiagram
    participant Parent as Parent Agent
    participant SpawnTool as SpawnSubagent Tool
    participant Registry as AgentDefinitionRegistry
    participant Runner as Subagent Runner
    participant Provider as Sub-agent Provider
    participant ForkCtx as Fork Context

    Parent->>SpawnTool: spawn_subagent(agent_id, prompt, opts)
    SpawnTool->>Registry: get(agent_id)
    Registry-->>SpawnTool: AgentDefinition
    alt Fork Mode
        SpawnTool->>ForkCtx: with_fork_context(message_prefix, fork_prompt)
        ForkCtx->>Runner: run_subagent(def, prompt, mode=fork)
        Runner->>Provider: call(parent_prefix + fork_prompt)
        Provider-->>Runner: assistant messages
    else Typed Mode
        SpawnTool->>Runner: run_subagent(def, prompt, mode=typed)
        Runner->>Runner: load archetype prompt, filter tools
        Runner->>Provider: call(filtered_messages)
        Provider-->>Runner: assistant messages
    end
    Runner-->>SpawnTool: SubagentRunOutcome
    SpawnTool-->>Parent: ToolResult(sub-agent output)
    Parent->>Parent: append tool-result to history

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • graycyrus

Poem

🐇
I nibbled prompts and stitched designs,
Poked tunnels where the registry shines,
Cleared old crumbs with microcompact cheer,
Spawned little agents far and near,
The warren hums — new stories here.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the three main changes: sub-agents as a new capability, routing model transition (reasoning→agentic), and a new layered context pipeline architecture.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (10)
src/openhuman/tools/ops.rs (1)

78-84: Add a focused test asserting spawn_subagent is present in all_tools_with_runtime.

Since this is a key behavioral addition, a direct assertion in the test module would guard against accidental registry regressions.

[suggested fix]

Example test addition
+    #[test]
+    fn all_tools_includes_spawn_subagent() {
+        let tmp = TempDir::new().unwrap();
+        let security = Arc::new(SecurityPolicy::default());
+        let mem_cfg = MemoryConfig { backend: "markdown".into(), ..MemoryConfig::default() };
+        let mem: Arc<dyn Memory> =
+            Arc::from(crate::openhuman::memory::create_memory(&mem_cfg, tmp.path(), None).unwrap());
+        let cfg = test_config(&tmp);
+
+        let tools = all_tools(
+            Arc::new(Config::default()),
+            &security,
+            mem,
+            None,
+            None,
+            &BrowserConfig::default(),
+            &crate::openhuman::config::HttpRequestConfig::default(),
+            tmp.path(),
+            &HashMap::new(),
+            None,
+            &cfg,
+        );
+        let names: Vec<&str> = tools.iter().map(|t| t.name()).collect();
+        assert!(names.contains(&"spawn_subagent"));
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/tools/ops.rs` around lines 78 - 84, Add a unit test that
asserts the spawn_subagent tool is registered in the runtime tools list: call
the all_tools_with_runtime() registry function (or the test helper that returns
the Vec/Map of tools) and assert it contains an entry corresponding to
SpawnSubagentTool::new() (check for the tool's public identifier
"spawn_subagent" or the tool's type/name), failing the test if that identifier
is missing; place the test in the existing test module for tools so future
regressions to the registry are caught.
src/openhuman/event_bus/events.rs (1)

230-233: Consider adding test coverage for new sub-agent event variants.

The all_variants_have_correct_domain test doesn't include the new SubagentSpawned, SubagentCompleted, and SubagentFailed variants. While the match arm correctly routes them to "agent", adding explicit test cases would maintain the comprehensive coverage pattern established by existing tests.

💡 Example test cases to add
// Add to the `cases` vec in all_variants_have_correct_domain test:
(
    DomainEvent::SubagentSpawned {
        parent_session: "p".into(),
        agent_id: "a".into(),
        mode: "typed".into(),
        task_id: "t".into(),
        prompt_chars: 100,
    },
    "agent",
),
(
    DomainEvent::SubagentCompleted {
        parent_session: "p".into(),
        task_id: "t".into(),
        agent_id: "a".into(),
        elapsed_ms: 0,
        output_chars: 0,
        iterations: 0,
    },
    "agent",
),
(
    DomainEvent::SubagentFailed {
        parent_session: "p".into(),
        task_id: "t".into(),
        agent_id: "a".into(),
        error: "e".into(),
    },
    "agent",
),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/event_bus/events.rs` around lines 230 - 233, The test
all_variants_have_correct_domain is missing explicit cases for the new
DomainEvent variants SubagentSpawned, SubagentCompleted, and SubagentFailed; add
entries to the cases Vec in that test with representative values for each of
those variants and expect the domain "agent" (e.g., create
DomainEvent::SubagentSpawned { ... }, DomainEvent::SubagentCompleted { ... },
DomainEvent::SubagentFailed { ... } paired with "agent") so the test covers the
new match arm that returns "agent".
src/openhuman/agent/schemas.rs (1)

250-258: Redundant serde_json::to_value wrapping serde_json::json!.

serde_json::json! already returns a serde_json::Value, so wrapping it in to_value() is unnecessary and adds a fallible conversion that can't fail in practice.

♻️ Simplified serialization
 fn handle_list_definitions(_params: Map<String, Value>) -> ControllerFuture {
     Box::pin(async {
         let registry = crate::openhuman::agent::harness::AgentDefinitionRegistry::global()
             .ok_or_else(|| "AgentDefinitionRegistry not initialised".to_string())?;
         let defs: Vec<&crate::openhuman::agent::harness::AgentDefinition> = registry.list();
-        serde_json::to_value(serde_json::json!({ "definitions": defs }))
-            .map_err(|e| format!("serialise: {e}"))
+        Ok(serde_json::json!({ "definitions": defs }))
     })
 }

Same applies to handle_get_definition at lines 271-272.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/schemas.rs` around lines 250 - 258, Both
handle_list_definitions and handle_get_definition are calling
serde_json::to_value on a value already produced by serde_json::json!, which is
redundant and introduces an unnecessary fallible conversion; update
handle_list_definitions (and handle_get_definition) to directly return
Ok(serde_json::json!({ "definitions": defs })) (or Ok(serde_json::json!({
"definition": def })) respectively) instead of wrapping json! in
serde_json::to_value(...). Keep the existing error propagation from
AgentDefinitionRegistry::global() and registry.list() unchanged so the function
still returns the same Result/ControllerFuture type.
src/openhuman/agent/context_pipeline/tool_result_budget.rs (1)

104-111: Unnecessary out.clone() before return.

The out String is moved into the tuple on return. The .clone() creates an extra allocation.

Remove clone
     (
-        out.clone(),
+        out,
         BudgetOutcome {
             original_bytes,
-            final_bytes: out.len(),
+            final_bytes: cut + TRAILER_RESERVED, // or compute before move
             truncated: true,
         },
     )

Since out.len() is needed for final_bytes, you can capture it before moving:

+    let final_bytes = out.len();
     (
-        out.clone(),
+        out,
         BudgetOutcome {
             original_bytes,
-            final_bytes: out.len(),
+            final_bytes,
             truncated: true,
         },
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/context_pipeline/tool_result_budget.rs` around lines 104
- 111, The return is cloning `out` unnecessarily; compute final_bytes before
moving `out` and return the owned `out` into the tuple instead of `out.clone()`.
Specifically, capture `let final_bytes = out.len();` (or similar) and then
return `(out, BudgetOutcome { original_bytes, final_bytes, truncated: true })`,
removing the `out.clone()` call.
src/openhuman/agent/harness/executor.rs (1)

405-411: Consider logging unrecognized sandbox values.

Unrecognized strings silently fall back to SandboxMode::None. A tracing::warn! when the value is neither "sandboxed" nor "read_only" nor empty would surface config typos during development.

Example
         if let Some(sb) = over.sandbox.as_deref() {
             definition.sandbox_mode = match sb {
                 "sandboxed" => super::definition::SandboxMode::Sandboxed,
                 "read_only" => super::definition::SandboxMode::ReadOnly,
-                _ => super::definition::SandboxMode::None,
+                other => {
+                    if !other.is_empty() {
+                        tracing::warn!(
+                            archetype = %archetype,
+                            value = other,
+                            "[orchestrator] unrecognized sandbox override, defaulting to None"
+                        );
+                    }
+                    super::definition::SandboxMode::None
+                }
             };
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/executor.rs` around lines 405 - 411, The current
match in executor.rs that sets definition.sandbox_mode from over.sandbox (the
block using over.sandbox.as_deref()) silently maps any unexpected string to
SandboxMode::None; update this to emit a tracing::warn! (including the
unrecognized value and context such as the agent or definition id if available)
in the default arm when sb is not "sandboxed" or "read_only" and not empty, so
typos are surfaced during development while preserving the fallback to
super::definition::SandboxMode::None.
src/openhuman/tools/spawn_subagent.rs (1)

288-301: Minor: avoid cloning before sort+dedup.

known is consumed by this function, so you can sort and dedup it in place rather than cloning into unique.

Simplified dedup
-    let mut unique = known.clone();
-    unique.sort();
-    unique.dedup();
-    if unique.iter().any(|s| s == skill_id) {
+    known.sort();
+    known.dedup();
+    if known.iter().any(|s| s == skill_id) {
         Ok(())
     } else {
         Err(format!(
             "skill_filter '{skill_id}' does not match any installed skill. Available: {}",
-            unique.join(", ")
+            known.join(", ")
         ))
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/tools/spawn_subagent.rs` around lines 288 - 301, The code
clones `known` into `unique` before sorting/deduping which is unnecessary
because `known` is not used elsewhere; remove the `unique` clone and perform
sort+dedup in place on `known` (call known.sort(); known.dedup();) and then use
known.iter().any(|s| s == skill_id) for the membership check; the transformation
should be applied after you build `known` from engine.all_tools() by splitting
def.name on "__".
src/openhuman/agent/harness/definition_loader.rs (1)

193-206: Test assumes fixed builtin count.

Lines 201-202 assert reg.len() == 10 (9 builtins + 1 custom). If builtins are added/removed, this test will fail. Consider asserting reg.len() > 1 and that specific IDs exist rather than exact counts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/definition_loader.rs` around lines 193 - 206, The
test registry_load_merges_builtins_and_custom assumes an exact builtin count by
asserting reg.len() == 10; change this to a non-brittle assertion (e.g.,
assert!(reg.len() > 1)) and keep the targeted checks for specific IDs using
reg.get("notion_specialist"), reg.get("code_executor"), and reg.get("fork") to
ensure the builtins and the custom definition loaded by
AgentDefinitionRegistry::load are present without relying on a fixed total
count.
src/openhuman/agent/agent.rs (1)

1388-1398: Optimistic extraction marking is correct for fire-and-forget.

The comment (lines 1388-1392) clearly explains the trade-off: marking complete immediately avoids needing a channel back from the background task. Failed extractions retry after the next threshold crossing, which is acceptable for an idempotent archivist.

Consider whether mark_extraction_started() adds value if it's immediately followed by mark_extraction_complete() — the started state is never observable externally. If there's no telemetry or conditional logic relying on the "in-flight" state, you could simplify to just mark_extraction_complete().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent.rs` around lines 1388 - 1398, The call sequence
marks extraction started then immediately complete, making
mark_extraction_started() effectively no-op; update the code in agent.rs by
removing the call to
self.context_pipeline.session_memory.mark_extraction_started() and keep only
self.context_pipeline.session_memory.mark_extraction_complete(), unless there is
telemetry or conditional logic that explicitly depends on the "in-flight"
state—if such dependencies exist, retain started and add the necessary
observable usage instead.
src/openhuman/agent/context_pipeline/pipeline.rs (1)

151-163: Consider calling guard.record_compaction_success() after successful microcompact.

When microcompact clears envelopes (line 155), this is a successful compaction that should reset the guard's circuit breaker. Currently, only the caller would need to call this, but the pipeline could do it internally for consistency.

♻️ Reset circuit breaker after successful microcompact

Based on the ContextGuard snippet showing record_compaction_success() resets the failure counter, consider calling it here:

                 if stats.envelopes_cleared > 0 {
+                    self.guard.record_compaction_success();
                     tracing::info!(
                         envelopes_cleared = stats.envelopes_cleared,
                         entries_cleared = stats.entries_cleared,
                         bytes_freed = stats.bytes_freed,
                         "[context_pipeline] microcompact fired"
                     );
                     return PipelineOutcome::Microcompacted(stats);
                 }

This ensures the circuit breaker is reset when microcompact succeeds, preventing unnecessary escalation to autocompaction on subsequent turns.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/context_pipeline/pipeline.rs` around lines 151 - 163,
When microcompaction succeeds (ContextCheckResult::CompactionNeeded branch) and
stats.envelopes_cleared > 0, call the ContextGuard method
record_compaction_success() to reset the compaction circuit breaker before
returning PipelineOutcome::Microcompacted(stats); locate the
microcompact(history, self.config.microcompact_keep_recent) call and insert a
guard.record_compaction_success() (using the same guard instance used for the
pipeline) immediately after the success check and before the
tracing::info/return path so successful microcompaction clears the failure
counter.
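
The guard behavior this comment relies on can be illustrated with a minimal, self-contained sketch. The struct internals, field names, and threshold below are assumptions; only the reset semantics of `record_compaction_success()` come from the review comment:

```rust
// Hypothetical sketch of the compaction circuit breaker described above.
// Field names and the trip threshold are assumptions, not the real crate API.
struct ContextGuard {
    consecutive_failures: u32,
    max_failures: u32,
}

impl ContextGuard {
    fn new(max_failures: u32) -> Self {
        Self { consecutive_failures: 0, max_failures }
    }

    /// Called when a compaction pass frees space: clears the failure streak.
    fn record_compaction_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Called when a compaction pass frees nothing.
    fn record_compaction_failure(&mut self) {
        self.consecutive_failures += 1;
    }

    /// The breaker trips (escalate to autocompaction) after repeated failures.
    fn tripped(&self) -> bool {
        self.consecutive_failures >= self.max_failures
    }
}

fn main() {
    let mut guard = ContextGuard::new(3);
    guard.record_compaction_failure();
    guard.record_compaction_failure();
    // A successful microcompact should clear the streak, so the breaker
    // does not trip on the next turn.
    guard.record_compaction_success();
    assert!(!guard.tripped());
}
```

Under this model, skipping the reset after a successful microcompact would let a prior failure streak trip the breaker spuriously, which is exactly the escalation the comment warns about.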
src/openhuman/agent/harness/fork_context.rs (1)

172-206: Tests verify scope boundaries correctly.

The tests confirm that both task-locals return None outside their scopes and that ForkContext is visible inside with_fork_context. Consider adding a similar visibility test for ParentExecutionContext for completeness.

💡 Optional: Add a visibility test for ParentExecutionContext
#[tokio::test]
async fn parent_context_visible_inside_scope() {
    use crate::openhuman::config::{AgentConfig, IdentityConfig};
    // Construct a minimal ParentExecutionContext and verify it's accessible inside scope
    // (would require test helpers for Provider/Memory mocks)
}

This would mirror the fork_context_visible_inside_scope test and ensure symmetric coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/fork_context.rs` around lines 172 - 206, Add a
test mirroring fork_context_visible_inside_scope that ensures
ParentExecutionContext is visible inside its task-local scope and None outside:
create a minimal ParentExecutionContext instance, call with_parent_context(<that
instance>, async { let p = current_parent().expect("parent context should be
visible"); assert on a couple of fields }); .await, and then assert
current_parent().is_none(); reference the ParentExecutionContext type and the
with_parent_context and current_parent helpers to locate where to add the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/agent/harness/definition.rs`:
- Around line 7-9: The docs are inconsistent about config format: the module
docstring and the load method mention YAML (`*.yaml`) while
DefinitionSource::File's docstring says "TOML file"; pick the actual supported
format (or state both) and make all docstrings consistent. Update the
module-level comment at the top of definition.rs, the DefinitionSource::File
docstring (the enum variant), and the load method docstring to uniformly state
the correct format (or explicitly list supported formats and precedence),
ensuring the same extension and parser name appear in each place so users are
not confused.
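
For illustration, a custom definition under `<workspace>/agents/` might look like the following. Every key name here is a guess at the schema for the sake of the example, not the crate's actual format; only the TOML file location and the tool-filtering concepts (category, allowlist) come from the PR description:

```toml
# <workspace>/agents/notion_specialist.toml — illustrative only;
# key names below are assumptions, not the real schema.
id = "notion_specialist"
description = "Handles Notion page and database operations"
model_hint = "agentic"

[tools]
category = "skill"
allow = ["notion__search", "notion__update_page"]

[prompt]
source = "file"
path = "prompts/notion_specialist.md"
```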

In `@src/openhuman/agent/harness/subagent_runner.rs`:
- Around line 646-658: render_subagent_system_prompt currently only appends
definition.system_prompt when it's PromptSource::Inline, so PromptSource::File
content that was preloaded into archetype_prompt_body is never written into out;
update render_subagent_system_prompt to accept an additional archetype_body:
&str (or Option<&str>) parameter and, when definition.system_prompt is
PromptSource::File, append archetype_body.trim_end() + "\n\n" into out (fall
back to existing Inline behavior otherwise). Also update the callsite that
constructs SystemPromptBuilder::for_subagent to pass &archetype_prompt_body (or
Some(&archetype_prompt_body)) so the loaded archetype prompt is actually
rendered into the system prompt.

In `@src/openhuman/agent/prompt.rs`:
- Around line 59-66: The sub-agent prompt assembly currently injects a dynamic
DateTimeSection using Local::now(), which makes the sub-agent system prompt
change per call and breaks KV-prefix cache stability; remove the DateTimeSection
(and any use of Local::now()) from the narrow prompt builder so the sub-agent
prefix is deterministic, keeping the other sections (workspace, archetype
section, and respect for omit_identity/omit_safety_preamble flags in the
AgentDefinition) intact; if a timestamp is required for display only, replace
DateTimeSection with a deterministic placeholder or configuration-time value
instead of Local::now().

---

Nitpick comments:
In `@src/openhuman/agent/agent.rs`:
- Around line 1388-1398: The call sequence marks extraction started then
immediately complete, making mark_extraction_started() effectively no-op; update
the code in agent.rs by removing the call to
self.context_pipeline.session_memory.mark_extraction_started() and keep only
self.context_pipeline.session_memory.mark_extraction_complete(), unless there is
telemetry or conditional logic that explicitly depends on the "in-flight"
state—if such dependencies exist, retain started and add the necessary
observable usage instead.

In `@src/openhuman/agent/context_pipeline/pipeline.rs`:
- Around line 151-163: When microcompaction succeeds
(ContextCheckResult::CompactionNeeded branch) and stats.envelopes_cleared > 0,
call the ContextGuard method record_compaction_success() to reset the compaction
circuit breaker before returning PipelineOutcome::Microcompacted(stats); locate
the microcompact(history, self.config.microcompact_keep_recent) call and insert
a guard.record_compaction_success() (using the same guard instance used for the
pipeline) immediately after the success check and before the
tracing::info/return path so successful microcompaction clears the failure
counter.

In `@src/openhuman/agent/context_pipeline/tool_result_budget.rs`:
- Around line 104-111: The return is cloning `out` unnecessarily; compute
final_bytes before moving `out` and return the owned `out` into the tuple
instead of `out.clone()`. Specifically, capture `let final_bytes = out.len();`
(or similar) and then return `(out, BudgetOutcome { original_bytes, final_bytes,
truncated: true })`, removing the `out.clone()` call.
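
The capture-before-move pattern the comment describes can be shown in isolation. `BudgetOutcome` and `apply_budget` below are simplified stand-ins for the real types (and truncation is done by chars rather than bytes for brevity):

```rust
// Simplified stand-in for the real tool_result_budget types.
struct BudgetOutcome {
    original_bytes: usize,
    final_bytes: usize,
    truncated: bool,
}

fn apply_budget(input: String, max_chars: usize) -> (String, BudgetOutcome) {
    let original_bytes = input.len();
    if input.chars().count() <= max_chars {
        let final_bytes = original_bytes;
        return (input, BudgetOutcome { original_bytes, final_bytes, truncated: false });
    }
    let mut out: String = input.chars().take(max_chars).collect();
    out.push_str("\n[truncated]");
    let final_bytes = out.len(); // capture the length BEFORE moving `out`
    (out, BudgetOutcome { original_bytes, final_bytes, truncated: true }) // move, no clone
}

fn main() {
    let (out, stats) = apply_budget("a".repeat(20), 8);
    assert!(stats.truncated);
    assert_eq!(stats.final_bytes, out.len());
}
```

Because `final_bytes` is read before the tuple construction, `out` can be moved into the return value and the `out.clone()` disappears.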

In `@src/openhuman/agent/harness/definition_loader.rs`:
- Around line 193-206: The test registry_load_merges_builtins_and_custom assumes
an exact builtin count by asserting reg.len() == 10; change this to a
non-brittle assertion (e.g., assert!(reg.len() > 1)) and keep the targeted
checks for specific IDs using reg.get("notion_specialist"),
reg.get("code_executor"), and reg.get("fork") to ensure the builtins and the
custom definition loaded by AgentDefinitionRegistry::load are present without
relying on a fixed total count.

In `@src/openhuman/agent/harness/executor.rs`:
- Around line 405-411: The current match in executor.rs that sets
definition.sandbox_mode from over.sandbox (the block using
over.sandbox.as_deref()) silently maps any unexpected string to
SandboxMode::None; update this to emit a tracing::warn! (including the
unrecognized value and context such as the agent or definition id if available)
in the default arm when sb is not "sandboxed" or "read_only" and not empty, so
typos are surfaced during development while preserving the fallback to
super::definition::SandboxMode::None.
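
A standalone sketch of the suggested warning path, with `eprintln!` standing in for `tracing::warn!` so the snippet compiles without external crates; the accepted string values mirror the comment, the function name is invented:

```rust
#[derive(Debug, PartialEq)]
enum SandboxMode { Sandboxed, ReadOnly, None }

// Unknown override strings still fall back to None, but are surfaced
// so typos show up during development. The real code would emit
// tracing::warn! here instead of eprintln!.
fn parse_sandbox_override(sb: &str, agent_id: &str) -> SandboxMode {
    match sb {
        "sandboxed" => SandboxMode::Sandboxed,
        "read_only" => SandboxMode::ReadOnly,
        "" | "none" => SandboxMode::None,
        other => {
            eprintln!(
                "unrecognized sandbox override {other:?} for agent {agent_id}; defaulting to none"
            );
            SandboxMode::None
        }
    }
}

fn main() {
    assert_eq!(parse_sandbox_override("sandboxed", "fork"), SandboxMode::Sandboxed);
    // Typo: falls back to None, but the warning makes it visible.
    assert_eq!(parse_sandbox_override("sandobxed", "fork"), SandboxMode::None);
}
```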

In `@src/openhuman/agent/harness/fork_context.rs`:
- Around line 172-206: Add a test mirroring fork_context_visible_inside_scope
that ensures ParentExecutionContext is visible inside its task-local scope and
None outside: create a minimal ParentExecutionContext instance, call
with_parent_context(<that instance>, async { let p =
current_parent().expect("parent context should be visible"); assert on a couple
of fields }); .await, and then assert current_parent().is_none(); reference the
ParentExecutionContext type and the with_parent_context and current_parent
helpers to locate where to add the test.

In `@src/openhuman/agent/schemas.rs`:
- Around line 250-258: Both handle_list_definitions and handle_get_definition
are calling serde_json::to_value on a value already produced by
serde_json::json!, which is redundant and introduces an unnecessary fallible
conversion; update handle_list_definitions (and handle_get_definition) to
directly return Ok(serde_json::json!({ "definitions": defs })) (or
Ok(serde_json::json!({ "definition": def })) respectively) instead of wrapping
json! in serde_json::to_value(...). Keep the existing error propagation from
AgentDefinitionRegistry::global() and registry.list() unchanged so the function
still returns the same Result/ControllerFuture type.

In `@src/openhuman/event_bus/events.rs`:
- Around line 230-233: The test all_variants_have_correct_domain is missing
explicit cases for the new DomainEvent variants SubagentSpawned,
SubagentCompleted, and SubagentFailed; add entries to the cases Vec in that test
with representative values for each of those variants and expect the domain
"agent" (e.g., create DomainEvent::SubagentSpawned { ... },
DomainEvent::SubagentCompleted { ... }, DomainEvent::SubagentFailed { ... }
paired with "agent") so the test covers the new match arm that returns "agent".

In `@src/openhuman/tools/ops.rs`:
- Around line 78-84: Add a unit test that asserts the spawn_subagent tool is
registered in the runtime tools list: call the all_tools_with_runtime() registry
function (or the test helper that returns the Vec/Map of tools) and assert it
contains an entry corresponding to SpawnSubagentTool::new() (check for the
tool's public identifier "spawn_subagent" or the tool's type/name), failing the
test if that identifier is missing; place the test in the existing test module
for tools so future regressions to the registry are caught.

In `@src/openhuman/tools/spawn_subagent.rs`:
- Around line 288-301: The code clones `known` into `unique` before
sorting/deduping which is unnecessary because `known` is not used elsewhere;
remove the `unique` clone and perform sort+dedup in place on `known` (call
known.sort(); known.dedup();) and then use known.iter().any(|s| s == skill_id)
for the membership check; the transformation should be applied after you build
`known` from engine.all_tools() by splitting def.name on "__".
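
The in-place sort+dedup the comment asks for, reduced to a self-contained function; the `"__"` splitting mirrors the comment, while the helper name and signature are invented for the example:

```rust
// Build the known skill prefixes from tool names like "gmail__send_email",
// normalize in place (no intermediate `unique` clone), then check membership.
fn skill_is_known(tool_names: &[&str], skill_id: &str) -> bool {
    let mut known: Vec<&str> = tool_names
        .iter()
        .filter_map(|name| name.split("__").next())
        .collect();
    known.sort();   // in place — sorting is required for dedup to
    known.dedup();  // collapse all duplicates, not just adjacent ones
    known.iter().any(|s| *s == skill_id)
}

fn main() {
    let tools = ["gmail__send_email", "gmail__list_threads", "read_file"];
    assert!(skill_is_known(&tools, "gmail"));
    assert!(!skill_is_known(&tools, "notion"));
}
```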
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 84d8dc14-b0b7-4db0-a728-c4d6fedac753

📥 Commits

Reviewing files that changed from the base of the PR and between 38eb934 and 430b2b8.

📒 Files selected for processing (28)
  • src/core/jsonrpc.rs
  • src/openhuman/agent/agent.rs
  • src/openhuman/agent/context_pipeline/microcompact.rs
  • src/openhuman/agent/context_pipeline/mod.rs
  • src/openhuman/agent/context_pipeline/pipeline.rs
  • src/openhuman/agent/context_pipeline/session_memory.rs
  • src/openhuman/agent/context_pipeline/tool_result_budget.rs
  • src/openhuman/agent/harness/builtin_definitions.rs
  • src/openhuman/agent/harness/definition.rs
  • src/openhuman/agent/harness/definition_loader.rs
  • src/openhuman/agent/harness/executor.rs
  • src/openhuman/agent/harness/fork_context.rs
  • src/openhuman/agent/harness/mod.rs
  • src/openhuman/agent/harness/subagent_runner.rs
  • src/openhuman/agent/memory_loader.rs
  • src/openhuman/agent/mod.rs
  • src/openhuman/agent/prompt.rs
  • src/openhuman/agent/schemas.rs
  • src/openhuman/channels/runtime/startup.rs
  • src/openhuman/config/schema/agent.rs
  • src/openhuman/config/schema/orchestrator.rs
  • src/openhuman/config/schema/types.rs
  • src/openhuman/event_bus/events.rs
  • src/openhuman/tools/mod.rs
  • src/openhuman/tools/ops.rs
  • src/openhuman/tools/skill_bridge.rs
  • src/openhuman/tools/spawn_subagent.rs
  • src/openhuman/tools/traits.rs

Comment thread src/openhuman/agent/harness/definition.rs Outdated
Comment thread src/openhuman/agent/harness/subagent_runner.rs Outdated
Comment thread src/openhuman/agent/prompt.rs Outdated
- Convert agent.rs (1988 lines) into agent/ folder with six files:
  * types.rs — Agent + AgentBuilder struct defs
  * builder.rs — AgentBuilder fluent API + Agent::from_config factory
  * turn.rs — turn lifecycle, tool dispatch, context pipeline wiring
  * runtime.rs — public accessors, run_single/run_interactive, helpers
  * tests.rs — integration tests with shared fakes
  * mod.rs — glue + top-level `run` convenience function

- Drop misc external inspiration references from doc comments in the
  context_pipeline module and fork_context — the files now stand on
  their own design language.
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (4)
src/openhuman/agent/agent/tests.rs (1)

277-379: Add a fork-mode regression test.

This end-to-end case covers sub-agent dispatch well, but it does not explicitly exercise the `mode: "fork"` path or assert the prefix-replay behavior introduced in this PR. A dedicated fork-mode test would keep the KV-cache reuse path from regressing silently.

Based on learnings: Keep functionality validation in Rust tests for rust-core.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/tests.rs` around lines 277 - 379, Add a new tokio
test (e.g. turn_dispatches_spawn_subagent_in_fork_mode) that mirrors
turn_dispatches_spawn_subagent_through_full_path but sets the SpawnSubagentTool
call arguments to include "mode": "fork"; initialize builtins via
AgentDefinitionRegistry::init_global_builtins(), build the Agent with the same
MockProvider and tools (SpawnSubagentTool::new()), call Agent::turn and assert
the final response equals the expected folded text, then assert the parent's
history contains the spawn_subagent assistant tool call and a tool-result/chat
tool entry containing the sub-agent output (same checks as existing test);
additionally assert the provider response consumption order demonstrates
prefix-replay (i.e. the mocked responses vector is consumed in the expected
sequence: parent tool_call, sub-agent reply, parent folded reply) to cover the
fork-mode KV-cache reuse path.
src/openhuman/agent/agent/builder.rs (3)

341-341: Minor: full_config Arc allocated even when unused.

The full_config Arc is created unconditionally when learning.enabled is true, but it's only used inside the reflection_enabled block (line 362). If reflection is disabled, the Arc allocation and clone are wasted.

♻️ Suggested refactor
         if config.learning.reflection_enabled {
+            let full_config = Arc::new(config.clone());
             // For cloud reflection, wrap the provider in an Arc.
             // For local, no provider needed.
             let reflection_provider: Option<Arc<dyn crate::openhuman::providers::Provider>> =

And remove the let full_config = Arc::new(config.clone()); from line 341.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/builder.rs` at line 341, The code unconditionally
creates full_config = Arc::new(config.clone()) when learning.enabled is true
even though full_config is only used inside the reflection_enabled branch; to
avoid the wasted allocation, move the Arc::new(config.clone()) expression into
the block where reflection_enabled is true (or lazily create it only if
reflection_enabled) and remove the original unconditional let full_config
declaration; update any references to full_config inside the reflection handling
so they use the newly created Arc from that branch and keep the existing checks
for learning.enabled and reflection_enabled intact.

173-181: Add rustdoc for event_context method.

All other builder methods have documentation comments explaining their purpose, but event_context is missing one. For consistency and discoverability, add a brief doc comment explaining what session_id and channel parameters represent.

📝 Suggested documentation
+    /// Sets the event context for the agent (session identifier and channel).
+    ///
+    /// Used for correlating agent activity with external event streams.
     pub fn event_context(
         mut self,
         session_id: impl Into<String>,
         channel: impl Into<String>,
     ) -> Self {

Based on learnings: "New or changed behavior must ship with matching documentation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/builder.rs` around lines 173 - 181, Add a rustdoc
comment to the builder method event_context explaining its purpose and the
meaning of its parameters: state that event_context sets the event_session_id
(session_id) used to group events for a user/session and event_channel (channel)
indicating the source or stream for those events; keep it brief, use /// style
above the pub fn event_context(...) signature and mention that both parameters
are converted into Strings and stored in event_session_id and event_channel
respectively.

228-228: Consider adding a setter for context_pipeline.

The builder uses ContextPipeline::default() unconditionally, but there's no corresponding setter method. If the pipeline configuration should always be default, this is fine. However, if future use cases need custom pipeline configurations (e.g., different budget thresholds, disabled stages), a setter would be needed.

If intentional, a brief comment explaining why the pipeline isn't configurable would help future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/agent/builder.rs` at line 228, The builder currently
hardcodes context_pipeline to ContextPipeline::default() (context_pipeline:
ContextPipeline::default()) with no setter; add a fluent setter method (e.g.,
with_context_pipeline or set_context_pipeline) on the builder type that accepts
a ContextPipeline and assigns it to self.context_pipeline (returning Self or
&mut Self consistent with other builder methods) so callers can supply custom
pipeline configurations; alternatively, if the default is intentional, add a
short doc comment above the context_pipeline field or its initialization
explaining why custom pipelines are not supported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/agent/agent/runtime.rs`:
- Around line 287-299: The CLI path in run_interactive currently calls
self.turn(...) directly which bypasses lifecycle events (AgentTurnStarted,
AgentTurnCompleted, AgentError) and custom error/telemetry handling; change
run_interactive to reuse run_single (or the same call path run_single uses) for
each incoming message from rx.recv() so each turn goes through the unified
lifecycle and error handling instead of calling self.turn directly; ensure
errors returned from run_single are logged/handled the same way the non-CLI path
does and keep the existing listen_handle.abort() behavior after the loop.

In `@src/openhuman/agent/agent/turn.rs`:
- Around line 56-60: The current log::info call in turn.rs prints the full
system_prompt (variable system_prompt) which can contain sensitive
user/workspace/memory data; change the logging in the agent_loop so it does NOT
emit the full prompt body at info level — instead log only non-sensitive
metadata such as system_prompt.chars().count(), a truncated/redacted preview
(e.g., first N chars) or a stable hash/fingerprint, and/or a "redacted"
placeholder; update the log::info invocation that references system_prompt to
use one of these safe alternatives and ensure any detailed prompt content is
removed or only logged at a secure debug/audit level.
- Around line 294-306: Agent::turn() currently writes pre-tool assistant text
directly to stdout (the print!("{text}") and stdout flush) which couples core
logic to a terminal; remove those stdout writes from the block that also pushes
the assistant message into history (the call to
self.history.push(ConversationMessage::Chat(ChatMessage::assistant(text.clone()))))
and only keep the history update and logging, and instead surface rendering to
callers (e.g., have the CLI wrapper or caller render the latest history entry or
provide an optional callback/renderer parameter to Agent::turn that can be used
by CLI code to print). Ensure no direct print!/stdout flush remains in turn.rs
so sub-agents and library consumers don’t get terminal output.
- Around line 51-75: The code currently calls fetch_learned_context()
unconditionally at the top of turn handling, but then discards its result on
subsequent turns; move the call so learned context is only fetched when needed
(i.e., inside the if self.history.is_empty() branch before build_system_prompt)
to avoid extra storage reads on later turns. Update references to learned so
build_system_prompt(learned) still receives the fetched value in the first-turn
path and remove the unused prefetch (the binding `let learned =
self.fetch_learned_context().await;` and the throwaway `let _ = learned;`) to
prevent needless memory lookups in agent_loop/turn handling.

---

Nitpick comments:
In `@src/openhuman/agent/agent/builder.rs`:
- Line 341: The code unconditionally creates full_config =
Arc::new(config.clone()) when learning.enabled is true even though full_config
is only used inside the reflection_enabled branch; to avoid the wasted
allocation, move the Arc::new(config.clone()) expression into the block where
reflection_enabled is true (or lazily create it only if reflection_enabled) and
remove the original unconditional let full_config declaration; update any
references to full_config inside the reflection handling so they use the newly
created Arc from that branch and keep the existing checks for learning.enabled
and reflection_enabled intact.
- Around line 173-181: Add a rustdoc comment to the builder method event_context
explaining its purpose and the meaning of its parameters: state that
event_context sets the event_session_id (session_id) used to group events for a
user/session and event_channel (channel) indicating the source or stream for
those events; keep it brief, use /// style above the pub fn event_context(...)
signature and mention that both parameters are converted into Strings and stored
in event_session_id and event_channel respectively.
- Line 228: The builder currently hardcodes context_pipeline to
ContextPipeline::default() (context_pipeline: ContextPipeline::default()) with
no setter; add a fluent setter method (e.g., with_context_pipeline or
set_context_pipeline) on the builder type that accepts a ContextPipeline and
assigns it to self.context_pipeline (returning Self or &mut Self consistent with
other builder methods) so callers can supply custom pipeline configurations;
alternatively, if the default is intentional, add a short doc comment above the
context_pipeline field or its initialization explaining why custom pipelines are
not supported.

In `@src/openhuman/agent/agent/tests.rs`:
- Around line 277-379: Add a new tokio test (e.g.
turn_dispatches_spawn_subagent_in_fork_mode) that mirrors
turn_dispatches_spawn_subagent_through_full_path but sets the SpawnSubagentTool
call arguments to include "mode": "fork"; initialize builtins via
AgentDefinitionRegistry::init_global_builtins(), build the Agent with the same
MockProvider and tools (SpawnSubagentTool::new()), call Agent::turn and assert
the final response equals the expected folded text, then assert the parent's
history contains the spawn_subagent assistant tool call and a tool-result/chat
tool entry containing the sub-agent output (same checks as existing test);
additionally assert the provider response consumption order demonstrates
prefix-replay (i.e. the mocked responses vector is consumed in the expected
sequence: parent tool_call, sub-agent reply, parent folded reply) to cover the
fork-mode KV-cache reuse path.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6b85e999-f0b4-4212-ae58-6a113e12c0c7

📥 Commits

Reviewing files that changed from the base of the PR and between 430b2b8 and 176688c.

📒 Files selected for processing (13)
  • src/openhuman/agent/agent.rs
  • src/openhuman/agent/agent/builder.rs
  • src/openhuman/agent/agent/mod.rs
  • src/openhuman/agent/agent/runtime.rs
  • src/openhuman/agent/agent/tests.rs
  • src/openhuman/agent/agent/turn.rs
  • src/openhuman/agent/agent/types.rs
  • src/openhuman/agent/context_pipeline/microcompact.rs
  • src/openhuman/agent/context_pipeline/mod.rs
  • src/openhuman/agent/context_pipeline/pipeline.rs
  • src/openhuman/agent/context_pipeline/session_memory.rs
  • src/openhuman/agent/context_pipeline/tool_result_budget.rs
  • src/openhuman/agent/harness/fork_context.rs
💤 Files with no reviewable changes (1)
  • src/openhuman/agent/agent.rs
✅ Files skipped from review due to trivial changes (2)
  • src/openhuman/agent/context_pipeline/mod.rs
  • src/openhuman/agent/context_pipeline/tool_result_budget.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/agent/harness/fork_context.rs

Comment thread src/openhuman/agent/agent/runtime.rs
Comment thread src/openhuman/agent/agent/turn.rs Outdated
Comment thread src/openhuman/agent/agent/turn.rs
Comment thread src/openhuman/agent/agent/turn.rs
Comment thread src/openhuman/agent/agent/turn.rs
Inline comment fixes:
- definition.rs: YAML/TOML inconsistency — module doc, PromptSource, source
  bookkeeping, and load() all now uniformly document the TOML format.
- subagent_runner.rs: render_subagent_system_prompt previously only appended
  PromptSource::Inline bodies, silently dropping PromptSource::File content.
  Thread the preloaded archetype_body through as an explicit &str parameter
  so both source variants render. Drops the unused SystemPromptBuilder +
  tools_for_prompt wiring while we're here.
- prompt.rs: remove DateTimeSection from SystemPromptBuilder::for_subagent.
  Local::now() would make the sub-agent system prompt change per call and
  break KV-prefix cache stability. Document the invariant.

Nitpick fixes:
- agent/turn.rs: drop the no-op mark_extraction_started() call — the
  immediately-following mark_extraction_complete() clears the in-flight
  flag anyway.
- context_pipeline/pipeline.rs: call guard.record_compaction_success() on
  microcompact success so a prior streak of autocompaction failures
  doesn't leave the circuit breaker tripped after a successful reduction.
- context_pipeline/tool_result_budget.rs: remove the unnecessary out.clone()
  in the truncation return — capture final_bytes first, then move out.
- definition_loader.rs: replace brittle reg.len() == 10 with
  assert!(reg.len() > 1) plus the existing targeted .get() checks.
- executor.rs: tracing::warn! on unknown sandbox override values so typos
  surface during development; explicitly accept "none" and empty string
  as valid defaults.
- fork_context.rs: add parent_context_visible_inside_scope test mirroring
  fork_context_visible_inside_scope, with minimal stub Provider/Memory
  impls so the test stays self-contained.
- schemas.rs: drop redundant serde_json::to_value(serde_json::json!(...))
  wrapping in handle_list_definitions + handle_get_definition.
- event_bus/events.rs: add SubagentSpawned/SubagentCompleted/SubagentFailed
  cases to all_variants_have_correct_domain.
- tools/ops.rs: add all_tools_includes_spawn_subagent regression test.
- tools/spawn_subagent.rs: sort/dedup the known skill list in place
  instead of cloning into a second vec.

Tests: 2316 passed / 0 failed (up from 2314; two new tests added).
fmt + clippy clean on all touched files.
…LI improvements

- Added `event_context` method to `AgentBuilder` for setting `session_id` and `channel` for `DomainEvent`s, improving event tagging and correlation.
- Updated `run_interactive` method in `Agent` to dispatch messages through `run_single`, ensuring consistent lifecycle event handling and error sanitization for interactive turns.
…c for full config in reflection hook. This change reduces unnecessary cloning when learning is enabled, improving performance.
- Introduced a new test, `turn_dispatches_spawn_subagent_in_fork_mode`, to validate the behavior of the agent when spawning a sub-agent in fork mode.
- The test ensures that the parent agent correctly processes the sub-agent's output and maintains the expected response sequence.
- Enhanced the test setup with a mock provider and memory configuration to simulate the agent's environment effectively.
@senamakel
Member Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Apr 10, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Updated the `handle_sync` function to log a debug message and return a no-op response when a skill does not implement the `onSync` handler, preventing unnecessary RPC errors for skills that do not require periodic syncs.
- This change improves logging clarity and reduces error noise in logs and dashboards for skills that are not designed to handle sync operations.
@senamakel senamakel merged commit 6465f3d into tinyhumansai:main Apr 10, 2026
7 of 8 checks passed
sanil-23 added a commit to sanil-23/openhuman that referenced this pull request Apr 16, 2026
build_native_assistant_payload in subagent_runner.rs serialised the
assistant tool-call message using {"text": …, "tool_calls": [{
"type":"function","function":{…}}]} — two mismatches vs the format
parse::build_native_assistant_history produces and the backend
jinja template expects:

  1. "text" instead of "content" — the downstream convert_messages
     parser looks for "content" to detect assistant-with-tool-calls
     messages; "text" is not recognised, so the message is treated
     as a plain assistant message.
  2. Nested {"type":"function","function":{"name":…,"arguments":…}}
     wrappers instead of flat {"id":…,"name":…,"arguments":…}.

Result: every sub-agent that made a tool call and entered iteration
1+ hit a 400 from the backend: "jinja template rendering failed.
Message has tool role, but there was no previous assistant message
with a tool call!" — the backend saw a role=tool message without a
recognised assistant-with-tool-calls predecessor.

The fix replaces the broken inline copy with a direct call to
parse::build_native_assistant_history (the same serialiser
tool_loop.rs uses successfully for the orchestrator's tool calls).
The dead build_native_assistant_payload function is removed.

Introduced in PR tinyhumansai#474 (6465f3d, 2026-04-09). Latent since then
because the orchestrator self-heals by retrying via other agents.

E2E verified: skills_agent with gmail toolkit now progresses through
iterations 0→1→2 successfully (previously died at iteration 1 with
the 400 error every time).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
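
The two mismatches read more clearly side by side. These fragments paraphrase the commit message above; any fields beyond the ones it names (`role`, the example tool name and call id) are illustrative:

```jsonc
// Broken shape emitted by build_native_assistant_payload
// (backend jinja template does not recognize it as a tool-call message):
{
  "role": "assistant",
  "text": "…",
  "tool_calls": [
    { "type": "function", "function": { "name": "gmail__send", "arguments": "{…}" } }
  ]
}

// Expected shape, as produced by parse::build_native_assistant_history:
{
  "role": "assistant",
  "content": "…",
  "tool_calls": [
    { "id": "call_1", "name": "gmail__send", "arguments": "{…}" }
  ]
}
```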
M3gA-Mind pushed a commit that referenced this pull request Apr 16, 2026
…#606)
