feat(agent): sub-agents, reasoning→agentic routing, layered context pipeline #474

senamakel merged 14 commits into tinyhumansai:main
Conversation
…tions

- Refactored the `Agent` struct to use `Arc` for shared ownership of the provider, tools, and tool specifications, enabling efficient memory management and concurrent access.
- Introduced a `NullMemoryLoader` to optimize memory usage for sub-agents, allowing them to operate without incurring the cost of memory recall.
- Added new methods in the `Agent` implementation to facilitate sharing of the provider, tools, and tool specifications with sub-agents, enhancing their operational efficiency.
- Implemented a new `SystemPromptBuilder` method for constructing prompts specifically for sub-agents, ensuring they receive tailored context while minimizing unnecessary information.
- Established a framework for loading custom agent definitions from TOML files, allowing for dynamic agent configuration and specialization.
- Introduced a `ForkContext` to support efficient sub-agent execution in fork mode, leveraging shared resources for improved performance and reduced token usage.
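The `Arc`-sharing and `NullMemoryLoader` ideas above can be sketched in isolation. This is a minimal, hypothetical stand-in: the trait, struct, and method names below (`MemoryLoader`, `share_for_subagent`, tool specs reduced to plain strings) are illustrative assumptions, not the crate's real API.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real memory-loading trait.
trait MemoryLoader: Send + Sync {
    fn recall(&self, query: &str) -> Vec<String>;
}

// Sub-agents get a loader that skips memory recall entirely.
struct NullMemoryLoader;

impl MemoryLoader for NullMemoryLoader {
    fn recall(&self, _query: &str) -> Vec<String> {
        Vec::new() // no recall cost for sub-agents
    }
}

struct Agent {
    tools: Arc<Vec<String>>, // tool specs reduced to names here
    memory: Arc<dyn MemoryLoader>,
}

impl Agent {
    // Sharing with a sub-agent clones only the Arc, not the tool list.
    fn share_for_subagent(&self) -> Agent {
        Agent {
            tools: Arc::clone(&self.tools),
            memory: Arc::new(NullMemoryLoader),
        }
    }
}

fn main() {
    let parent = Agent {
        tools: Arc::new(vec!["shell".to_string(), "spawn_subagent".to_string()]),
        memory: Arc::new(NullMemoryLoader),
    };
    let sub = parent.share_for_subagent();
    assert_eq!(Arc::strong_count(&parent.tools), 2); // shared, not copied
    assert!(sub.memory.recall("project notes").is_empty());
}
```

The sketch shows the intended trade: cloning an `Arc` bumps a reference count instead of copying the tool list, and the null loader makes recall a no-op for sub-agents.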
- Introduced a global `AgentDefinitionRegistry` to manage built-in and custom agent definitions from TOML files, ensuring idempotent initialization.
- Added new RPC handlers for listing, fetching, and reloading agent definitions, improving the flexibility of agent management.
- Refactored the `Agent` struct to streamline sub-agent execution, including enhancements to the task execution flow and context handling.
- Updated the orchestrator configuration to support fork mode for sub-agents, optimizing resource usage and performance.
- Improved error handling and logging for agent definition loading and initialization processes, enhancing system reliability.
- Implemented a new asynchronous test to validate the full path of a parent agent issuing a `spawn_subagent` tool call.
- The test ensures that the sub-agent's output is correctly folded into the parent's response, verifying the interaction between the parent agent and the sub-agent.
- Enhanced the `AgentDefinitionRegistry` to support global initialization of built-in agents, ensuring consistent behavior across tests.
- Updated the handling of tool calls and memory configuration to facilitate the new test scenario, improving overall test coverage for agent interactions.
- Introduced a new `category_filter` in `AgentDefinition` to restrict tool visibility based on their category (System or Skill).
- Updated the `from_archetype` function to apply the category filter for the `SkillsAgent` archetype.
- Modified `SubagentRunOptions` to include a `category_filter_override` for dynamic filtering during sub-agent execution.
- Enhanced the `filter_tool_indices` function to incorporate category filtering logic, ensuring tools are correctly filtered based on their defined categories.
- Updated relevant tests to validate the new category filtering functionality, improving overall test coverage for agent interactions.
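The category-filtering logic described above can be sketched as follows. The `ToolCategory` enum and `ToolSpec` struct are simplified assumptions; only the `filter_tool_indices` name comes from the summary, and its real signature may differ.

```rust
// Hypothetical mirror of the category filter described above.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolCategory {
    System,
    Skill,
}

struct ToolSpec {
    name: &'static str,
    category: ToolCategory,
}

// Keep only the indices of tools whose category passes the filter;
// a `None` filter leaves every tool visible.
fn filter_tool_indices(tools: &[ToolSpec], filter: Option<ToolCategory>) -> Vec<usize> {
    tools
        .iter()
        .enumerate()
        .filter(|(_, t)| filter.map_or(true, |f| t.category == f))
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let tools = [
        ToolSpec { name: "shell", category: ToolCategory::System },
        ToolSpec { name: "notion_search", category: ToolCategory::Skill },
    ];
    let skills_only = filter_tool_indices(&tools, Some(ToolCategory::Skill));
    let names: Vec<_> = skills_only.iter().map(|&i| tools[i].name).collect();
    assert_eq!(names, vec!["notion_search"]);
    assert_eq!(filter_tool_indices(&tools, None).len(), 2);
}
```

Returning indices rather than owned specs keeps the filter cheap when the tool list is shared behind an `Arc`.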
- Introduced a new `context_pipeline` module to manage a layered context reduction strategy, enhancing memory efficiency during agent interactions.
- Added stages for tool-result budgeting, history trimming, microcompaction, autocompaction, and session memory extraction, each with specific triggers and cache implications.
- Updated the `Agent` struct to include a `context_pipeline` field, ensuring state persistence across turns.
- Enhanced the `AgentBuilder` to initialize the context pipeline by default.
- Implemented tests to validate the functionality and stability of the context pipeline, ensuring consistent behavior across agent sessions.
- Refactored relevant components to integrate the new context management features, improving overall agent performance and memory handling.
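The layered escalation the pipeline implies (cheap trimming first, full compaction only when needed, giving up last) can be sketched with a toy outcome enum. The thresholds and field names here are made up for illustration; the `PipelineOutcome` variant names echo the sequence diagram later in this page.

```rust
// Hypothetical, simplified mirror of the layered pipeline described above.
#[derive(Debug, PartialEq)]
enum PipelineOutcome {
    NoOp,
    Microcompacted { bytes_freed: usize },
    AutocompactionRequested,
    ContextExhausted,
}

struct ContextPipeline {
    soft_limit: usize, // microcompact above this
    hard_limit: usize, // request autocompaction above this
    max_limit: usize,  // context is exhausted above this
}

impl ContextPipeline {
    // Stages escalate: check the worst case first, fall through to NoOp.
    fn run_before_call(&self, history_bytes: usize) -> PipelineOutcome {
        if history_bytes > self.max_limit {
            PipelineOutcome::ContextExhausted
        } else if history_bytes > self.hard_limit {
            PipelineOutcome::AutocompactionRequested
        } else if history_bytes > self.soft_limit {
            PipelineOutcome::Microcompacted { bytes_freed: history_bytes - self.soft_limit }
        } else {
            PipelineOutcome::NoOp
        }
    }
}

fn main() {
    let p = ContextPipeline { soft_limit: 100, hard_limit: 200, max_limit: 300 };
    assert_eq!(p.run_before_call(50), PipelineOutcome::NoOp);
    assert_eq!(p.run_before_call(150), PipelineOutcome::Microcompacted { bytes_freed: 50 });
    assert_eq!(p.run_before_call(250), PipelineOutcome::AutocompactionRequested);
    assert_eq!(p.run_before_call(400), PipelineOutcome::ContextExhausted);
}
```

Keeping the decision in one `run_before_call` entry point matches the diagrammed flow, where the agent asks the pipeline once per turn and branches on the returned outcome.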
…mpaction

- Renamed variable for clarity in tool execution result handling.
- Implemented a new stage in the context pipeline to apply a byte budget to tool results, ensuring efficient memory usage.
- Added logging for budget application, including details on original and final byte sizes.
- Integrated microcompaction stages before tool calls to manage history and reduce memory footprint, with appropriate logging for outcomes.
- Updated the agent's session memory management to track turn counts, facilitating better resource handling across iterations.
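A byte budget for tool results amounts to truncating at a UTF-8-safe cut point and noting how much was dropped. This is a sketch under assumed behavior; the trailer format and function name are hypothetical, not the crate's real ones.

```rust
// Hypothetical sketch: apply a byte budget to a tool result, keeping a
// head slice and appending a trailer that records the truncation.
fn apply_byte_budget(result: &str, budget: usize) -> (String, bool) {
    if result.len() <= budget {
        return (result.to_string(), false);
    }
    // Walk back to a char boundary at or below the budget so the slice
    // never splits a multi-byte UTF-8 sequence.
    let mut cut = budget;
    while !result.is_char_boundary(cut) {
        cut -= 1;
    }
    let out = format!("{}\n[truncated {} bytes]", &result[..cut], result.len() - cut);
    (out, true)
}

fn main() {
    let (small, truncated) = apply_byte_budget("ok", 1024);
    assert_eq!(small, "ok");
    assert!(!truncated);

    let big = "x".repeat(2048);
    let (out, truncated) = apply_byte_budget(&big, 1024);
    assert!(truncated);
    assert!(out.contains("[truncated 1024 bytes]"));
}
```

Returning the truncation flag alongside the string lets the caller log original versus final byte sizes, as the commit message describes.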
…l call management

- Simplified method calls in the `Agent` struct for clarity and efficiency.
- Enhanced session memory extraction logic to spawn a background archivist sub-agent when thresholds are met.
- Improved context pipeline handling for tool call recording and usage tracking.
- Updated documentation and comments for better understanding of session memory extraction process.
- Refactored microcompact function for cleaner code structure and readability.
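The turn-count threshold that triggers the background archivist can be sketched as a small counter. The struct and method names below are hypothetical (loosely echoing `mark_extraction_complete` from the review comments further down); the real state machine may track more than turn counts.

```rust
// Hypothetical sketch of threshold-based session-memory extraction.
struct SessionMemory {
    turns: u32,
    extract_every: u32,     // assumed threshold between extractions
    last_extracted_at: u32, // turn count at the last extraction
}

impl SessionMemory {
    fn tick_turn(&mut self) {
        self.turns += 1;
    }

    // Fire only when enough turns have passed since the last extraction.
    fn should_extract(&self) -> bool {
        self.turns - self.last_extracted_at >= self.extract_every
    }

    // Marking complete immediately (fire-and-forget) means a failed
    // background extraction simply retries at the next threshold.
    fn mark_extraction_complete(&mut self) {
        self.last_extracted_at = self.turns;
    }
}

fn main() {
    let mut sm = SessionMemory { turns: 0, extract_every: 5, last_extracted_at: 0 };
    for _ in 0..4 {
        sm.tick_turn();
    }
    assert!(!sm.should_extract());
    sm.tick_turn(); // fifth turn crosses the threshold
    assert!(sm.should_extract());
    sm.mark_extraction_complete();
    assert!(!sm.should_extract());
}
```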
📝 Walkthrough

Adds a declarative sub-agent system, global AgentDefinition registry and TOML loader, an async subagent runner (typed + fork modes) with task-local contexts, a layered context-reduction pipeline (microcompact, tool-result budgeting, session-memory), spawn_subagent tool wiring, and a refactor splitting Agent into modular builder/runtime/turn components.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent
    participant ContextPipeline
    participant Microcompact
    participant Provider
    participant SessionMemory
    Agent->>ContextPipeline: tick_turn() / record_usage()
    Agent->>ContextPipeline: run_before_call(&mut history)
    alt CompactionNeeded
        ContextPipeline->>Microcompact: microcompact(history, keep_recent)
        Microcompact-->>ContextPipeline: MicrocompactStats
        ContextPipeline-->>Agent: PipelineOutcome::Microcompacted
    else AutocompactionRequested
        ContextPipeline-->>Agent: PipelineOutcome::AutocompactionRequested
    else ContextExhausted
        ContextPipeline-->>Agent: PipelineOutcome::ContextExhausted
    else NoOp
        ContextPipeline-->>Agent: PipelineOutcome::NoOp
    end
    Agent->>Provider: chat(messages)
    Provider-->>Agent: response + usage
    Agent->>ContextPipeline: record_tool_calls(n)
    Agent->>SessionMemory: should_extract_session_memory()?
    alt Extraction needed
        SessionMemory-->>Agent: true
        Agent->>Agent: spawn archivist sub-agent (background)
    end
```
```mermaid
sequenceDiagram
    participant Parent as Parent Agent
    participant SpawnTool as SpawnSubagent Tool
    participant Registry as AgentDefinitionRegistry
    participant Runner as Subagent Runner
    participant Provider as Sub-agent Provider
    participant ForkCtx as Fork Context
    Parent->>SpawnTool: spawn_subagent(agent_id, prompt, opts)
    SpawnTool->>Registry: get(agent_id)
    Registry-->>SpawnTool: AgentDefinition
    alt Fork Mode
        SpawnTool->>ForkCtx: with_fork_context(message_prefix, fork_prompt)
        ForkCtx->>Runner: run_subagent(def, prompt, mode=fork)
        Runner->>Provider: call(parent_prefix + fork_prompt)
        Provider-->>Runner: assistant messages
    else Typed Mode
        SpawnTool->>Runner: run_subagent(def, prompt, mode=typed)
        Runner->>Runner: load archetype prompt, filter tools
        Runner->>Provider: call(filtered_messages)
        Provider-->>Runner: assistant messages
    end
    Runner-->>SpawnTool: SubagentRunOutcome
    SpawnTool-->>Parent: ToolResult(sub-agent output)
    Parent->>Parent: append tool-result to history
```
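The fork-context scoping in the diagram above can be sketched with a synchronous analogue. The real code reportedly uses async task-locals (`with_fork_context` over a task scope); everything here, including the thread-local and field names, is a simplified stand-in for illustration only.

```rust
use std::cell::RefCell;

// Hypothetical synchronous analogue of the diagrammed fork context.
#[derive(Clone, Debug, PartialEq)]
struct ForkContext {
    message_prefix: String,
    fork_prompt: String,
}

thread_local! {
    static FORK_CONTEXT: RefCell<Option<ForkContext>> = RefCell::new(None);
}

// Context is visible only while `f` runs, mirroring with_fork_context.
fn with_fork_context<R>(ctx: ForkContext, f: impl FnOnce() -> R) -> R {
    FORK_CONTEXT.with(|c| *c.borrow_mut() = Some(ctx));
    let out = f();
    FORK_CONTEXT.with(|c| *c.borrow_mut() = None);
    out
}

fn current_fork() -> Option<ForkContext> {
    FORK_CONTEXT.with(|c| c.borrow().clone())
}

fn main() {
    assert!(current_fork().is_none()); // nothing outside a scope
    let ctx = ForkContext {
        message_prefix: "parent history".into(),
        fork_prompt: "investigate the failing build".into(),
    };
    let seen = with_fork_context(ctx.clone(), current_fork);
    assert_eq!(seen, Some(ctx));
    assert!(current_fork().is_none()); // cleared once the scope ends
}
```

The scope-bounded visibility here is the same property the `fork_context_visible_inside_scope` test asserts in the review comments below: present inside the scope, `None` outside it.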
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 3 passed checks (3 passed)
Actionable comments posted: 3
🧹 Nitpick comments (10)
src/openhuman/tools/ops.rs (1)
78-84: Add a focused test asserting `spawn_subagent` is present in `all_tools_with_runtime`.

Since this is a key behavioral addition, a direct assertion in the test module would guard against accidental registry regressions.
[suggested fix]
Example test addition
```diff
+    #[test]
+    fn all_tools_includes_spawn_subagent() {
+        let tmp = TempDir::new().unwrap();
+        let security = Arc::new(SecurityPolicy::default());
+        let mem_cfg = MemoryConfig { backend: "markdown".into(), ..MemoryConfig::default() };
+        let mem: Arc<dyn Memory> =
+            Arc::from(crate::openhuman::memory::create_memory(&mem_cfg, tmp.path(), None).unwrap());
+        let cfg = test_config(&tmp);
+
+        let tools = all_tools(
+            Arc::new(Config::default()),
+            &security,
+            mem,
+            None,
+            None,
+            &BrowserConfig::default(),
+            &crate::openhuman::config::HttpRequestConfig::default(),
+            tmp.path(),
+            &HashMap::new(),
+            None,
+            &cfg,
+        );
+        let names: Vec<&str> = tools.iter().map(|t| t.name()).collect();
+        assert!(names.contains(&"spawn_subagent"));
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/tools/ops.rs` around lines 78 - 84, Add a unit test that asserts the spawn_subagent tool is registered in the runtime tools list: call the all_tools_with_runtime() registry function (or the test helper that returns the Vec/Map of tools) and assert it contains an entry corresponding to SpawnSubagentTool::new() (check for the tool's public identifier "spawn_subagent" or the tool's type/name), failing the test if that identifier is missing; place the test in the existing test module for tools so future regressions to the registry are caught.

src/openhuman/event_bus/events.rs (1)
230-233: Consider adding test coverage for new sub-agent event variants.

The `all_variants_have_correct_domain` test doesn't include the new `SubagentSpawned`, `SubagentCompleted`, and `SubagentFailed` variants. While the match arm correctly routes them to `"agent"`, adding explicit test cases would maintain the comprehensive coverage pattern established by existing tests.

💡 Example test cases to add
```rust
// Add to the `cases` vec in all_variants_have_correct_domain test:
(
    DomainEvent::SubagentSpawned {
        parent_session: "p".into(),
        agent_id: "a".into(),
        mode: "typed".into(),
        task_id: "t".into(),
        prompt_chars: 100,
    },
    "agent",
),
(
    DomainEvent::SubagentCompleted {
        parent_session: "p".into(),
        task_id: "t".into(),
        agent_id: "a".into(),
        elapsed_ms: 0,
        output_chars: 0,
        iterations: 0,
    },
    "agent",
),
(
    DomainEvent::SubagentFailed {
        parent_session: "p".into(),
        task_id: "t".into(),
        agent_id: "a".into(),
        error: "e".into(),
    },
    "agent",
),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/event_bus/events.rs` around lines 230 - 233, The test all_variants_have_correct_domain is missing explicit cases for the new DomainEvent variants SubagentSpawned, SubagentCompleted, and SubagentFailed; add entries to the cases Vec in that test with representative values for each of those variants and expect the domain "agent" (e.g., create DomainEvent::SubagentSpawned { ... }, DomainEvent::SubagentCompleted { ... }, DomainEvent::SubagentFailed { ... } paired with "agent") so the test covers the new match arm that returns "agent".

src/openhuman/agent/schemas.rs (1)
250-258: Redundant `serde_json::to_value` wrapping `serde_json::json!`.

`serde_json::json!` already returns a `serde_json::Value`, so wrapping it in `to_value()` is unnecessary and adds a fallible conversion that can't fail in practice.

♻️ Simplified serialization
```diff
 fn handle_list_definitions(_params: Map<String, Value>) -> ControllerFuture {
     Box::pin(async {
         let registry = crate::openhuman::agent::harness::AgentDefinitionRegistry::global()
             .ok_or_else(|| "AgentDefinitionRegistry not initialised".to_string())?;
         let defs: Vec<&crate::openhuman::agent::harness::AgentDefinition> = registry.list();
-        serde_json::to_value(serde_json::json!({ "definitions": defs }))
-            .map_err(|e| format!("serialise: {e}"))
+        Ok(serde_json::json!({ "definitions": defs }))
     })
 }
```

Same applies to `handle_get_definition` at lines 271-272.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/schemas.rs` around lines 250 - 258, Both handle_list_definitions and handle_get_definition are calling serde_json::to_value on a value already produced by serde_json::json!, which is redundant and introduces an unnecessary fallible conversion; update handle_list_definitions (and handle_get_definition) to directly return Ok(serde_json::json!({ "definitions": defs })) (or Ok(serde_json::json!({ "definition": def })) respectively) instead of wrapping json! in serde_json::to_value(...). Keep the existing error propagation from AgentDefinitionRegistry::global() and registry.list() unchanged so the function still returns the same Result/ControllerFuture type.

src/openhuman/agent/context_pipeline/tool_result_budget.rs (1)
104-111: Unnecessary `out.clone()` before return.

The `out` String is moved into the tuple on return. The `.clone()` creates an extra allocation.

Remove clone
```diff
 (
-    out.clone(),
+    out,
     BudgetOutcome {
         original_bytes,
-        final_bytes: out.len(),
+        final_bytes: cut + TRAILER_RESERVED, // or compute before move
         truncated: true,
     },
 )
```

Since `out.len()` is needed for `final_bytes`, you can capture it before moving:

```diff
+let final_bytes = out.len();
 (
-    out.clone(),
+    out,
     BudgetOutcome {
         original_bytes,
-        final_bytes: out.len(),
+        final_bytes,
         truncated: true,
     },
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/context_pipeline/tool_result_budget.rs` around lines 104 - 111, The return is cloning `out` unnecessarily; compute final_bytes before moving `out` and return the owned `out` into the tuple instead of `out.clone()`. Specifically, capture `let final_bytes = out.len();` (or similar) and then return `(out, BudgetOutcome { original_bytes, final_bytes, truncated: true })`, removing the `out.clone()` call.

src/openhuman/agent/harness/executor.rs (1)
405-411: Consider logging unrecognized sandbox values.

Unrecognized strings silently fall back to `SandboxMode::None`. A `tracing::warn!` when the value is neither `"sandboxed"` nor `"read_only"` nor empty would surface config typos during development.

Example
```diff
 if let Some(sb) = over.sandbox.as_deref() {
     definition.sandbox_mode = match sb {
         "sandboxed" => super::definition::SandboxMode::Sandboxed,
         "read_only" => super::definition::SandboxMode::ReadOnly,
-        _ => super::definition::SandboxMode::None,
+        other => {
+            if !other.is_empty() {
+                tracing::warn!(
+                    archetype = %archetype,
+                    value = other,
+                    "[orchestrator] unrecognized sandbox override, defaulting to None"
+                );
+            }
+            super::definition::SandboxMode::None
+        }
     };
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/executor.rs` around lines 405 - 411, The current match in executor.rs that sets definition.sandbox_mode from over.sandbox (the block using over.sandbox.as_deref()) silently maps any unexpected string to SandboxMode::None; update this to emit a tracing::warn! (including the unrecognized value and context such as the agent or definition id if available) in the default arm when sb is not "sandboxed" or "read_only" and not empty, so typos are surfaced during development while preserving the fallback to super::definition::SandboxMode::None.

src/openhuman/tools/spawn_subagent.rs (1)
288-301: Minor: avoid cloning before sort+dedup.

`known` is consumed by this function, so you can sort and dedup it in place rather than cloning into `unique`.

Simplified dedup
```diff
-let mut unique = known.clone();
-unique.sort();
-unique.dedup();
-if unique.iter().any(|s| s == skill_id) {
+known.sort();
+known.dedup();
+if known.iter().any(|s| s == skill_id) {
     Ok(())
 } else {
     Err(format!(
         "skill_filter '{skill_id}' does not match any installed skill. Available: {}",
-        unique.join(", ")
+        known.join(", ")
     ))
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/tools/spawn_subagent.rs` around lines 288 - 301, The code clones `known` into `unique` before sorting/deduping which is unnecessary because `known` is not used elsewhere; remove the `unique` clone and perform sort+dedup in place on `known` (call known.sort(); known.dedup();) and then use known.iter().any(|s| s == skill_id) for the membership check; the transformation should be applied after you build `known` from engine.all_tools() by splitting def.name on "__".

src/openhuman/agent/harness/definition_loader.rs (1)
193-206: Test assumes fixed builtin count.

Lines 201-202 assert `reg.len() == 10` (9 builtins + 1 custom). If builtins are added/removed, this test will fail. Consider asserting `reg.len() > 1` and that specific IDs exist rather than exact counts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/definition_loader.rs` around lines 193 - 206, The test registry_load_merges_builtins_and_custom assumes an exact builtin count by asserting reg.len() == 10; change this to a non-brittle assertion (e.g., assert!(reg.len() > 1)) and keep the targeted checks for specific IDs using reg.get("notion_specialist"), reg.get("code_executor"), and reg.get("fork") to ensure the builtins and the custom definition loaded by AgentDefinitionRegistry::load are present without relying on a fixed total count.

src/openhuman/agent/agent.rs (1)
1388-1398: Optimistic extraction marking is correct for fire-and-forget.

The comment (lines 1388-1392) clearly explains the trade-off: marking complete immediately avoids needing a channel back from the background task. Failed extractions retry after the next threshold crossing, which is acceptable for an idempotent archivist.

Consider whether `mark_extraction_started()` adds value if it's immediately followed by `mark_extraction_complete()`; the started state is never observable externally. If there's no telemetry or conditional logic relying on the "in-flight" state, you could simplify to just `mark_extraction_complete()`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agent.rs` around lines 1388 - 1398, The call sequence marks extraction started then immediately complete, making mark_extraction_started() effectively no-op; update the code in agent.rs by removing the call to self.context_pipeline.session_memory.mark_extraction_started() and keep only self.context_pipeline.session_memory.mark_extraction_complete(), unless there is telemetry or conditional logic that explicitly depends on the "in-flight" state—if such dependencies exist, retain started and add the necessary observable usage instead.src/openhuman/agent/context_pipeline/pipeline.rs (1)
151-163: Consider calling `guard.record_compaction_success()` after successful microcompact.

When microcompact clears envelopes (line 155), this is a successful compaction that should reset the guard's circuit breaker. Currently, only the caller would need to call this, but the pipeline could do it internally for consistency.
♻️ Reset circuit breaker after successful microcompact
Based on the `ContextGuard` snippet showing `record_compaction_success()` resets the failure counter, consider calling it here:

```diff
 if stats.envelopes_cleared > 0 {
+    self.guard.record_compaction_success();
     tracing::info!(
         envelopes_cleared = stats.envelopes_cleared,
         entries_cleared = stats.entries_cleared,
         bytes_freed = stats.bytes_freed,
         "[context_pipeline] microcompact fired"
     );
     return PipelineOutcome::Microcompacted(stats);
 }
```

This ensures the circuit breaker is reset when microcompact succeeds, preventing unnecessary escalation to autocompaction on subsequent turns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/context_pipeline/pipeline.rs` around lines 151 - 163, When microcompaction succeeds (ContextCheckResult::CompactionNeeded branch) and stats.envelopes_cleared > 0, call the ContextGuard method record_compaction_success() to reset the compaction circuit breaker before returning PipelineOutcome::Microcompacted(stats); locate the microcompact(history, self.config.microcompact_keep_recent) call and insert a guard.record_compaction_success() (using the same guard instance used for the pipeline) immediately after the success check and before the tracing::info/return path so successful microcompaction clears the failure counter.

src/openhuman/agent/harness/fork_context.rs (1)
172-206: Tests verify scope boundaries correctly.

The tests confirm that both task-locals return `None` outside their scopes and that `ForkContext` is visible inside `with_fork_context`. Consider adding a similar visibility test for `ParentExecutionContext` for completeness.

💡 Optional: Add a visibility test for ParentExecutionContext
```rust
#[tokio::test]
async fn parent_context_visible_inside_scope() {
    use crate::openhuman::config::{AgentConfig, IdentityConfig};
    // Construct a minimal ParentExecutionContext and verify it's accessible inside scope
    // (would require test helpers for Provider/Memory mocks)
}
```

This would mirror the `fork_context_visible_inside_scope` test and ensure symmetric coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/fork_context.rs` around lines 172 - 206, Add a test mirroring fork_context_visible_inside_scope that ensures ParentExecutionContext is visible inside its task-local scope and None outside: create a minimal ParentExecutionContext instance, call with_parent_context(<that instance>, async { let p = current_parent().expect("parent context should be visible"); assert on a couple of fields }); .await, and then assert current_parent().is_none(); reference the ParentExecutionContext type and the with_parent_context and current_parent helpers to locate where to add the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/harness/definition.rs`:
- Around line 7-9: The docs are inconsistent about config format: the module
docstring and the load method mention YAML (`*.yaml`) while
DefinitionSource::File's docstring says "TOML file"; pick the actual supported
format (or state both) and make all docstrings consistent. Update the
module-level comment at the top of definition.rs, the DefinitionSource::File
docstring (the enum variant), and the load method docstring to uniformly state
the correct format (or explicitly list supported formats and precedence),
ensuring the same extension and parser name appear in each place so users are
not confused.
In `@src/openhuman/agent/harness/subagent_runner.rs`:
- Around line 646-658: render_subagent_system_prompt currently only appends
definition.system_prompt when it's PromptSource::Inline, so PromptSource::File
content that was preloaded into archetype_prompt_body is never written into out;
update render_subagent_system_prompt to accept an additional archetype_body:
&str (or Option<&str>) parameter and, when definition.system_prompt is
PromptSource::File, append archetype_body.trim_end() + "\n\n" into out (fall
back to existing Inline behavior otherwise). Also update the callsite that
constructs SystemPromptBuilder::for_subagent to pass &archetype_prompt_body (or
Some(&archetype_prompt_body)) so the loaded archetype prompt is actually
rendered into the system prompt.
In `@src/openhuman/agent/prompt.rs`:
- Around line 59-66: The sub-agent prompt assembly currently injects a dynamic
DateTimeSection using Local::now(), which makes the sub-agent system prompt
change per call and breaks KV-prefix cache stability; remove the DateTimeSection
(and any use of Local::now()) from the narrow prompt builder so the sub-agent
prefix is deterministic, keeping the other sections (workspace, archetype
section, and respect for omit_identity/omit_safety_preamble flags in the
AgentDefinition) intact; if a timestamp is required for display only, replace
DateTimeSection with a deterministic placeholder or configuration-time value
instead of Local::now().
---
Nitpick comments:
In `@src/openhuman/agent/agent.rs`:
- Around line 1388-1398: The call sequence marks extraction started then
immediately complete, making mark_extraction_started() effectively no-op; update
the code in agent.rs by removing the call to
self.context_pipeline.session_memory.mark_extraction_started() and keep only
self.context_pipeline.session_memory.mark_extraction_complete(), unless there is
telemetry or conditional logic that explicitly depends on the "in-flight"
state—if such dependencies exist, retain started and add the necessary
observable usage instead.
In `@src/openhuman/agent/context_pipeline/pipeline.rs`:
- Around line 151-163: When microcompaction succeeds
(ContextCheckResult::CompactionNeeded branch) and stats.envelopes_cleared > 0,
call the ContextGuard method record_compaction_success() to reset the compaction
circuit breaker before returning PipelineOutcome::Microcompacted(stats); locate
the microcompact(history, self.config.microcompact_keep_recent) call and insert
a guard.record_compaction_success() (using the same guard instance used for the
pipeline) immediately after the success check and before the
tracing::info/return path so successful microcompaction clears the failure
counter.
In `@src/openhuman/agent/context_pipeline/tool_result_budget.rs`:
- Around line 104-111: The return is cloning `out` unnecessarily; compute
final_bytes before moving `out` and return the owned `out` into the tuple
instead of `out.clone()`. Specifically, capture `let final_bytes = out.len();`
(or similar) and then return `(out, BudgetOutcome { original_bytes, final_bytes,
truncated: true })`, removing the `out.clone()` call.
In `@src/openhuman/agent/harness/definition_loader.rs`:
- Around line 193-206: The test registry_load_merges_builtins_and_custom assumes
an exact builtin count by asserting reg.len() == 10; change this to a
non-brittle assertion (e.g., assert!(reg.len() > 1)) and keep the targeted
checks for specific IDs using reg.get("notion_specialist"),
reg.get("code_executor"), and reg.get("fork") to ensure the builtins and the
custom definition loaded by AgentDefinitionRegistry::load are present without
relying on a fixed total count.
In `@src/openhuman/agent/harness/executor.rs`:
- Around line 405-411: The current match in executor.rs that sets
definition.sandbox_mode from over.sandbox (the block using
over.sandbox.as_deref()) silently maps any unexpected string to
SandboxMode::None; update this to emit a tracing::warn! (including the
unrecognized value and context such as the agent or definition id if available)
in the default arm when sb is not "sandboxed" or "read_only" and not empty, so
typos are surfaced during development while preserving the fallback to
super::definition::SandboxMode::None.
In `@src/openhuman/agent/harness/fork_context.rs`:
- Around line 172-206: Add a test mirroring fork_context_visible_inside_scope
that ensures ParentExecutionContext is visible inside its task-local scope and
None outside: create a minimal ParentExecutionContext instance, call
with_parent_context(<that instance>, async { let p =
current_parent().expect("parent context should be visible"); assert on a couple
of fields }); .await, and then assert current_parent().is_none(); reference the
ParentExecutionContext type and the with_parent_context and current_parent
helpers to locate where to add the test.
In `@src/openhuman/agent/schemas.rs`:
- Around line 250-258: Both handle_list_definitions and handle_get_definition
are calling serde_json::to_value on a value already produced by
serde_json::json!, which is redundant and introduces an unnecessary fallible
conversion; update handle_list_definitions (and handle_get_definition) to
directly return Ok(serde_json::json!({ "definitions": defs })) (or
Ok(serde_json::json!({ "definition": def })) respectively) instead of wrapping
json! in serde_json::to_value(...). Keep the existing error propagation from
AgentDefinitionRegistry::global() and registry.list() unchanged so the function
still returns the same Result/ControllerFuture type.
In `@src/openhuman/event_bus/events.rs`:
- Around line 230-233: The test all_variants_have_correct_domain is missing
explicit cases for the new DomainEvent variants SubagentSpawned,
SubagentCompleted, and SubagentFailed; add entries to the cases Vec in that test
with representative values for each of those variants and expect the domain
"agent" (e.g., create DomainEvent::SubagentSpawned { ... },
DomainEvent::SubagentCompleted { ... }, DomainEvent::SubagentFailed { ... }
paired with "agent") so the test covers the new match arm that returns "agent".
In `@src/openhuman/tools/ops.rs`:
- Around line 78-84: Add a unit test that asserts the spawn_subagent tool is
registered in the runtime tools list: call the all_tools_with_runtime() registry
function (or the test helper that returns the Vec/Map of tools) and assert it
contains an entry corresponding to SpawnSubagentTool::new() (check for the
tool's public identifier "spawn_subagent" or the tool's type/name), failing the
test if that identifier is missing; place the test in the existing test module
for tools so future regressions to the registry are caught.
In `@src/openhuman/tools/spawn_subagent.rs`:
- Around line 288-301: The code clones `known` into `unique` before
sorting/deduping which is unnecessary because `known` is not used elsewhere;
remove the `unique` clone and perform sort+dedup in place on `known` (call
known.sort(); known.dedup();) and then use known.iter().any(|s| s == skill_id)
for the membership check; the transformation should be applied after you build
`known` from engine.all_tools() by splitting def.name on "__".
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 84d8dc14-b0b7-4db0-a728-c4d6fedac753
📒 Files selected for processing (28)

- src/core/jsonrpc.rs
- src/openhuman/agent/agent.rs
- src/openhuman/agent/context_pipeline/microcompact.rs
- src/openhuman/agent/context_pipeline/mod.rs
- src/openhuman/agent/context_pipeline/pipeline.rs
- src/openhuman/agent/context_pipeline/session_memory.rs
- src/openhuman/agent/context_pipeline/tool_result_budget.rs
- src/openhuman/agent/harness/builtin_definitions.rs
- src/openhuman/agent/harness/definition.rs
- src/openhuman/agent/harness/definition_loader.rs
- src/openhuman/agent/harness/executor.rs
- src/openhuman/agent/harness/fork_context.rs
- src/openhuman/agent/harness/mod.rs
- src/openhuman/agent/harness/subagent_runner.rs
- src/openhuman/agent/memory_loader.rs
- src/openhuman/agent/mod.rs
- src/openhuman/agent/prompt.rs
- src/openhuman/agent/schemas.rs
- src/openhuman/channels/runtime/startup.rs
- src/openhuman/config/schema/agent.rs
- src/openhuman/config/schema/orchestrator.rs
- src/openhuman/config/schema/types.rs
- src/openhuman/event_bus/events.rs
- src/openhuman/tools/mod.rs
- src/openhuman/tools/ops.rs
- src/openhuman/tools/skill_bridge.rs
- src/openhuman/tools/spawn_subagent.rs
- src/openhuman/tools/traits.rs
- Convert agent.rs (1988 lines) into agent/ folder with six files:
  * types.rs — Agent + AgentBuilder struct defs
  * builder.rs — AgentBuilder fluent API + Agent::from_config factory
  * turn.rs — turn lifecycle, tool dispatch, context pipeline wiring
  * runtime.rs — public accessors, run_single/run_interactive, helpers
  * tests.rs — integration tests with shared fakes
  * mod.rs — glue + top-level `run` convenience function
- Drop misc external inspiration references from doc comments in the context_pipeline module and fork_context — the files now stand on their own design language.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (4)
src/openhuman/agent/agent/tests.rs (1)
277-379: Add a fork-mode regression test. This end-to-end case covers sub-agent dispatch well, but it does not explicitly pin the `mode: "fork"` path or assert the prefix-replay behavior introduced in this PR. A dedicated fork-mode test would keep the KV-cache reuse path from regressing silently. Based on learnings: keep functionality validation in Rust tests for `rust-core`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agent/tests.rs` around lines 277 - 379, Add a new tokio test (e.g. turn_dispatches_spawn_subagent_in_fork_mode) that mirrors turn_dispatches_spawn_subagent_through_full_path but sets the SpawnSubagentTool call arguments to include "mode": "fork"; initialize builtins via AgentDefinitionRegistry::init_global_builtins(), build the Agent with the same MockProvider and tools (SpawnSubagentTool::new()), call Agent::turn and assert the final response equals the expected folded text, then assert the parent's history contains the spawn_subagent assistant tool call and a tool-result/chat tool entry containing the sub-agent output (same checks as existing test); additionally assert the provider response consumption order demonstrates prefix-replay (i.e. the mocked responses vector is consumed in the expected sequence: parent tool_call, sub-agent reply, parent folded reply) to cover the fork-mode KV-cache reuse path.
src/openhuman/agent/agent/builder.rs (3)
341-341: Minor: `full_config` Arc allocated even when unused. The `full_config` Arc is created unconditionally when `learning.enabled` is true, but it's only used inside the `reflection_enabled` block (line 362). If reflection is disabled, the Arc allocation and clone are wasted.
♻️ Suggested refactor
 if config.learning.reflection_enabled {
+    let full_config = Arc::new(config.clone());
     // For cloud reflection, wrap the provider in an Arc.
     // For local, no provider needed.
     let reflection_provider: Option<Arc<dyn crate::openhuman::providers::Provider>> =
And remove the `let full_config = Arc::new(config.clone());` from line 341.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agent/builder.rs` at line 341, The code unconditionally creates full_config = Arc::new(config.clone()) when learning.enabled is true even though full_config is only used inside the reflection_enabled branch; to avoid the wasted allocation, move the Arc::new(config.clone()) expression into the block where reflection_enabled is true (or lazily create it only if reflection_enabled) and remove the original unconditional let full_config declaration; update any references to full_config inside the reflection handling so they use the newly created Arc from that branch and keep the existing checks for learning.enabled and reflection_enabled intact.
173-181: Add rustdoc for the `event_context` method. All other builder methods have documentation comments explaining their purpose, but `event_context` is missing one. For consistency and discoverability, add a brief doc comment explaining what the session_id and channel parameters represent.
📝 Suggested documentation
+ /// Sets the event context for the agent (session identifier and channel).
+ ///
+ /// Used for correlating agent activity with external event streams.
  pub fn event_context(
      mut self,
      session_id: impl Into<String>,
      channel: impl Into<String>,
  ) -> Self {
Based on learnings: "New or changed behavior must ship with matching documentation."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agent/builder.rs` around lines 173 - 181, Add a rustdoc comment to the builder method event_context explaining its purpose and the meaning of its parameters: state that event_context sets the event_session_id (session_id) used to group events for a user/session and event_channel (channel) indicating the source or stream for those events; keep it brief, use /// style above the pub fn event_context(...) signature and mention that both parameters are converted into Strings and stored in event_session_id and event_channel respectively.
228-228: Consider adding a setter for `context_pipeline`. The builder uses `ContextPipeline::default()` unconditionally, but there's no corresponding setter method. If the pipeline configuration should always be default, this is fine. However, if future use cases need custom pipeline configurations (e.g., different budget thresholds, disabled stages), a setter would be needed. If intentional, a brief comment explaining why the pipeline isn't configurable would help future maintainers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agent/builder.rs` at line 228, The builder currently hardcodes context_pipeline to ContextPipeline::default() (context_pipeline: ContextPipeline::default()) with no setter; add a fluent setter method (e.g., with_context_pipeline or set_context_pipeline) on the builder type that accepts a ContextPipeline and assigns it to self.context_pipeline (returning Self or &mut Self consistent with other builder methods) so callers can supply custom pipeline configurations; alternatively, if the default is intentional, add a short doc comment above the context_pipeline field or its initialization explaining why custom pipelines are not supported.
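The consuming-`self` setter pattern the comment asks for can be sketched as follows; `ContextPipeline` and its `budget_bytes` field are illustrative stand-ins here, not the PR's actual types:

```rust
// Hypothetical sketch of a fluent builder setter for the pipeline,
// assuming a simplified ContextPipeline stand-in type.
#[derive(Default, Debug)]
struct ContextPipeline {
    budget_bytes: usize,
}

#[derive(Default)]
struct AgentBuilder {
    context_pipeline: ContextPipeline,
}

impl AgentBuilder {
    /// Overrides the default context pipeline configuration.
    fn with_context_pipeline(mut self, pipeline: ContextPipeline) -> Self {
        self.context_pipeline = pipeline;
        self
    }
}

fn main() {
    let builder = AgentBuilder::default()
        .with_context_pipeline(ContextPipeline { budget_bytes: 4096 });
    assert_eq!(builder.context_pipeline.budget_bytes, 4096);
    println!("budget_bytes={}", builder.context_pipeline.budget_bytes);
}
```

Returning `Self` keeps the method chainable alongside the builder's other fluent setters.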
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/agent/runtime.rs`:
- Around line 287-299: The CLI path in run_interactive currently calls
self.turn(...) directly which bypasses lifecycle events (AgentTurnStarted,
AgentTurnCompleted, AgentError) and custom error/telemetry handling; change
run_interactive to reuse run_single (or the same call path run_single uses) for
each incoming message from rx.recv() so each turn goes through the unified
lifecycle and error handling instead of calling self.turn directly; ensure
errors returned from run_single are logged/handled the same way the non-CLI path
does and keep the existing listen_handle.abort() behavior after the loop.
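The unification the comment describes can be sketched with simplified stand-in types (no async channels or real events here; the counters stand in for `AgentTurnStarted`/`AgentTurnCompleted` emission):

```rust
// Sketch: interactive turns route through the same single-turn entry
// point so lifecycle events and error handling stay unified.
struct Agent {
    turns_started: u32,
    turns_completed: u32,
}

impl Agent {
    fn run_single(&mut self, msg: &str) -> Result<String, String> {
        self.turns_started += 1; // stands in for AgentTurnStarted
        let reply = format!("echo: {msg}"); // stands in for self.turn(...)
        self.turns_completed += 1; // stands in for AgentTurnCompleted
        Ok(reply)
    }

    fn run_interactive(&mut self, inbox: &[&str]) {
        for msg in inbox {
            // Reuse run_single instead of calling turn() directly so
            // every turn emits the same lifecycle events.
            if let Err(e) = self.run_single(msg) {
                eprintln!("turn failed: {e}");
            }
        }
    }
}

fn main() {
    let mut agent = Agent { turns_started: 0, turns_completed: 0 };
    agent.run_interactive(&["hi", "there"]);
    assert_eq!(agent.turns_started, 2);
    assert_eq!(agent.turns_completed, 2);
    println!("ok");
}
```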
In `@src/openhuman/agent/agent/turn.rs`:
- Around line 56-60: The current log::info call in turn.rs prints the full
system_prompt (variable system_prompt) which can contain sensitive
user/workspace/memory data; change the logging in the agent_loop so it does NOT
emit the full prompt body at info level — instead log only non-sensitive
metadata such as system_prompt.chars().count(), a truncated/redacted preview
(e.g., first N chars) or a stable hash/fingerprint, and/or a "redacted"
placeholder; update the log::info invocation that references system_prompt to
use one of these safe alternatives and ensure any detailed prompt content is
removed or only logged at a secure debug/audit level.
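One safe-logging shape the comment suggests — length plus a stable fingerprint instead of the prompt body — can be sketched with the standard library; the helper names are illustrative, not the crate's real API:

```rust
// Sketch: log only non-sensitive metadata about the system prompt.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn prompt_fingerprint(prompt: &str) -> String {
    // Stable within a process run; enough to correlate log lines
    // without leaking content.
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn log_prompt_metadata(prompt: &str) -> String {
    format!(
        "system prompt: {} chars, fingerprint {}",
        prompt.chars().count(),
        prompt_fingerprint(prompt)
    )
}

fn main() {
    let prompt = "secret workspace memory contents";
    let line = log_prompt_metadata(prompt);
    // The log line carries size + hash only, never the prompt body.
    assert!(!line.contains("secret"));
    println!("{line}");
}
```

The full prompt can still be emitted at a debug/audit level gated behind explicit configuration.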
- Around line 294-306: Agent::turn() currently writes pre-tool assistant text
directly to stdout (the print!("{text}") and stdout flush) which couples core
logic to a terminal; remove those stdout writes from the block that also pushes
the assistant message into history (the call to
self.history.push(ConversationMessage::Chat(ChatMessage::assistant(text.clone()))))
and only keep the history update and logging, and instead surface rendering to
callers (e.g., have the CLI wrapper or caller render the latest history entry or
provide an optional callback/renderer parameter to Agent::turn that can be used
by CLI code to print). Ensure no direct print!/stdout flush remains in turn.rs
so sub-agents and library consumers don’t get terminal output.
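The optional-renderer variant mentioned above can be sketched like this, with simplified stand-in types (the real `Agent::turn` is async and pushes `ConversationMessage` values):

```rust
// Sketch: core turn logic records history only; rendering is injected
// by the caller, so library/sub-agent consumers stay silent.
struct Agent {
    history: Vec<String>,
}

impl Agent {
    fn turn(&mut self, text: &str, render: Option<&dyn Fn(&str)>) {
        // History update stays in core logic.
        self.history.push(format!("assistant: {text}"));
        // Terminal output only happens if the caller asked for it.
        if let Some(render) = render {
            render(text);
        }
    }
}

fn main() {
    let mut agent = Agent { history: vec![] };
    // Sub-agents and library consumers pass None: no stdout writes.
    agent.turn("quiet", None);
    // The CLI wrapper supplies a printer.
    agent.turn("loud", Some(&|t: &str| println!("{t}")));
    assert_eq!(agent.history.len(), 2);
}
```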
- Around line 51-75: The code currently calls fetch_learned_context()
unconditionally at the top of turn handling, but then discards its result on
subsequent turns; move the call so learned context is only fetched when needed
(i.e., inside the if self.history.is_empty() branch before build_system_prompt)
to avoid extra storage reads on later turns. Update references to learned so
build_system_prompt(learned) still receives the fetched value in the first-turn
path and remove the unused prefetch (the binding `let learned =
self.fetch_learned_context().await;` and the throwaway `let _ = learned;`) to
prevent needless memory lookups in agent_loop/turn handling.
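The control-flow change described above — fetch learned context only inside the first-turn branch — can be sketched with simplified stand-ins (the counter models the storage read the real `fetch_learned_context` performs):

```rust
// Sketch: learned context is fetched only on the first turn, when the
// system prompt is actually built; later turns skip the storage read.
struct Agent {
    history: Vec<String>,
    storage_reads: u32,
}

impl Agent {
    fn fetch_learned_context(&mut self) -> String {
        self.storage_reads += 1; // stands in for a memory-store lookup
        "learned facts".to_string()
    }

    fn turn(&mut self, user: &str) {
        if self.history.is_empty() {
            // Only the first turn pays for the storage read; later
            // turns reuse the already-built system prompt.
            let learned = self.fetch_learned_context();
            self.history.push(format!("system: {learned}"));
        }
        self.history.push(format!("user: {user}"));
    }
}

fn main() {
    let mut agent = Agent { history: vec![], storage_reads: 0 };
    agent.turn("hello");
    agent.turn("and again");
    assert_eq!(agent.storage_reads, 1);
    println!("storage_reads={}", agent.storage_reads);
}
```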
---
Nitpick comments:
In `@src/openhuman/agent/agent/builder.rs`:
- Line 341: The code unconditionally creates full_config =
Arc::new(config.clone()) when learning.enabled is true even though full_config
is only used inside the reflection_enabled branch; to avoid the wasted
allocation, move the Arc::new(config.clone()) expression into the block where
reflection_enabled is true (or lazily create it only if reflection_enabled) and
remove the original unconditional let full_config declaration; update any
references to full_config inside the reflection handling so they use the newly
created Arc from that branch and keep the existing checks for learning.enabled
and reflection_enabled intact.
- Around line 173-181: Add a rustdoc comment to the builder method event_context
explaining its purpose and the meaning of its parameters: state that
event_context sets the event_session_id (session_id) used to group events for a
user/session and event_channel (channel) indicating the source or stream for
those events; keep it brief, use /// style above the pub fn event_context(...)
signature and mention that both parameters are converted into Strings and stored
in event_session_id and event_channel respectively.
- Line 228: The builder currently hardcodes context_pipeline to
ContextPipeline::default() (context_pipeline: ContextPipeline::default()) with
no setter; add a fluent setter method (e.g., with_context_pipeline or
set_context_pipeline) on the builder type that accepts a ContextPipeline and
assigns it to self.context_pipeline (returning Self or &mut Self consistent with
other builder methods) so callers can supply custom pipeline configurations;
alternatively, if the default is intentional, add a short doc comment above the
context_pipeline field or its initialization explaining why custom pipelines are
not supported.
In `@src/openhuman/agent/agent/tests.rs`:
- Around line 277-379: Add a new tokio test (e.g.
turn_dispatches_spawn_subagent_in_fork_mode) that mirrors
turn_dispatches_spawn_subagent_through_full_path but sets the SpawnSubagentTool
call arguments to include "mode": "fork"; initialize builtins via
AgentDefinitionRegistry::init_global_builtins(), build the Agent with the same
MockProvider and tools (SpawnSubagentTool::new()), call Agent::turn and assert
the final response equals the expected folded text, then assert the parent's
history contains the spawn_subagent assistant tool call and a tool-result/chat
tool entry containing the sub-agent output (same checks as existing test);
additionally assert the provider response consumption order demonstrates
prefix-replay (i.e. the mocked responses vector is consumed in the expected
sequence: parent tool_call, sub-agent reply, parent folded reply) to cover the
fork-mode KV-cache reuse path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6b85e999-f0b4-4212-ae58-6a113e12c0c7
📒 Files selected for processing (13)
- src/openhuman/agent/agent.rs
- src/openhuman/agent/agent/builder.rs
- src/openhuman/agent/agent/mod.rs
- src/openhuman/agent/agent/runtime.rs
- src/openhuman/agent/agent/tests.rs
- src/openhuman/agent/agent/turn.rs
- src/openhuman/agent/agent/types.rs
- src/openhuman/agent/context_pipeline/microcompact.rs
- src/openhuman/agent/context_pipeline/mod.rs
- src/openhuman/agent/context_pipeline/pipeline.rs
- src/openhuman/agent/context_pipeline/session_memory.rs
- src/openhuman/agent/context_pipeline/tool_result_budget.rs
- src/openhuman/agent/harness/fork_context.rs
💤 Files with no reviewable changes (1)
- src/openhuman/agent/agent.rs
✅ Files skipped from review due to trivial changes (2)
- src/openhuman/agent/context_pipeline/mod.rs
- src/openhuman/agent/context_pipeline/tool_result_budget.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/openhuman/agent/harness/fork_context.rs
Inline comment fixes:
- definition.rs: YAML/TOML inconsistency — module doc, PromptSource, source bookkeeping, and load() all now uniformly document the TOML format.
- subagent_runner.rs: render_subagent_system_prompt previously only appended PromptSource::Inline bodies, silently dropping PromptSource::File content. Thread the preloaded archetype_body through as an explicit &str parameter so both source variants render. Drops the unused SystemPromptBuilder + tools_for_prompt wiring while we're here.
- prompt.rs: remove DateTimeSection from SystemPromptBuilder::for_subagent. Local::now() would make the sub-agent system prompt change per call and break KV-prefix cache stability. Document the invariant.
Nitpick fixes:
- agent/turn.rs: drop the no-op mark_extraction_started() call — the immediately-following mark_extraction_complete() clears the in-flight flag anyway.
- context_pipeline/pipeline.rs: call guard.record_compaction_success() on microcompact success so a prior streak of autocompaction failures doesn't leave the circuit breaker tripped after a successful reduction.
- context_pipeline/tool_result_budget.rs: remove the unnecessary out.clone() in the truncation return — capture final_bytes first, then move out.
- definition_loader.rs: replace brittle reg.len() == 10 with assert!(reg.len() > 1) plus the existing targeted .get() checks.
- executor.rs: tracing::warn! on unknown sandbox override values so typos surface during development; explicitly accept "none" and empty string as valid defaults.
- fork_context.rs: add parent_context_visible_inside_scope test mirroring fork_context_visible_inside_scope, with minimal stub Provider/Memory impls so the test stays self-contained.
- schemas.rs: drop redundant serde_json::to_value(serde_json::json!(...)) wrapping in handle_list_definitions + handle_get_definition.
- event_bus/events.rs: add SubagentSpawned/SubagentCompleted/SubagentFailed cases to all_variants_have_correct_domain.
- tools/ops.rs: add all_tools_includes_spawn_subagent regression test.
- tools/spawn_subagent.rs: sort/dedup the known skill list in place instead of cloning into a second vec.
Tests: 2316 passed / 0 failed (up from 2314; two new tests added). fmt + clippy clean on all touched files.
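The in-place sort/dedup mentioned for the known skill list follows the standard `Vec` idiom, sketched here with made-up skill names:

```rust
// Sketch: deduplicate a skill list in place rather than cloning into
// a second vector. dedup() only removes *adjacent* duplicates, so the
// sort must come first.
fn main() {
    let mut known = vec!["gmail__send", "fs__read", "gmail__send", "fs__read"];
    known.sort_unstable();
    known.dedup();
    assert_eq!(known, ["fs__read", "gmail__send"]);
    println!("{known:?}");
}
```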
…LI improvements
- Added `event_context` method to `AgentBuilder` for setting `session_id` and `channel` for `DomainEvent`s, improving event tagging and correlation.
- Updated `run_interactive` method in `Agent` to dispatch messages through `run_single`, ensuring consistent lifecycle event handling and error sanitization for interactive turns.
…c for full config in reflection hook. This change reduces unnecessary cloning when learning is enabled, improving performance.
- Introduced a new test, `turn_dispatches_spawn_subagent_in_fork_mode`, to validate the behavior of the agent when spawning a sub-agent in fork mode.
- The test ensures that the parent agent correctly processes the sub-agent's output and maintains the expected response sequence.
- Enhanced the test setup with a mock provider and memory configuration to simulate the agent's environment effectively.
@coderabbitai review
✅ Actions performed: Review triggered.
- Updated the `handle_sync` function to log a debug message and return a no-op response when a skill does not implement the `onSync` handler, preventing unnecessary RPC errors for skills that do not require periodic syncs. - This change improves logging clarity and reduces error noise in logs and dashboards for skills that are not designed to handle sync operations.
build_native_assistant_payload in subagent_runner.rs serialised the
assistant tool-call message using {"text": …, "tool_calls": [{
"type":"function","function":{…}}]} — two mismatches vs the format
parse::build_native_assistant_history produces and the backend
jinja template expects:
1. "text" instead of "content" — the downstream convert_messages
parser looks for "content" to detect assistant-with-tool-calls
messages; "text" is not recognised, so the message is treated
as a plain assistant message.
2. Nested {"type":"function","function":{"name":…,"arguments":…}}
wrappers instead of flat {"id":…,"name":…,"arguments":…}.
Result: every sub-agent that made a tool call and entered iteration
1+ hit a 400 from the backend: "jinja template rendering failed.
Message has tool role, but there was no previous assistant message
with a tool call!" — the backend saw a role=tool message without a
recognised assistant-with-tool-calls predecessor.
The fix replaces the broken inline copy with a direct call to
parse::build_native_assistant_history (the same serialiser
tool_loop.rs uses successfully for the orchestrator's tool calls).
The dead build_native_assistant_payload function is removed.
Introduced in PR tinyhumansai#474 (6465f3d, 2026-04-09). Latent since then
because the orchestrator self-heals by retrying via other agents.
E2E verified: skills_agent with gmail toolkit now progresses through
iterations 0→1→2 successfully (previously died at iteration 1 with
the 400 error every time).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
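Assuming the two shapes quoted in the commit message are accurate, the structural difference can be illustrated with plain JSON strings; `broken_payload` and `fixed_payload` are hypothetical helpers, not functions from the crate:

```rust
// Sketch of the two assistant tool-call payload shapes described above.
// The downstream parser keys off "content" and flat tool-call entries.
fn broken_payload(name: &str, args: &str) -> String {
    // "text" + nested {"type":"function","function":{…}} — rejected.
    format!(
        r#"{{"text":"","tool_calls":[{{"type":"function","function":{{"name":"{name}","arguments":{args}}}}}]}}"#
    )
}

fn fixed_payload(id: &str, name: &str, args: &str) -> String {
    // "content" + flat {"id","name","arguments"} entries, as
    // parse::build_native_assistant_history produces.
    format!(
        r#"{{"content":"","tool_calls":[{{"id":"{id}","name":"{name}","arguments":{args}}}]}}"#
    )
}

fn main() {
    let broken = broken_payload("send_email", "{}");
    let fixed = fixed_payload("call_1", "send_email", "{}");
    // The converter looks for "content"; "text" is not recognised.
    assert!(broken.contains("\"text\"") && !broken.contains("\"content\""));
    // The fixed shape has no nested function wrapper.
    assert!(fixed.contains("\"content\"") && !fixed.contains("\"type\":\"function\""));
    println!("ok");
}
```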
Summary
This branch lands four connected pieces of agent-runtime work:
Sub-agents as tools — `spawn_subagent` is wired into the parent loop end-to-end. A first-class `AgentDefinition` + `AgentDefinitionRegistry` replaces the stub, built-ins are derived from the eight `AgentArchetype`s, custom TOML definitions load from `<workspace>/agents/*.toml`, and a new `subagent_runner` executes both typed and fork-cache modes. Tool filtering supports category (`system` vs `skill`), skill prefix, and named allowlists — all composable.

Reasoning → agentic routing + KV-cache stability — the main user-facing agent is now pinned to `reasoning-v1` (decision/routing tier). When it spawns a sub-agent to execute tool calls, the sub-agent rides on `hint:agentic` → `agentic-v1`. Sub-agent output returns as a `tool_result` and the parent continues on reasoning. To preserve the backend's automatic prefix cache, `Agent::turn` no longer rebuilds the system prompt on subsequent turns and no longer runs per-message model classification; sub-agent system prompts no longer embed `chrono::Local::now()`.

Layered context pipeline — new `agent/context_pipeline/` module implementing an ordered reduction chain:
- old `ToolResults` envelopes are replaced with `[Old tool result content cleared]` while preserving the `AssistantToolCalls ⇔ ToolResults` API invariant.
- `PipelineOutcome::AutocompactionRequested` is emitted with a utilisation percentage when microcompact can't free enough.
- an `archivist` sub-agent uses the `update_memory_md` tool to distil durable facts into the workspace `MEMORY.md`. Fire-and-forget on the agentic tier.

`agent.rs` split — the single 1988-line `agent.rs` is now an `agent/` folder with six focused files: `types.rs` (struct defs), `builder.rs` (builder API + `from_config` factory), `turn.rs` (turn lifecycle + tool dispatch + pipeline wiring), `runtime.rs` (accessors + run helpers + static helpers), `tests.rs` (integration tests), and `mod.rs` (glue + top-level `run`).

Test plan
- `cargo test --lib -p openhuman` → 2314 passed / 0 failed (includes 28 new `context_pipeline` tests and a KV-cache stability regression test under `agent::agent::tests::system_prompt_and_model_are_byte_stable_across_turns`)
- `cargo test --test json_rpc_e2e` → 14 passed / 0 failed
- `cargo fmt --check` clean
- `cargo clippy` clean on touched files (`agent/agent/**`, `agent/context_pipeline/**`, `agent/harness/subagent_runner.rs`, `config/schema/{types,agent}.rs`)
- `cargo run --bin openhuman -- chat "research the ECS design pattern and summarise in 3 bullets"` and confirm `[context_pipeline]`, `[session_memory]`, and `[spawn_subagent]` log lines appear as expected.

Notable decisions
- `ModelSpec::Hint(...)` → `RouterProvider` → concrete backend tier — no hard-coded model names leak into archetypes.
- `cache_control` blocks — the field was plumbed but never emitted downstream, so the user-visible win is the same either way.
- The `agent.rs` split is behaviour-preserving: no public API changes, all existing tests pass unchanged. Field visibility uses `pub(super)` so sibling files can construct/read without widening visibility beyond the `agent::agent::` module.

Summary by CodeRabbit
New Features
Improvements
Configuration
Defaults