
feat(core): memory namespaces, recall citations, provider_surfaces RPC#803

Merged
senamakel merged 5 commits into tinyhumansai:main from jwalin-shah:pr/core-retrieval-v1
Apr 23, 2026

Conversation

Contributor

@jwalin-shah jwalin-shah commented Apr 22, 2026

Summary

Rust core for memory namespaces, recall citations, and provider_surfaces RPC + store; agent/session + channels wiring; tests updated.

Stack

Land this first: UI #800 and Tauri #801 depend on this PR's RPC and behavior. Meta #802 can follow or land in parallel, per team preference.

Test plan

  • cargo test / CI green on PR

Summary by CodeRabbit

Release Notes

  • New Features
    • Added provider assistive surfaces support for managing provider app events and response queues.
    • Implemented memory citations: citations are now collected from recalled memories and delivered alongside chat responses to provide source attribution.
    • Enhanced the memory system with namespace scoping for better organization and isolation of data (e.g., user profiles, learning observations).

- Memory trait: namespace + RecallOpts; unified store and tool callers
- Agent: MemoryCitation + collect_recall_citations; session turn/runtime wiring
- Channels: web + presentation + proactive chat_done citations; routes/dispatch
- Core: provider_surfaces domain; register in all.rs; socketio payload
- Tests: agent harness/builder/memory_loader stubs for trait shape
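
For orientation, here is a minimal sketch of the trait shape these bullets imply, pieced together from the call sites quoted in the review below. The exact signatures, the MemoryEntry field set, and the async_trait usage are assumptions, not the verbatim API:

use async_trait::async_trait;

#[derive(Clone, Debug, PartialEq)]
pub enum MemoryCategory {
    Core,
    Conversation,
    Custom(String),
}

// Illustrative entry shape; the real struct lives under src/openhuman/memory.
#[derive(Clone, Debug)]
pub struct MemoryEntry {
    pub namespace: Option<String>,
    pub key: String,
    pub content: String,
    pub category: MemoryCategory,
    pub session_id: Option<String>,
    pub relevance: f64,
}

// Filter bag that replaced recall()'s old session_id parameter.
#[derive(Default)]
pub struct RecallOpts {
    pub namespace: Option<String>,
    pub session_id: Option<String>,
    pub category: Option<MemoryCategory>,
}

#[async_trait]
pub trait Memory: Send + Sync {
    async fn store(
        &self,
        namespace: &str,
        key: &str,
        content: &str,
        category: MemoryCategory,
        session_id: Option<&str>, // fifth argument's meaning is assumed
    ) -> anyhow::Result<()>;
    async fn get(&self, namespace: &str, key: &str) -> anyhow::Result<Option<MemoryEntry>>;
    async fn forget(&self, namespace: &str, key: &str) -> anyhow::Result<bool>;
    async fn list(
        &self,
        namespace: Option<&str>,
        category: Option<&MemoryCategory>,
        session_id: Option<&str>,
    ) -> anyhow::Result<Vec<MemoryEntry>>;
    async fn recall(
        &self,
        query: &str,
        limit: usize,
        opts: RecallOpts,
    ) -> anyhow::Result<Vec<MemoryEntry>>;
    // Return shape assumed; only the method name appears in the walkthrough.
    async fn namespace_summaries(&self) -> anyhow::Result<Vec<String>>;
}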

Made-with: Cursor
Contributor

coderabbitai Bot commented Apr 22, 2026

Warning

Rate limit exceeded

@jwalin-shah has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 37 minutes and 47 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 37 minutes and 47 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f9fd484-214a-457b-bdef-72afd0c4efdc

📥 Commits

Reviewing files that changed from the base of the PR and between 08c05d0 and d427226.

📒 Files selected for processing (16)
  • src/core/all.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/learning/prompt_sections.rs
  • src/openhuman/memory/store/memory_trait.rs
  • src/openhuman/memory/store/unified/init.rs
  • src/openhuman/memory/traits.rs
  • src/openhuman/mod.rs
  • src/openhuman/provider_surfaces/ops.rs
  • src/openhuman/provider_surfaces/schemas.rs
  • src/openhuman/provider_surfaces/store.rs
  • src/openhuman/provider_surfaces/types.rs
  • src/openhuman/threads/ops.rs
  • src/openhuman/tools/impl/memory/forget.rs
  • src/openhuman/tools/impl/memory/recall.rs
  • src/openhuman/tools/impl/memory/store.rs
📝 Walkthrough

Walkthrough

The PR introduces explicit namespace scoping to the Memory trait API, adds citation tracking for recalled memories with delivery through web channels, and adds a new provider_surfaces subsystem for managing third-party app assistive surfaces. It includes comprehensive trait signature updates, a boot-time migration for legacy keys, and citation collection/filtering logic during agent turns.
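
As a rough illustration of that boot-time migration, the rewrite could look like the sketch below. The SQL, table, and column names (memory_docs, namespace, key) are assumptions reconstructed from the review comments further down, not the verified schema:

// Sketch of the idempotent legacy-key split: rows whose key still packs the
// namespace as "{namespace}/{actual_key}" are rewritten into separate
// columns, and the anti-join skips rows whose split form already exists.
fn migrate_legacy_composite_keys(conn: &rusqlite::Connection) -> rusqlite::Result<usize> {
    conn.execute(
        "UPDATE memory_docs
         SET namespace = substr(key, 1, instr(key, '/') - 1),
             key       = substr(key, instr(key, '/') + 1)
         WHERE instr(key, '/') > 0
           AND NOT EXISTS (
               SELECT 1 FROM memory_docs AS other
               WHERE other.namespace = substr(memory_docs.key, 1, instr(memory_docs.key, '/') - 1)
                 AND other.key       = substr(memory_docs.key, instr(memory_docs.key, '/') + 1)
           )",
        [],
    )
}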

Changes

  • Memory Trait Refactoring (src/openhuman/memory/traits.rs, src/openhuman/memory/mod.rs, src/openhuman/memory/store/memory_trait.rs): Expanded the Memory trait with namespace parameters for store/get/forget/list, replaced recall's session_id parameter with a RecallOpts struct, added a namespace_summaries method, and implemented namespace normalization with per-namespace filtering in the UnifiedMemory backend.
  • Memory Migration & Setup (src/openhuman/memory/store/unified/init.rs): Added an idempotent boot-time migration that rewrites legacy composite keys (format: "{namespace}/{actual_key}") into separate namespace/key columns, with duplicate-key protection via an anti-join.
  • Memory Citations (src/openhuman/agent/memory_loader.rs): Introduced the public MemoryCitation struct with serialization, and added a collect_recall_citations function that filters recalled entries by relevance score and truncates snippets to 280 chars (see the sketch after this list).
  • Agent Session Citations (src/openhuman/agent/harness/session/types.rs, src/openhuman/agent/harness/session/builder.rs, src/openhuman/agent/harness/session/runtime.rs, src/openhuman/agent/harness/session/turn.rs): Extended Agent with a last_turn_citations field initialized in the builder, added a take_last_turn_citations() accessor, and integrated citation collection into turn execution with default limits (5 citations, 0.4 relevance threshold).
  • Web Channel Citations (src/core/socketio.rs, src/openhuman/channels/providers/presentation.rs, src/openhuman/channels/providers/web.rs, src/openhuman/channels/proactive.rs): Extended WebChannelEvent with an optional citations field, threaded citations through the presentation delivery logic, updated run_chat_task to return citations alongside the response text, and added citations to proactive events.
  • Provider Surfaces Subsystem (src/openhuman/mod.rs, src/openhuman/provider_surfaces/...): Added a new provider_surfaces module with types (ProviderEvent, RespondQueueItem), in-memory queue storage, operations (ingest_event, list_queue), and RPC handler schemas for third-party app assistive surface integration.
  • Core Registry Updates (src/core/all.rs): Registered the provider_surfaces namespace and controllers in the global registry, and extended namespace_description with a provider_surfaces entry.
  • Memory API Callsite Updates (src/openhuman/agent/harness/memory_context.rs, src/openhuman/channels/context.rs, src/openhuman/agent/harness/subagent_runner/ops.rs, src/openhuman/channels/tests/..., src/openhuman/learning/..., src/openhuman/tools/impl/memory/..., src/openhuman/migration/core.rs): Updated memory store/recall/list/get/forget calls throughout the codebase to use the new trait signatures with explicit namespaces (e.g., "learning_observations", "user_profile", "tool_effectiveness") and RecallOpts for recall operations.
  • Test Memory Implementations (src/openhuman/agent/harness/memory_context.rs, src/openhuman/channels/routes.rs, src/openhuman/channels/tests/common.rs, src/openhuman/tools/impl/system/tool_stats.rs, tests/agent_*.rs): Updated all mock Memory trait implementations (MockMemory, NoopMemory, StubMemory, DummyMemory) to match the new signatures with namespace parameters and the namespace_summaries method.
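
A minimal sketch of the citation collection described in the Memory Citations item above, written over an already-recalled entry list for brevity. The field sets of MemoryCitation and MemoryEntry here are assumed; only the names MemoryCitation and collect_recall_citations, the 280-char snippet cap, and the 5 / 0.4 defaults come from this walkthrough:

use serde::Serialize;

const SNIPPET_MAX_CHARS: usize = 280;

// Assumed payload shape; the real struct lives in src/openhuman/agent/memory_loader.rs.
#[derive(Serialize, Clone, Debug)]
pub struct MemoryCitation {
    pub namespace: Option<String>,
    pub key: String,
    pub snippet: String,
    pub relevance: f64,
}

// Illustrative stub of a recalled entry.
pub struct MemoryEntry {
    pub namespace: Option<String>,
    pub key: String,
    pub content: String,
    pub relevance: f64,
}

// Keep only sufficiently relevant entries, cap the count, and truncate
// snippets to 280 characters (char-based, so multi-byte text stays valid).
pub fn collect_recall_citations(
    entries: Vec<MemoryEntry>,
    limit: usize,
    min_relevance: f64,
) -> Vec<MemoryCitation> {
    entries
        .into_iter()
        .filter(|e| e.relevance >= min_relevance)
        .take(limit)
        .map(|e| MemoryCitation {
            namespace: e.namespace,
            key: e.key,
            snippet: e.content.chars().take(SNIPPET_MAX_CHARS).collect(),
            relevance: e.relevance,
        })
        .collect()
}

Per the Agent Session Citations item, turn execution wires this up with limit = 5 and min_relevance = 0.4.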

Sequence Diagram

sequenceDiagram
    participant Agent
    participant Memory
    participant CitationCollector
    participant WebChannel
    
    Agent->>Agent: Start turn with user message
    Agent->>CitationCollector: collect_recall_citations(user_msg, limit=5, min_score=0.4)
    CitationCollector->>Memory: recall(user_msg, limit, RecallOpts)
    Memory-->>CitationCollector: Vec<MemoryEntry>
    CitationCollector->>CitationCollector: Filter by score, truncate snippets
    CitationCollector-->>Agent: Vec<MemoryCitation>
    Agent->>Agent: Store citations in last_turn_citations
    Agent->>Agent: Continue turn processing
    Agent->>Agent: take_last_turn_citations()
    Agent-->>WebChannel: citations payload
    WebChannel->>WebChannel: Attach citations to WebChannelEvent
    WebChannel-->>Client: Event with citations
sequenceDiagram
    participant ProviderApp
    participant RpcHandler
    participant Operations
    participant Store
    
    ProviderApp->>RpcHandler: ingest_event(ProviderEvent)
    RpcHandler->>Operations: ops::ingest_event(event)
    Operations->>Store: upsert_queue_item(event)
    Store->>Store: Generate stable ID from provider/account/kind/entity
    Store->>Store: Remove duplicate, insert at front
    Store-->>Operations: RespondQueueItem
    Operations->>Operations: Build ApiEnvelope with counts
    Operations-->>RpcHandler: RpcOutcome
    RpcHandler-->>ProviderApp: JSON response
    
    ProviderApp->>RpcHandler: list_queue()
    RpcHandler->>Operations: ops::list_queue()
    Operations->>Store: list_queue_items()
    Store-->>Operations: Vec<RespondQueueItem> (newest first)
    Operations-->>RpcHandler: ApiEnvelope with items + count
    RpcHandler-->>ProviderApp: JSON response
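
The queue semantics in this second diagram (stable ID, de-duplicate, newest first) reduce to a few lines. This sketch assumes an id field on RespondQueueItem and a simple colon-joined stable ID; neither detail is confirmed by the diff:

#[derive(Clone)]
struct RespondQueueItem {
    id: String,
    // ... provider, account, kind, entity_id, etc. (fields assumed)
}

// Stable ID derived from provider/account/kind/entity, per the diagram.
// The join format is an assumption; the real code may hash instead.
fn stable_id(provider: &str, account: &str, kind: &str, entity: &str) -> String {
    format!("{provider}:{account}:{kind}:{entity}")
}

// Remove any duplicate with the same stable ID, then insert at the front
// so list_queue_items() returns newest first.
fn upsert_queue_item(queue: &mut Vec<RespondQueueItem>, item: RespondQueueItem) {
    queue.retain(|existing| existing.id != item.id);
    queue.insert(0, item);
}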

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR involves fundamental trait API changes propagated across 40+ files, introduces a boot-time data migration, adds citation collection/filtering logic, and creates a new subsystem. While individual changes are relatively mechanical (parameter additions to memory calls), the breadth and interconnectedness of modifications across core infrastructure demand careful verification of correctness.

Suggested reviewers

  • senamakel

🐰 Citations bloom like clover in the spring,
Namespaces keep memories organized and neat,
Provider surfaces sprouted fresh and green,
Now agent turns fetch wisdom, complete and sweet! 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 48.97%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): the title comprehensively and accurately summarizes the three main features added: memory namespaces, recall citations, and provider_surfaces RPC functionality.
  • Linked Issues Check (✅ Passed): check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check (✅ Passed): check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
src/openhuman/tools/impl/memory/recall.rs (1)

74-87: ⚠️ Potential issue | 🟠 Major

Use the requested namespace in RecallOpts.

Line 85 ignores the validated namespace, so memory_recall advertises namespace-scoped search but still performs default/global recall. That can miss entries stored with the new API or return memories from the wrong scope.

🐛 Proposed fix
-        // Search with the user query only. Prefixing `namespace` into the recall string adds a
-        // redundant token that matches almost every row (e.g. all `global/...` keys contain "global").
-        // `namespace` is still required by the tool contract; unified memory scopes recall to the
-        // global document namespace until multi-namespace recall is wired through the `Memory` trait.
-        // Phase A: keep legacy global-only recall. Phase B will populate
-        // `RecallOpts::namespace` from the `namespace` tool arg.
+        // Search with the user query only; namespace scoping belongs in RecallOpts,
+        // not in the query text.
+        let recall_opts = crate::openhuman::memory::RecallOpts {
+            namespace: Some(namespace),
+            ..crate::openhuman::memory::RecallOpts::default()
+        };
         match self
             .memory
-            .recall(
-                query,
-                limit,
-                crate::openhuman::memory::RecallOpts::default(),
-            )
+            .recall(query, limit, recall_opts)
             .await

And the same change applies to the test fixtures (the "Also applies to: 138-178" hunk):
         mem.store(
-            "",
-            "global/lang",
+            "global",
+            "lang",
             "User prefers Rust",
             MemoryCategory::Core,
             None,
@@
         mem.store(
-            "",
-            "global/tz",
+            "global",
+            "tz",
             "Timezone is EST",
             MemoryCategory::Core,
             None,
@@
             mem.store(
-                "",
-                &format!("global/k{i}"),
+                "global",
+                &format!("k{i}"),
                 &format!("Rust fact {i}"),
                 MemoryCategory::Core,
                 None,

Also applies to: 138-178

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/tools/impl/memory/recall.rs` around lines 74 - 87, The recall
call currently passes crate::openhuman::memory::RecallOpts::default(), ignoring
the validated namespace; update the RecallOpts used by memory.recall to include
the validated namespace (e.g., set RecallOpts::namespace =
Some(namespace.clone()) or the equivalent field) so the search is scoped
correctly; apply the same change to the other recall invocation(s) referenced
(the block around lines 138-178) so all memory.recall calls use the provided
namespace instead of the default/global options.
src/openhuman/channels/providers/presentation.rs (1)

85-140: ⚠️ Potential issue | 🟡 Minor

Avoid emitting citations twice for segmented responses.

Segmented delivery sends the same citations on the first chat_segment and again on final chat_done. Since chat_done already carries the full response/state-sync payload, keeping citations there avoids duplicate rendering or client-side de-duping.

Proposed fix
         publish_web_channel_event(WebChannelEvent {
             event: "chat_segment".to_string(),
@@
             delta: None,
             delta_kind: None,
             tool_call_id: None,
-            citations: if i == 0 && !citations.is_empty() {
-                Some(serde_json::json!(citations))
-            } else {
-                None
-            },
+            citations: None,
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/channels/providers/presentation.rs` around lines 85 - 140, The
code currently emits identical citations on the first chat_segment and again on
chat_done via publish_web_channel_event/WebChannelEvent, causing duplicate
client-side citations; change the logic so that WebChannelEvent for
"chat_segment" never includes citations (remove the citations block or always
set it to None there) and keep citations only on the final "chat_done" event
(preserve the existing citations serialization in that WebChannelEvent),
referencing the event string checks and the fields named citations,
publish_web_channel_event, and WebChannelEvent to locate where to update.
src/openhuman/agent/harness/session/turn.rs (1)

131-171: ⚠️ Potential issue | 🟠 Major

Collect recall citations before autosaving the current prompt.

The current user_message is stored in memory and then immediately used as the recall query. Backends that index synchronously can return the just-saved "user_msg" as a citation/context entry, producing self-citations for the current request. Move the autosave until after citation/context loading, or scope citation recall to durable memory namespaces only.

Proposed fix
-        if self.auto_save {
-            let _ = self
-                .memory
-                .store(
-                    "",
-                    "user_msg",
-                    user_message,
-                    MemoryCategory::Conversation,
-                    None,
-                )
-                .await;
-        }
-
         log::info!("[agent] loading memory context for user message");
         const MEMORY_CITATION_LIMIT: usize = 5;
         const MEMORY_CITATION_MIN_RELEVANCE: f64 = 0.4;
@@
         let context = self
             .memory_loader
             .load_context(self.memory.as_ref(), user_message)
             .await
             .unwrap_or_default();
+
+        if self.auto_save {
+            let _ = self
+                .memory
+                .store(
+                    "",
+                    "user_msg",
+                    user_message,
+                    MemoryCategory::Conversation,
+                    None,
+                )
+                .await;
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/session/turn.rs` around lines 131 - 171, The
autosave of the incoming prompt (the self.auto_save branch that calls
self.memory.store with key "user_msg") runs before you call
collect_recall_citations and memory_loader.load_context, causing synchronous
indexers to return the just-saved message as a citation; move the autosave block
to after the citation/context-loading logic (i.e., perform the memory.store only
after collect_recall_citations(...) and memory_loader.load_context(...) complete
and after updating self.last_turn_citations), or alternatively modify
collect_recall_citations to ignore entries from the transient "user_msg"
key/durable-only namespaces; update references: self.auto_save,
memory.store(..., "user_msg", ...), collect_recall_citations(...),
memory_loader.load_context(...), and self.last_turn_citations.
src/openhuman/learning/reflection.rs (1)

218-231: ⚠️ Potential issue | 🟡 Minor

Deduplicate user_profile writes from reflection.

UserProfileHook already stores pref/{slug} in the same "user_profile" namespace with a duplicate check. This loop writes the same key pattern unconditionally, so a reflection pass can overwrite existing profile entries or race the profile hook.

Proposed fix
         for pref in &output.user_preferences {
             let slug = slugify(pref);
             let key = format!("pref/{slug}");
+            if self.memory.get("user_profile", &key).await?.is_some() {
+                continue;
+            }
             self.memory
                 .store(
                     "user_profile",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/learning/reflection.rs` around lines 218 - 231, The loop in
reflection.rs unconditionally writes keys like "pref/{slug}" into the
"user_profile" namespace which duplicates what UserProfileHook already stores
and can overwrite/race existing entries; update the reflection logic to avoid
duplicate writes by either removing this loop entirely or, if you must write,
first check for existing entries via the memory API (e.g., an exists/get/fetch
call) before calling self.memory.store for the "user_profile" namespace and the
key format "pref/{slug}", and only call self.memory.store when the key is
absent; reference the same memory.store call and the "pref/{slug}" key pattern
so the change matches existing behavior.
src/openhuman/memory/store/memory_trait.rs (2)

113-171: ⚠️ Potential issue | 🟠 Major

Apply category filtering after episodic entries are merged.

opts.category is applied before the session_id branch, so session-scoped episodic Conversation entries are appended even when the caller requested another category.

🐛 Proposed filter ordering fix
-        if let Some(ref cat) = opts.category {
-            let want = cat.to_string();
-            out.retain(|e| e.category.to_string() == want);
-        }
-
         if let Some(sid) = opts.session_id {
             let episodic_entries = match fts5::episodic_session_entries(&self.conn, sid) {
@@
             out.truncate(limit);
         }
+
+        if let Some(ref cat) = opts.category {
+            let want = cat.to_string();
+            out.retain(|e| e.category.to_string() == want);
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/memory/store/memory_trait.rs` around lines 113 - 171, The
category filter (opts.category) is applied before the session_id branch so
episodic Conversation entries get appended even when a different category was
requested; update the logic in memory_trait.rs (the block that builds
episodic_entries and pushes MemoryEntry with MemoryCategory::Conversation) to
either move the opts.category check after merging episodic entries or re-run the
out.retain(|e| e.category.to_string() == want) after the session_id branch;
ensure the final out is filtered by opts.category before the out.sort_by(...)
and out.truncate(limit) calls so only matching categories remain.

213-255: ⚠️ Potential issue | 🟠 Major

Honor list filters instead of relabeling all rows.

The category and session_id parameters are currently ignored, and every returned row is assigned the requested category or Core. Callers using list(namespace, category, session) can receive unrelated rows with incorrect metadata.
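
No fix diff accompanies this one; as a rough shape of the suggested filtering (the document field names "category" and "session_id"/"sessionId" follow the AI prompt below and are assumptions about the stored JSON):

// Sketch: check a raw stored document against the requested list() filters
// before building a MemoryEntry from it, instead of relabeling every row.
fn matches_list_filters(
    doc: &serde_json::Value,
    category: Option<&str>,
    session_id: Option<&str>,
) -> bool {
    let category_ok = category.is_none_or(|want| {
        doc.get("category").and_then(|v| v.as_str()) == Some(want)
    });
    let session_ok = session_id.is_none_or(|want| {
        doc.get("session_id")
            .or_else(|| doc.get("sessionId"))
            .and_then(|v| v.as_str())
            == Some(want)
    });
    category_ok && session_ok
}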

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/memory/store/memory_trait.rs` around lines 213 - 255, The list
method is ignoring the category and session_id filters and overwriting every
MemoryEntry's category and session_id; update the async fn list to (1) filter
the items array early when category.is_some() and/or _session_id.is_some() by
comparing the requested values to the document's stored fields (e.g., look up
document-level fields like "category" and "session_id" / "sessionId" inside each
serde_json Value), (2) when building MemoryEntry, populate category from the
document's category field if present (fallback to MemoryCategory::Core only if
missing) instead of using category.cloned().unwrap_or(...), and populate
session_id from the document's stored session id if present (instead of always
None); also prefer using the document's timestamp field for
MemoryEntry.timestamp rather than the artificial format!("idx-{idx}") when
available, and only use the idx fallback if no timestamp exists.
🧹 Nitpick comments (4)
src/openhuman/migration/core.rs (2)

394-402: Trace collision probing in next_available_key.

This loop can issue multiple backend reads during conflict resolution; trace logs make migration stalls or excessive collisions diagnosable while keeping normal logs quiet.

🔎 Proposed trace points
     loop {
         let candidate = format!("{key}_{idx}");
+        log::trace!(
+            "checking OpenClaw memory conflict candidate key={:?}",
+            candidate
+        );
         if memory.get("", &candidate).await?.is_none() {
+            log::debug!(
+                "selected non-conflicting OpenClaw memory key={:?}",
+                candidate
+            );
             return Ok(candidate);
         }
         idx += 1;
     }

As per coding guidelines, src/**/*.rs: Rust code must use log / tracing at debug or trace level for critical checkpoints, branch decisions, external calls, retries/timeouts, state transitions, and error handling paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/migration/core.rs` around lines 394 - 402, Add trace-level
logging to the collision-probing loop inside next_available_key to make repeated
backend reads visible: before calling memory.get log the candidate being probed
(include {key} and idx), log a trace when memory.get returns Some (collision
detected) and about to increment idx, and log a trace when a free key is found
just before returning. Use the crate's tracing/log macros (e.g., tracing::trace!
or log::trace!) and ensure the module imports the chosen macro so
next_available_key, candidate, memory.get and idx are clearly referenced in the
messages.

94-107: Add debug logs for migration decisions without logging memory content.

The namespace update looks correct for the global memory target, but this branch now performs backend lookup/store plus skip/rename/import decisions without checkpoints. Log only keys/decisions, not entry.content.

🔎 Proposed observability update
+        log::debug!("checking target memory for OpenClaw import key={:?}", key);
         if let Some(existing) = memory.get("", &key).await? {
             if existing.content.trim() == entry.content.trim() {
+                log::debug!("skipping unchanged OpenClaw memory key={:?}", key);
                 stats.skipped_unchanged += 1;
                 continue;
             }
 
             let renamed = next_available_key(memory.as_ref(), &key).await?;
+            log::debug!(
+                "renaming conflicting OpenClaw memory key={:?} renamed_key={:?}",
+                key,
+                renamed
+            );
             key = renamed;
             stats.renamed_conflicts += 1;
         }
 
         memory
             .store("", &key, &entry.content, entry.category, None)
             .await?;
+        log::debug!("imported OpenClaw memory key={:?}", key);
         stats.imported += 1;

As per coding guidelines, src/**/*.rs: Rust code must use log / tracing at debug or trace level for critical checkpoints, branch decisions, external calls, state transitions, and error handling paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/migration/core.rs` around lines 94 - 107, Add debug-level logs
around the migration decision branch: log when memory.get("", &key) returns
Some(existing) and whether the entry was skipped (incrementing
stats.skipped_unchanged) or renamed (calling next_available_key and incrementing
stats.renamed_conflicts), and also log the final store action (memory.store)
with the target key and entry.category; do not log entry.content or any content
payloads. Place logs near the existing memory.get/store calls and
next_available_key invocation, using the key, renamed key, and stats updates as
the identifying context.
src/openhuman/provider_surfaces/rpc.rs (1)

1-4: Empty rpc module — consider removing until it has content.

This file contains only a doc comment but is exported via pub mod rpc; in mod.rs. Since the actual RPC entry points live in schemas.rs (handle_ingest_event/handle_list_queue), this module appears redundant. Either drop it or move the handler wiring here to match the module name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/provider_surfaces/rpc.rs` around lines 1 - 4, The rpc.rs file
only contains a module doc comment but is publicly exported (pub mod rpc;) while
the actual RPC handlers live in schemas.rs; either remove the empty rpc.rs
module and delete its pub export, or move the RPC wiring into rpc.rs by
implementing the entry points there and delegating or importing the existing
functions handle_ingest_event and handle_list_queue from schemas.rs; update
mod.rs to reflect the chosen approach (remove the pub mod rpc; line if deleting
the module, or ensure mod rpc; points to the file with the handlers if moving
them).
src/openhuman/tools/impl/system/tool_stats.rs (1)

185-192: Make the mock enforce the namespace/category contract.

ToolStatsTool now depends on list(Some("tool_effectiveness"), Some(Custom("tool_effectiveness")), None), but this mock ignores those filters, so the tests won’t catch a wrong namespace or category being passed.

🧪 Proposed test-double tightening
     async fn list(
         &self,
-        _namespace: Option<&str>,
-        _cat: Option<&MemoryCategory>,
-        _s: Option<&str>,
+        namespace: Option<&str>,
+        cat: Option<&MemoryCategory>,
+        session_id: Option<&str>,
     ) -> anyhow::Result<Vec<MemoryEntry>> {
-        Ok(self.entries.lock().values().cloned().collect())
+        Ok(self
+            .entries
+            .lock()
+            .values()
+            .filter(|entry| namespace.is_none_or(|ns| entry.namespace.as_deref() == Some(ns)))
+            .filter(|entry| cat.is_none_or(|wanted| &entry.category == wanted))
+            .filter(|entry| session_id.is_none_or(|sid| entry.session_id.as_deref() == Some(sid)))
+            .cloned()
+            .collect())
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/tools/impl/system/tool_stats.rs` around lines 185 - 192, The
mock list method currently ignores its filters; update the async fn list(&self,
namespace: Option<&str>, cat: Option<&MemoryCategory>, s: Option<&str>) to
enforce the namespace/category contract used by ToolStatsTool: validate that
namespace == Some("tool_effectiveness") and cat ==
Some(MemoryCategory::Custom("tool_effectiveness")) (or return an empty vec / Err
when they don't match), and otherwise return only entries whose entry.namespace
and entry.category match those filters (and apply s as a substring filter if
provided). Reference the function name list, the struct field entries, and
MemoryCategory::Custom to locate where to add the checks and filtering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/memory/store/memory_trait.rs`:
- Around line 89-92: Normalize blank or whitespace-only namespaces to
GLOBAL_NAMESPACE by introducing a helper (e.g., normalize_namespace) and using
it wherever namespace comes from opts or RPC input; treat None or Some(s) with
s.trim().is_empty() the same as GLOBAL_NAMESPACE. Replace direct uses of
opts.namespace (and any places in recall and list around lines 215-222) with the
normalized result before calling query_namespace_ranked, recall, list, store,
get, or forget so calls like query_namespace_ranked(namespace, ...) and other
namespace-aware methods never receive Some("") or whitespace-only strings.

In `@src/openhuman/memory/store/unified/init.rs`:
- Around line 131-143: The migration only updates memory_docs but leaves
vector_chunks pointing to GLOBAL_NAMESPACE; before returning from new() in
init.rs, build the legacy->new namespace mapping for documents being split
(select document_id and the new namespace/key derived from substr/instr on the
existing key) inside the same transaction used for the memory_docs UPDATE, then
run an UPDATE on vector_chunks setting namespace = new_namespace WHERE
document_id IN (those document_ids) so vector_chunks.namespace matches the
migrated memory_docs; locate the migration block around the memory_docs UPDATE
(the code that runs conn.execute with the substr/instr SQL) and add a query to
collect document_ids/new namespaces and an UPDATE on vector_chunks using the
same rusqlite transaction/connection before finishing new().

In `@src/openhuman/memory/traits.rs`:
- Around line 57-60: The doc comment for the recall filter currently says
namespace = None searches "any namespace", but UnifiedMemory actually falls back
to GLOBAL_NAMESPACE; update the contract so callers aren't misled: either change
UnifiedMemory::recall (or its caller) to implement true cross-namespace search
when namespace is None, or (preferable) adjust the docs in traits.rs for the
struct used by Memory::recall to state that None will be treated as
GLOBAL_NAMESPACE (reference GLOBAL_NAMESPACE and UnifiedMemory) and explain how
to request a global vs scoped search; ensure the same corrected wording is
applied to the other doc block around lines 98-105.

In `@src/openhuman/provider_surfaces/ops.rs`:
- Around line 94-128: Tests ingest_event_upserts_queue_item and
list_queue_returns_newest_first share and mutate the global RESPOND_QUEUE (via
store::clear_queue()) causing races under parallel test execution; fix by
running these tests serially—add the serial_test crate and annotate both async
tests (functions ingest_event_upserts_queue_item and
list_queue_returns_newest_first) with #[serial] in addition to #[tokio::test],
so each test acquires the serial_test global lock and prevents interleaving;
alternatively, if you prefer a mutex-based approach, add a module-level static
TEST_LOCK (e.g., a tokio::sync::Mutex<()> or std::sync::Mutex<()> used with
block_in_place) and acquire it at the start of each test before calling
store::clear_queue() to ensure exclusive access to RESPOND_QUEUE.

In `@src/openhuman/provider_surfaces/schemas.rs`:
- Around line 51-62: The FieldSchema entries are inconsistent with
ProviderEvent's serde attributes: make the "requires_attention" FieldSchema
match ProviderEvent::requires_attention (which is #[serde(default)]) by setting
required: false (or remove the serde default on ProviderEvent if you prefer the
field to be required); and make "raw_payload" clearly optional by using
TypeSchema::Option(Box::new(TypeSchema::Json)) (or the optional(...) helper used
elsewhere) and keep required: false so the generated clients/CLI reflect
optionality.

In `@src/openhuman/provider_surfaces/store.rs`:
- Around line 10-20: RESPOND_QUEUE is an unbounded process-global Vec (accessed
via queue(), queue_lock()) which can grow indefinitely and races across parallel
tokio tests (clear_queue()/upsert_queue_item). Fix by making the queue bounded
and test-safe: add a soft cap and TTL eviction when inserting in
upsert_queue_item (drop oldest or oldest-until-ttl) and change RESPOND_QUEUE to
hold a struct with an internal Mutex plus a test-access Mutex/lock or replace
the static with an injectable handle type (e.g., StoreHandle) so tests can
construct isolated instances; alternatively gate test access with a global
test-only std::sync::Mutex and/or annotate tests with #[serial_test::serial] to
prevent concurrent mutations. Ensure you update queue(), queue_lock(), and any
callers (upsert_queue_item, clear_queue) to use the new bounded/evicting
behavior or injected handle.

In `@src/openhuman/tools/impl/memory/forget.rs`:
- Around line 65-68: Try deleting the memory with the new API shape first (call
self.memory.forget(namespace, key).await) and if that returns no
memory/NotFound, then fall back to the legacy packed-key deletion (build
namespaced_key = format!("{}/{}", namespace.trim(), key) and call
self.memory.forget("", &namespaced_key).await). Update the logic around the
existing match on self.memory.forget so the split namespace/key is removed first
and the legacy empty-namespace packed-key call is only attempted as a fallback (this pattern is sketched after this prompt list).

In `@src/openhuman/tools/impl/memory/store.rs`:
- Around line 85-90: The code currently concatenates namespace and key into
namespaced_key and calls self.memory.store("", &namespaced_key, ...) which
writes into the empty namespace; change these calls to pass the real namespace
and the original key to self.memory.store(namespace.trim(), key) (or ensure
namespace is trimmed once and use that variable) instead of using
namespaced_key; update all occurrences that build namespaced_key and call
self.memory.store("", ...) (e.g., the calls around namespaced_key at lines where
namespaced_key is created and where self.memory.store is invoked) so
namespace-scoped operations (get/list/recall) see the stored memories.

---

Outside diff comments:
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 131-171: The autosave of the incoming prompt (the self.auto_save
branch that calls self.memory.store with key "user_msg") runs before you call
collect_recall_citations and memory_loader.load_context, causing synchronous
indexers to return the just-saved message as a citation; move the autosave block
to after the citation/context-loading logic (i.e., perform the memory.store only
after collect_recall_citations(...) and memory_loader.load_context(...) complete
and after updating self.last_turn_citations), or alternatively modify
collect_recall_citations to ignore entries from the transient "user_msg"
key/durable-only namespaces; update references: self.auto_save,
memory.store(..., "user_msg", ...), collect_recall_citations(...),
memory_loader.load_context(...), and self.last_turn_citations.

In `@src/openhuman/channels/providers/presentation.rs`:
- Around line 85-140: The code currently emits identical citations on the first
chat_segment and again on chat_done via
publish_web_channel_event/WebChannelEvent, causing duplicate client-side
citations; change the logic so that WebChannelEvent for "chat_segment" never
includes citations (remove the citations block or always set it to None there)
and keep citations only on the final "chat_done" event (preserve the existing
citations serialization in that WebChannelEvent), referencing the event string
checks and the fields named citations, publish_web_channel_event, and
WebChannelEvent to locate where to update.

In `@src/openhuman/learning/reflection.rs`:
- Around line 218-231: The loop in reflection.rs unconditionally writes keys
like "pref/{slug}" into the "user_profile" namespace which duplicates what
UserProfileHook already stores and can overwrite/race existing entries; update
the reflection logic to avoid duplicate writes by either removing this loop
entirely or, if you must write, first check for existing entries via the memory
API (e.g., an exists/get/fetch call) before calling self.memory.store for the
"user_profile" namespace and the key format "pref/{slug}", and only call
self.memory.store when the key is absent; reference the same memory.store call
and the "pref/{slug}" key pattern so the change matches existing behavior.

In `@src/openhuman/memory/store/memory_trait.rs`:
- Around line 113-171: The category filter (opts.category) is applied before the
session_id branch so episodic Conversation entries get appended even when a
different category was requested; update the logic in memory_trait.rs (the block
that builds episodic_entries and pushes MemoryEntry with
MemoryCategory::Conversation) to either move the opts.category check after
merging episodic entries or re-run the out.retain(|e| e.category.to_string() ==
want) after the session_id branch; ensure the final out is filtered by
opts.category before the out.sort_by(...) and out.truncate(limit) calls so only
matching categories remain.
- Around line 213-255: The list method is ignoring the category and session_id
filters and overwriting every MemoryEntry's category and session_id; update the
async fn list to (1) filter the items array early when category.is_some() and/or
_session_id.is_some() by comparing the requested values to the document's stored
fields (e.g., look up document-level fields like "category" and "session_id" /
"sessionId" inside each serde_json Value), (2) when building MemoryEntry,
populate category from the document's category field if present (fallback to
MemoryCategory::Core only if missing) instead of using
category.cloned().unwrap_or(...), and populate session_id from the document's
stored session id if present (instead of always None); also prefer using the
document's timestamp field for MemoryEntry.timestamp rather than the artificial
format!("idx-{idx}") when available, and only use the idx fallback if no
timestamp exists.

In `@src/openhuman/tools/impl/memory/recall.rs`:
- Around line 74-87: The recall call currently passes
crate::openhuman::memory::RecallOpts::default(), ignoring the validated
namespace; update the RecallOpts used by memory.recall to include the validated
namespace (e.g., set RecallOpts::namespace = Some(namespace.clone()) or the
equivalent field) so the search is scoped correctly; apply the same change to
the other recall invocation(s) referenced (the block around lines 138-178) so
all memory.recall calls use the provided namespace instead of the default/global
options.

---

Nitpick comments:
In `@src/openhuman/migration/core.rs`:
- Around line 394-402: Add trace-level logging to the collision-probing loop
inside next_available_key to make repeated backend reads visible: before calling
memory.get log the candidate being probed (include {key} and idx), log a trace
when memory.get returns Some (collision detected) and about to increment idx,
and log a trace when a free key is found just before returning. Use the crate's
tracing/log macros (e.g., tracing::trace! or log::trace!) and ensure the module
imports the chosen macro so next_available_key, candidate, memory.get and idx
are clearly referenced in the messages.
- Around line 94-107: Add debug-level logs around the migration decision branch:
log when memory.get("", &key) returns Some(existing) and whether the entry was
skipped (incrementing stats.skipped_unchanged) or renamed (calling
next_available_key and incrementing stats.renamed_conflicts), and also log the
final store action (memory.store) with the target key and entry.category; do not
log entry.content or any content payloads. Place logs near the existing
memory.get/store calls and next_available_key invocation, using the key, renamed
key, and stats updates as the identifying context.

In `@src/openhuman/provider_surfaces/rpc.rs`:
- Around line 1-4: The rpc.rs file only contains a module doc comment but is
publicly exported (pub mod rpc;) while the actual RPC handlers live in
schemas.rs; either remove the empty rpc.rs module and delete its pub export, or
move the RPC wiring into rpc.rs by implementing the entry points there and
delegating or importing the existing functions handle_ingest_event and
handle_list_queue from schemas.rs; update mod.rs to reflect the chosen approach
(remove the pub mod rpc; line if deleting the module, or ensure mod rpc; points
to the file with the handlers if moving them).

In `@src/openhuman/tools/impl/system/tool_stats.rs`:
- Around line 185-192: The mock list method currently ignores its filters;
update the async fn list(&self, namespace: Option<&str>, cat:
Option<&MemoryCategory>, s: Option<&str>) to enforce the namespace/category
contract used by ToolStatsTool: validate that namespace ==
Some("tool_effectiveness") and cat ==
Some(MemoryCategory::Custom("tool_effectiveness")) (or return an empty vec / Err
when they don't match), and otherwise return only entries whose entry.namespace
and entry.category match those filters (and apply s as a substring filter if
provided). Reference the function name list, the struct field entries, and
MemoryCategory::Custom to locate where to add the checks and filtering.
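
For the forget.rs prompt above, the try-new-shape-first control flow it describes looks roughly like this. It leans on the Memory trait sketched under the PR summary, and treats forget's bool return (whether something was deleted) as an assumption:

// Sketch of new-shape-first deletion with a legacy packed-key fallback.
// `memory` is a `&dyn Memory` per the trait sketch earlier on this page.
async fn forget_with_legacy_fallback(
    memory: &dyn Memory,
    namespace: &str,
    key: &str,
) -> anyhow::Result<bool> {
    // 1. Try the split namespace/key shape introduced in this PR.
    if memory.forget(namespace.trim(), key).await? {
        return Ok(true);
    }
    // 2. Fall back to the legacy packed key stored under the empty namespace.
    let namespaced_key = format!("{}/{}", namespace.trim(), key);
    memory.forget("", &namespaced_key).await
}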

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c6f4da9a-ddd2-48bc-b205-85a1db480c75

📥 Commits

Reviewing files that changed from the base of the PR and between b3b451f and 08c05d0.

📒 Files selected for processing (41)
  • src/core/all.rs
  • src/core/socketio.rs
  • src/openhuman/agent/harness/memory_context.rs
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/subagent_runner/ops.rs
  • src/openhuman/agent/memory_loader.rs
  • src/openhuman/channels/context.rs
  • src/openhuman/channels/proactive.rs
  • src/openhuman/channels/providers/presentation.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/routes.rs
  • src/openhuman/channels/runtime/dispatch.rs
  • src/openhuman/channels/tests/common.rs
  • src/openhuman/channels/tests/memory.rs
  • src/openhuman/cost/tracker.rs
  • src/openhuman/learning/prompt_sections.rs
  • src/openhuman/learning/reflection.rs
  • src/openhuman/learning/tool_tracker.rs
  • src/openhuman/learning/user_profile.rs
  • src/openhuman/memory/mod.rs
  • src/openhuman/memory/store/memory_trait.rs
  • src/openhuman/memory/store/unified/init.rs
  • src/openhuman/memory/traits.rs
  • src/openhuman/migration/core.rs
  • src/openhuman/mod.rs
  • src/openhuman/provider_surfaces/mod.rs
  • src/openhuman/provider_surfaces/ops.rs
  • src/openhuman/provider_surfaces/rpc.rs
  • src/openhuman/provider_surfaces/schemas.rs
  • src/openhuman/provider_surfaces/store.rs
  • src/openhuman/provider_surfaces/types.rs
  • src/openhuman/tools/impl/memory/forget.rs
  • src/openhuman/tools/impl/memory/recall.rs
  • src/openhuman/tools/impl/memory/store.rs
  • src/openhuman/tools/impl/system/tool_stats.rs
  • tests/agent_builder_public.rs
  • tests/agent_harness_public.rs
  • tests/agent_memory_loader_public.rs

Comment thread src/openhuman/memory/store/memory_trait.rs Outdated
Comment thread src/openhuman/memory/store/unified/init.rs
Comment thread src/openhuman/memory/traits.rs Outdated
Comment on lines +94 to +128
#[tokio::test]
async fn ingest_event_upserts_queue_item() {
    store::clear_queue();
    let first = ingest_event(sample_event("entity-1")).await.unwrap();
    let second = ingest_event(sample_event("entity-1")).await.unwrap();

    let first_value = first.into_cli_compatible_json().unwrap();
    let second_value = second.into_cli_compatible_json().unwrap();
    let first_result = first_value.get("data").unwrap_or(&first_value);
    let second_result = second_value.get("data").unwrap_or(&second_value);

    assert_eq!(first_result["provider"], "linkedin");
    assert_eq!(second_result["entityId"], "entity-1");

    let queue = list_queue(EmptyRequest {}).await.unwrap();
    let queue_json = queue.into_cli_compatible_json().unwrap();
    let data = queue_json.get("data").unwrap_or(&queue_json);
    assert_eq!(data["count"], 1);
}

#[tokio::test]
async fn list_queue_returns_newest_first() {
    store::clear_queue();
    ingest_event(sample_event("entity-1")).await.unwrap();
    ingest_event(sample_event("entity-2")).await.unwrap();

    let queue = list_queue(EmptyRequest {}).await.unwrap();
    let queue_json = queue.into_cli_compatible_json().unwrap();
    let data = queue_json.get("data").unwrap_or(&queue_json);
    let items = data["items"].as_array().unwrap();

    assert_eq!(items.len(), 2);
    assert_eq!(items[0]["entityId"], "entity-2");
    assert_eq!(items[1]["entityId"], "entity-1");
}
Contributor

@coderabbitai coderabbitai Bot Apr 22, 2026


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

serial_test crate rust parallel test isolation

💡 Result:

The serial_test crate provides attributes to control test execution order in Rust, enabling serialized (sequential) execution for tests that cannot run in parallel due to shared state, while allowing true parallel tests to run concurrently without interfering with serial ones.

Key features:
  • #[serial]: ensures tests run sequentially (one at a time). Tests sharing the same key (e.g., #[serial(db_tests)]) are serialized within their group. Ordering is not guaranteed.
  • #[parallel]: allows multiple tests to run concurrently with each other, but blocks them from running simultaneously with any #[serial] tests.
  • #[file_serial] and #[file_parallel]: for separate-process tests (e.g., integration tests, doctests), using file-based locking instead of in-process mutexes.
  • Module-level application: attributes on #[cfg(test)] mod blocks apply to all tests inside.
  • Async test support: #[tokio::test] must precede #[serial].
  • inner_attrs: apply attributes like #[ntest::timeout] only to the test body, not lock acquisition.

Isolation mechanism: in-process mutexes (parking_lot) back #[serial]/#[parallel]; the file_* variants use file locks for process-level isolation; unmarked tests run without guarantees, potentially in parallel with anything.

Usage: add serial_test = "" to [dev-dependencies] in Cargo.toml and import use serial_test::serial; (and others as needed). Example:

use serial_test::{serial, parallel};

#[test]
#[serial]
fn test_a() { /* ... */ }

#[test]
#[parallel]
fn test_b() { /* ... */ } // runs in parallel with other #[parallel] tests, but not with serial ones

Latest version: 3.4.0 (Feb 2026), MSRV 1.68, MIT license. Repo: https://github.com/palfrey/serial_test. Highly popular (100M+ downloads, 2K+ dependents).


🏁 Script executed:

# Check if the file exists and read the relevant test section
head -200 src/openhuman/provider_surfaces/ops.rs | tail -150

Repository: tinyhumansai/openhuman

Length of output: 4439


🏁 Script executed:

# Find the store module referenced in the tests
fd -type f -name "*.rs" | xargs rg "store::clear_queue" -l

Repository: tinyhumansai/openhuman

Length of output: 292


🏁 Script executed:

# Check the store implementation for RESPOND_QUEUE definition
rg "static.*RESPOND_QUEUE" -A 5

Repository: tinyhumansai/openhuman

Length of output: 506


🏁 Script executed:

# Check Cargo.toml for serial_test dependency
cat Cargo.toml | grep -A 20 "\[dev-dependencies\]"

Repository: tinyhumansai/openhuman

Length of output: 923


🏁 Script executed:

# Verify exact line numbers for the tests
cat -n src/openhuman/provider_surfaces/ops.rs | sed -n '80,140p'

Repository: tinyhumansai/openhuman

Length of output: 2511


🏁 Script executed:

# Check store::clear_queue() implementation to understand what it does
rg "fn clear_queue" -A 10

Repository: tinyhumansai/openhuman

Length of output: 239


Tests share global queue state and will race under parallel execution.

Both ingest_event_upserts_queue_item and list_queue_returns_newest_first call store::clear_queue() and then mutate/read the same static RESPOND_QUEUE. Cargo executes unit tests across threads by default, so these two tests (and any future tests in this module) can interleave, causing flaky assertion failures (e.g., count == 1 assertion fails if the other test inserts concurrently).

Options:

  • Add a module-level static TEST_LOCK: Mutex<()> and acquire it at the top of each test.
  • Use the serial_test crate with #[serial] (supports async tests via #[tokio::test] #[serial]).
  • Refactor the store to accept an injected handle so tests can use isolated instances.
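
For the serial_test route, a minimal sketch of the annotation change (test bodies elided; serial_test in [dev-dependencies] is assumed):

use serial_test::serial;

// #[serial] takes serial_test's process-wide lock, so these two tests can
// no longer interleave their mutations of the shared RESPOND_QUEUE.
// Note the ordering: #[tokio::test] must come before #[serial].
#[tokio::test]
#[serial]
async fn ingest_event_upserts_queue_item() {
    store::clear_queue();
    // ... existing assertions unchanged ...
}

#[tokio::test]
#[serial]
async fn list_queue_returns_newest_first() {
    store::clear_queue();
    // ... existing assertions unchanged ...
}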
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/provider_surfaces/ops.rs` around lines 94 - 128, Tests
ingest_event_upserts_queue_item and list_queue_returns_newest_first share and
mutate the global RESPOND_QUEUE (via store::clear_queue()) causing races under
parallel test execution; fix by running these tests serially—add the serial_test
crate and annotate both async tests (functions ingest_event_upserts_queue_item
and list_queue_returns_newest_first) with #[serial] in addition to
#[tokio::test], so each test acquires the serial_test global lock and prevents
interleaving; alternatively, if you prefer a mutex-based approach, add a
module-level static TEST_LOCK (e.g., a tokio::sync::Mutex<()> or
std::sync::Mutex<()> used with block_in_place) and acquire it at the start of
each test before calling store::clear_queue() to ensure exclusive access to
RESPOND_QUEUE.

Contributor Author


Deferring to follow-up — not blocking v1.

Contributor


@jwalin-shah, understood! Would you like me to open a follow-up GitHub issue to track adding #[serial] (or a module-level mutex) to the ingest_event_upserts_queue_item and list_queue_returns_newest_first tests in src/openhuman/provider_surfaces/ops.rs? That way the race condition won't slip through the cracks when the SQLite-backed store lands.


🧠 Learnings used
Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-22T21:14:11.082Z
Learning: Follow the feature design workflow: Specify → prove in Rust → prove over RPC → surface in the UI → test. Ground specs in existing domains, controller/registry patterns, and JSON-RPC naming (`openhuman.<namespace>_<function>`). No parallel architectures.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-23T06:30:00.454Z
Learning: Applies to src/openhuman/**/*.rs : Implement features in this order: specify against the codebase → implement in Rust → JSON-RPC E2E tests → UI in the Tauri app → app unit tests → app E2E

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 437
File: src/openhuman/subconscious/schemas.rs:211-234
Timestamp: 2026-04-08T16:47:02.404Z
Learning: In `src/openhuman/subconscious/schemas.rs`, `handle_status` intentionally derives `last_tick_at`, `total_ticks`, and `task_count` from DB queries (via `store::with_connection`) and hard-codes `consecutive_failures: 0`, rather than reading from the in-memory engine state. This is by design: acquiring the engine mutex in an RPC handler blocks for the full tick duration and was causing the frontend to freeze. DB-derived counts are approximate but acceptable for the UI status bar; do not flag this as a bug.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-22T21:14:11.082Z
Learning: Applies to tests/json_rpc_e2e.rs : Extend JSON-RPC E2E tests in `tests/json_rpc_e2e.rs` when adding features so RPC methods match what the UI will call. Use `scripts/test-rust-with-mock.sh` to run isolated tests.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-22T21:14:11.082Z
Learning: Ship unit tests for new/changed behavior before stacking features. Untested code is incomplete.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 437
File: src/openhuman/subconscious/engine.rs:235-240
Timestamp: 2026-04-08T16:46:54.355Z
Learning: In `src/openhuman/subconscious/engine.rs`, the `evaluation_failed` check intentionally only matches evaluations where ALL tasks have `TickDecision::Noop` with a reason starting with `"Evaluation failed:"`. This is by design: `last_tick_at` must only fail to advance when the entire LLM evaluation layer is unavailable (e.g. Ollama down). Individual task execution failures are tracked per-task and do NOT block `last_tick_at` from advancing, because blocking on a single flaky task would cause the situation report to keep looking further back and re-process already-handled tasks. Do not flag this narrow check as a bug.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-23T06:30:00.454Z
Learning: Applies to src/openhuman/*/mod.rs : Keep domain `mod.rs` files light and export-focused; place operational code in sibling files (`ops.rs`, `store.rs`, `schedule.rs`, `types.rs`) and re-export from `mod.rs`

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-22T21:14:11.082Z
Learning: Applies to src/openhuman/*/mod.rs : Keep domain `mod.rs` export-focused; place operational code in `ops.rs`, `store.rs`, `types.rs`, etc. Add `mod schemas;` and re-export `all_controller_schemas` and `all_registered_controllers`.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-22T21:14:11.082Z
Learning: Applies to src/openhuman/**/*.rs : New functionality must go in a dedicated subdirectory (`openhuman/<domain>/mod.rs` + siblings). Do not add new standalone `*.rs` files at `src/openhuman/` root.

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-23T06:30:00.454Z
Learning: Applies to src/openhuman/**/*.rs : New functionality must live in a dedicated subdirectory (its own folder/module); do not add new standalone `*.rs` files directly at `src/openhuman/` root

Learnt from: CR
Repo: tinyhumansai/openhuman PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-23T06:30:00.454Z
Learning: Applies to src/openhuman/**/*.rs : When adding or changing domain behavior, ship matching documentation (rustdoc, code comments, or updates to `AGENTS.md` / architecture docs) before handoff

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When using the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, don’t require individual execution-provider feature flags (e.g., `directml`, `coreml`, `cuda`) alongside `load-dynamic` to get EP registration code. The `ort` crate already compiles EP registration via `#[cfg(any(feature = "load-dynamic", feature = "<ep_name>"))]` guards, and adding per-EP feature flags can pull in static-linking dependencies that conflict with the dynamic loading approach. At runtime, EP availability is determined by what the dynamically loaded ONNX Runtime library (`onnxruntime.dll`/`.so`/`.dylib`) supports; ort docs indicate providers like `directml`/`xnnpack`/`coreml` are available in builds when the platform supports them.

Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When integrating the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, do NOT also require/enable individual provider EP Cargo features like `directml`, `coreml`, or `cuda`. In `ort` v2.x, EP registration for providers (e.g., DirectML, CoreML, CUDA, etc.) is already compiled in under source-level `#[cfg(any(feature = "load-dynamic", feature = "<provider>"))]` guards, such as in `ep/directml.rs`. Adding provider feature flags alongside `load-dynamic` can pull in static-linking dependencies that conflict with the dynamic-loading approach. Provider availability should be treated as runtime-determined by what the loaded `onnxruntime` library (`onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`) actually supports.

Learnt from: oxoxDev
Repo: tinyhumansai/openhuman PR: 571
File: src/openhuman/local_ai/service/whisper_engine.rs:69-80
Timestamp: 2026-04-14T19:59:04.826Z
Learning: When reviewing Rust code in this repo that uses the upstream `whisper-rs` crate (v0.16.0), do not report `WhisperContextParameters::use_gpu(...)` or `WhisperContextParameters::flash_attn(...)` as missing/invalid APIs. These builder-style methods exist upstream and return `&mut Self`; they are not limited to `WhisperVadContextParams`.

Comment on lines +51 to +62
FieldSchema {
    name: "requires_attention",
    ty: TypeSchema::Bool,
    comment: "Whether the event should enter the respond queue as actionable.",
    required: true,
},
FieldSchema {
    name: "raw_payload",
    ty: TypeSchema::Json,
    comment: "Optional provider-specific raw payload for future debugging and enrichment.",
    required: false,
},

@coderabbitai coderabbitai Bot Apr 22, 2026

⚠️ Potential issue | 🟡 Minor

Schema required flags don't match ProviderEvent serde attributes.

  • requires_attention is declared required: true here, but ProviderEvent::requires_attention is #[serde(default)] (types.rs Line 26-27). Either drop the default or set required: false — otherwise the registry's validate_params and the struct deserializer disagree about what's acceptable.
  • raw_payload is TypeSchema::Json with required: false. For consistency with other optional fields on this schema (which wrap in TypeSchema::Option(...) via the optional() helper), consider TypeSchema::Option(Box::new(TypeSchema::Json)) so generated clients/CLI help reflect optionality.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In src/openhuman/provider_surfaces/schemas.rs around lines 51-62, the FieldSchema entries are inconsistent with ProviderEvent's serde attributes. Make the "requires_attention" FieldSchema match ProviderEvent::requires_attention (which is #[serde(default)]) by setting required: false (or remove the serde default on ProviderEvent if you prefer the field to be required), and make "raw_payload" clearly optional by using TypeSchema::Option(Box::new(TypeSchema::Json)) (or the optional(...) helper used elsewhere), keeping required: false so the generated clients/CLI reflect optionality.
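
Concretely, the corrected entries would look something like the following sketch; the FieldSchema and TypeSchema definitions here are simplified stand-ins for the repo's real registry types, which this excerpt does not show.

// Simplified stand-ins for the registry types referenced above.
enum TypeSchema {
    Bool,
    Json,
    Option(Box<TypeSchema>),
}

struct FieldSchema {
    name: &'static str,
    ty: TypeSchema,
    comment: &'static str,
    required: bool,
}

fn corrected_fields() -> Vec<FieldSchema> {
    vec![
        FieldSchema {
            name: "requires_attention",
            ty: TypeSchema::Bool,
            comment: "Whether the event should enter the respond queue as actionable.",
            // Matches #[serde(default)] on ProviderEvent::requires_attention,
            // so validate_params and the deserializer agree.
            required: false,
        },
        FieldSchema {
            name: "raw_payload",
            // Wrapping in Option mirrors the optional() helper used by the
            // schema's other optional fields, so codegen reflects optionality.
            ty: TypeSchema::Option(Box::new(TypeSchema::Json)),
            comment: "Optional provider-specific raw payload for future debugging and enrichment.",
            required: false,
        },
    ]
}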

@jwalin-shah (Contributor Author)

Deferring to follow-up — not blocking v1.

@coderabbitai (Contributor)

@jwalin-shah, understood — acknowledged for v1. Would you like me to open a GitHub issue to track these two schema consistency fixes for a follow-up PR?

  1. Set requires_attention required: false in the ingest_event schema (to match #[serde(default)] on ProviderEvent).
  2. Change raw_payload to TypeSchema::Option(Box::new(TypeSchema::Json)) so generated clients/CLI reflect optionality.

Comment on lines +10 to +20
static RESPOND_QUEUE: OnceLock<Mutex<Vec<RespondQueueItem>>> = OnceLock::new();

fn queue() -> &'static Mutex<Vec<RespondQueueItem>> {
    RESPOND_QUEUE.get_or_init(|| Mutex::new(Vec::new()))
}

fn queue_lock() -> std::sync::MutexGuard<'static, Vec<RespondQueueItem>> {
    queue()
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

@coderabbitai coderabbitai Bot Apr 22, 2026

⚠️ Potential issue | 🟡 Minor

Process-global queue is unbounded and shared across tests.

Two concerns for this initial in-memory store:

  1. RESPOND_QUEUE has no cap. upsert_queue_item dedupes by id but otherwise grows with unique events; add a soft max or TTL before ingest is exposed to real providers.
  2. Because the queue is a static, every #[tokio::test] in ops.rs that calls store::clear_queue() races against other tests running in parallel (cargo runs tests concurrently by default). This will cause intermittent failures once more tests are added. Consider gating test access behind a std::sync::Mutex lock helper, using #[serial_test::serial], or moving the queue into an injectable handle.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In src/openhuman/provider_surfaces/store.rs around lines 10-20, RESPOND_QUEUE is an unbounded process-global Vec (accessed via queue() and queue_lock()) that can grow indefinitely and races across parallel tokio tests (clear_queue() / upsert_queue_item). Make the queue bounded and test-safe: add a soft cap and TTL eviction when inserting in upsert_queue_item (drop the oldest items, or items older than the TTL), and either change RESPOND_QUEUE to hold a struct with an internal Mutex plus a test-access lock helper, or replace the static with an injectable handle type (e.g., StoreHandle) so tests can construct isolated instances; alternatively, gate test access with a global test-only std::sync::Mutex and/or annotate tests with #[serial_test::serial] to prevent concurrent mutations. Update queue(), queue_lock(), and any callers (upsert_queue_item, clear_queue) to use the new bounded/evicting behavior or injected handle.

@jwalin-shah (Contributor Author)

Deferring to follow-up — not blocking v1.

@coderabbitai (Contributor)

@jwalin-shah Understood, happy to defer this to a follow-up. Would you like me to open a GitHub issue to track both the bounded queue cap/TTL and the test-isolation concern so they don't get lost?


Comment on lines +6 to +30
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[serde(deny_unknown_fields)]
pub struct ProviderEvent {
    pub provider: String,
    pub account_id: String,
    pub event_kind: String,
    pub entity_id: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub thread_id: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub title: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub snippet: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub sender_name: Option<String>,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub sender_handle: Option<String>,
    pub timestamp: String,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub deep_link: Option<String>,
    #[serde(default)]
    pub requires_attention: bool,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub raw_payload: Option<serde_json::Value>,
}

@coderabbitai coderabbitai Bot Apr 22, 2026

⚠️ Potential issue | 🟡 Minor

ProviderEvent field naming diverges from RespondQueueItem / response types.

ProviderEvent serializes with snake_case (no rename_all) while RespondQueueItem and RespondQueueListResponse use #[serde(rename_all = "camelCase")]. That is consistent with the controller schema inputs (which declare snake_case names), but it means callers send account_id/entity_id and receive accountId/entityId — worth confirming this asymmetry is intentional rather than an oversight, especially given deny_unknown_fields on the input will hard-fail camelCase payloads.

Also, requires_attention here is #[serde(default)] so it accepts missing input, but the corresponding FieldSchema in schemas.rs (Line 51-56) marks it required: true. The registry's validate_params will reject requests without it even though the struct tolerates absence — pick one source of truth.
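
For illustration, here is a trimmed, self-contained sketch of the asymmetry, assuming serde and serde_json and reducing both structs to two fields:

use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct ProviderEventIn {
    account_id: String,
    entity_id: String,
}

#[derive(Serialize)]
#[serde(rename_all = "camelCase")]
struct RespondQueueItemOut {
    account_id: String,
    entity_id: String,
}

fn main() {
    // Input must be snake_case; {"accountId": ...} would be rejected
    // because deny_unknown_fields refuses the camelCase key.
    let ev: ProviderEventIn =
        serde_json::from_str(r#"{"account_id":"a1","entity_id":"e1"}"#).unwrap();

    // Output serializes camelCase: {"accountId":"a1","entityId":"e1"}.
    let item = RespondQueueItemOut {
        account_id: ev.account_id,
        entity_id: ev.entity_id,
    };
    println!("{}", serde_json::to_string(&item).unwrap());
}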

@jwalin-shah (Contributor Author)

Deferring to follow-up — not blocking v1.

@coderabbitai (Contributor)

@jwalin-shah, understood — noted as deferred. Would you like me to open a follow-up GitHub issue to track both of these items (the snake_case / camelCase asymmetry between ProviderEvent and the response types, and the requires_attention #[serde(default)] vs required: true schema conflict) so they don't get lost before v2 work begins?


Comment thread src/openhuman/tools/impl/memory/forget.rs Outdated
Comment thread src/openhuman/tools/impl/memory/store.rs Outdated
- memory/traits.rs: clarify RecallOpts and recall() doc comments to say
  None falls back to GLOBAL_NAMESPACE, not "any namespace"
- memory/store/memory_trait.rs: add normalize_namespace() helper and use
  it in recall()/list() so blank or whitespace-only inputs never silently
  bypass the global namespace (addresses @coderabbitai on memory_trait.rs:92;
  sketched after this list)
- tools/impl/memory/recall.rs: pass the requested namespace via RecallOpts
  instead of RecallOpts::default() so recall is actually namespace-scoped
  (addresses @coderabbitai outside-diff comment on recall.rs:74-87)
- tools/impl/memory/store.rs: store into the actual namespace/key instead
  of encoding namespace into the key as a legacy packed string; update
  tests to use mem.get("global","lang") shape
  (addresses @coderabbitai on tools/impl/memory/store.rs:90)
- tools/impl/memory/forget.rs: try split namespace/key first, fall back
  to legacy packed key so migrated rows are deletable
  (addresses @coderabbitai on tools/impl/memory/forget.rs:68)
- provider_surfaces/schemas.rs: fix requires_attention required:true vs
  #[serde(default)] mismatch; use TypeSchema::Option for raw_payload
  (addresses @coderabbitai on provider_surfaces/schemas.rs:62)
- provider_surfaces/ops.rs: add TEST_MUTEX to serialize tests that mutate
  the global RESPOND_QUEUE and prevent races under parallel test execution
  (addresses @coderabbitai on provider_surfaces/ops.rs:128)
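
A minimal sketch of the normalize_namespace() behavior described in the memory_trait.rs bullet above; the exact signature and the GLOBAL_NAMESPACE value are assumptions:

const GLOBAL_NAMESPACE: &str = "global";

fn normalize_namespace(ns: Option<&str>) -> &str {
    match ns.map(str::trim) {
        Some(s) if !s.is_empty() => s,
        _ => GLOBAL_NAMESPACE, // blank or missing input falls back to global
    }
}

fn main() {
    assert_eq!(normalize_namespace(None), "global");
    assert_eq!(normalize_namespace(Some("   ")), "global"); // whitespace-only
    assert_eq!(normalize_namespace(Some("user_profile")), "user_profile");
}
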
threads::title::collapse_whitespace was used in the test module but not
imported, causing the test build to fail.
@senamakel
Member

PR Manager - deferred/human items from CodeRabbit review

The following CodeRabbit comments were not auto-fixed and require a human decision:


1. src/openhuman/memory/store/unified/init.rs:131-143 (Major) - Migrate vector_chunks with document namespace split

The boot migration in new() splits memory_docs rows out of GLOBAL_NAMESPACE but leaves vector_chunks rows pointing to GLOBAL_NAMESPACE. After migration, namespace-scoped recall (query_namespace_ranked) can miss migrated memories.

Fix: In the same migration transaction, collect the document_id → new_namespace mapping and run UPDATE vector_chunks SET namespace = <new_ns> WHERE document_id IN (...). This needs careful SQL since it must be idempotent (same guard as the docs migration); a sketch follows after this list.


2. src/openhuman/provider_surfaces/store.rs:10-20 (Minor) - Unbounded global queue

RESPOND_QUEUE has no size cap. upsert_queue_item deduplicates by composite ID but otherwise grows unbounded with unique events. Before this is exposed to real provider ingest volume, add a soft max (e.g. 500 items) or TTL-based eviction in upsert_queue_item.


3. src/openhuman/provider_surfaces/types.rs:30 (Minor) - Input/output serde asymmetry

ProviderEvent (input) serializes snake_case while RespondQueueItem / RespondQueueListResponse (output) use rename_all = "camelCase". This is consistent with how the controller schema declares field names, but callers send account_id/entity_id and receive accountId/entityId. Please confirm this asymmetry is intentional before shipping a stable API contract.


Items 1-3 above were classified as defer-human because they involve data-migration risk (1) or product/design decisions (2, 3). All other CodeRabbit actionable items from this review were applied in commit 7342db3e.
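
For item 1, the re-homing could look roughly like the sketch below, assuming a rusqlite connection and the table/column names used above; the real schema and the docs-migration guard may differ:

use rusqlite::{Connection, Result};

fn rehome_vector_chunks(conn: &Connection) -> Result<()> {
    // Run inside the same transaction as the memory_docs split.
    // Idempotent: once no chunk under 'global' maps to a document in
    // another namespace, re-running matches zero rows.
    conn.execute_batch(
        "UPDATE vector_chunks
            SET namespace = (SELECT d.namespace
                               FROM memory_docs d
                              WHERE d.id = vector_chunks.document_id)
          WHERE namespace = 'global'
            AND document_id IN (SELECT id FROM memory_docs
                                 WHERE namespace <> 'global');",
    )
}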

- memory/store/unified/init.rs: extend legacy-namespace migration to also
  re-home vector_chunks rows whose document_id now points at a namespace
  other than GLOBAL. Without this, namespace-scoped recall silently
  misses chunks for migrated docs. Idempotent — after both migrations
  run, no chunk under GLOBAL maps to a document in another namespace.

- provider_surfaces/store.rs: soft-cap RESPOND_QUEUE at 500 items with
  FIFO eviction from the tail, bounding in-memory growth under provider
  firehose volume before the SQLite-backed store lands.

- provider_surfaces/types.rs: drop rename_all = "camelCase" on
  RespondQueueItem and RespondQueueListResponse so the response matches
  the declared controller schema and ProviderEvent input (snake_case
  both ways). Tests updated to assert entity_id instead of entityId.
@jwalin-shah
Contributor Author

@senamakel — green + MERGEABLE, ready for a look when you have a sec.

@@ -0,0 +1,136 @@
//! Core operations for provider assistive surfaces.

@jwalin-shah (Contributor Author)

Fixed (not deferred). Commit 7342db3 adds static TEST_MUTEX: Mutex<()> = Mutex::new(()) in the tests module. Both ingest_event_upserts_queue_item and list_queue_returns_newest_first call _lock = TEST_MUTEX.lock().unwrap_or_else(|p| p.into_inner()) before store::clear_queue(), serializing them against each other so cargo's parallel test runner cannot interleave them. No external serial_test dependency needed.
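
Sketched minimally below; the test bodies and the store module are placeholders:

#[cfg(test)]
mod tests {
    use std::sync::Mutex;

    static TEST_MUTEX: Mutex<()> = Mutex::new(());

    #[tokio::test]
    async fn ingest_event_upserts_queue_item() {
        // Hold the guard for the whole test; into_inner() recovers the
        // lock even if a previous test panicked while holding it.
        let _lock = TEST_MUTEX.lock().unwrap_or_else(|p| p.into_inner());
        // store::clear_queue();
        // ... ingest an event, assert on the queue ...
    }
}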

@@ -0,0 +1,151 @@
//! Controller registry for `provider_surfaces`.

@jwalin-shah (Contributor Author)

Fixed (not deferred). Both items applied in commit 7342db3. (1) requires_attention: required changed from true to false in the ingest_event schema, matching the #[serde(default)] on ProviderEvent::requires_attention — registry validate_params and the struct deserializer now agree. (2) raw_payload: TypeSchema::Json changed to TypeSchema::Option(Box::new(TypeSchema::Json)) so generated CLI help and client codegen surface it as optional, consistent with all other optional inputs using the optional() helper.

@@ -0,0 +1,70 @@
//! Persistence for provider assistive surfaces.

@jwalin-shah (Contributor Author)

Fixed (not deferred). Commit d427226 adds const MAX_QUEUE_ITEMS: usize = 500 and truncates the tail of the queue after each upsert — bounding growth under provider firehose volume before the SQLite-backed store lands. The queue is prepend-ordered so oldest entries fall off the tail.
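
Roughly, the upsert path now behaves like this sketch, with RespondQueueItem reduced to an id field and the dedupe key assumed:

const MAX_QUEUE_ITEMS: usize = 500;

struct RespondQueueItem {
    id: String,
    // ... provider, entity_id, timestamp, etc.
}

fn upsert_queue_item(queue: &mut Vec<RespondQueueItem>, item: RespondQueueItem) {
    queue.retain(|existing| existing.id != item.id); // dedupe by id
    queue.insert(0, item);                           // prepend-ordered: newest first
    queue.truncate(MAX_QUEUE_ITEMS);                 // oldest entries fall off the tail
}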

@@ -0,0 +1,66 @@
//! Shared types for provider assistive surfaces.

@jwalin-shah (Contributor Author)

Fixed (not deferred). Commit d427226 drops #[serde(rename_all = "camelCase")] from both RespondQueueItem and RespondQueueListResponse. Both structs now serialize with snake_case like ProviderEvent and the declared controller schema inputs. The docstring explains the intent: callers send account_id/entity_id and receive account_id/entity_id — a symmetric contract. Test assertions in ops.rs updated from entityId to entity_id in the same commit.

@senamakel senamakel merged commit 5e660f2 into tinyhumansai:main Apr 23, 2026
7 checks passed
@jwalin-shah jwalin-shah deleted the pr/core-retrieval-v1 branch April 23, 2026 21:13
VectorJet pushed a commit to VectorJet/openhuman that referenced this pull request Apr 24, 2026
tinyhumansai#803)

Co-authored-by: Jwalin Shah <jshah1331@gmail.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
jwalin-shah pushed a commit to jwalin-shah/openhuman that referenced this pull request Apr 26, 2026
…ation)

Resolves conflicts after main absorbed:
  - tinyhumansai#835 inline-test extraction to sibling _tests.rs files
  - tinyhumansai#803 memory namespaces (Memory trait gained `namespace` arg + `namespace_summaries`)
  - tinyhumansai#839 memory_tree retrieval RPC handlers
  - install.sh resolver / smoke test changes

Conflict resolutions:
  - src/openhuman/agent/prompts/mod.rs + mod_tests.rs:
      kept branch's snapshot_pair production code (UserFilesSection now prefers
      session-frozen MemorySnapshot over workspace files), then applied tinyhumansai#835's
      extraction by moving the inline mod tests block to mod_tests.rs sibling.
      Tests retain the `curated_snapshot: None` additions from 1976ba4.
  - src/openhuman/agent/harness/subagent_runner/ops.rs + ops_tests.rs: same
      pattern — kept branch's session-snapshot wiring, extracted inline tests
      to ops_tests.rs sibling.
  - src/core/all.rs: kept both webview_apis (main) and life_capture +
      curated_memory (branch) controller registrations.
  - scripts/install.sh: kept branch's clearer "Resolved URL was empty…" diag
      and the test_install.sh resolver-unit-test job in installer-smoke.yml.

Compatibility patches:
  - ops_tests.rs NoopMemory updated to new Memory trait (namespace arg on
    store/get/list/forget, RecallOpts on recall, added namespace_summaries).
  - help/prompt.rs test fixture gets curated_snapshot: None.

cargo build --tests passes locally on darwin-aarch64.