test(life_capture): recall@k + MRR aggregator on retrieval eval #950
jwalin-shah wants to merge 41 commits into tinyhumansai:main from
Conversation
Drafts the wedge for OpenHuman: continuous ingestion of Gmail/Calendar/Slack/iMessage into a local personal_index.db, with a Today brief and retrieval-tuned chat as the two surfaces. Three composable privacy modes (Convenience / Hybrid / Fully local) sharing one code path. Track 1 unblocks the ship pipeline; Track 2 builds the spine over 4 weeks. Reuses ingestion logic from ~/projects/inbox.
Two implementation plans drafted from the life-capture spec:
- Track 1 (ship pipeline): fix Ubuntu installer smoke, land in-flight PRs tinyhumansai#806/tinyhumansai#786/tinyhumansai#788/tinyhumansai#797, wire Tauri auto-updater + signed Mac/Windows builds.
- Life-Capture #1 (foundation): SQLite + sqlite-vec personal_index.db, Embedder trait + HostedEmbedder (OpenAI), PII redaction, quoted-thread stripping, hybrid retrieval (vector + keyword + recency), controller schema + RPC. End-to-end test with synthetic items.
No ingestion or UI yet — those are subsequent milestone plans.
Refactors scripts/install.sh to expose resolve_asset_url and verify_asset_reachable. Adds scripts/test_install.sh that exercises the resolver against a committed fixture latest.json. Failures now report the resolved URL and the parse error instead of dying silently.
…le, keyword projection column
Adopts the Hermes pattern (NousResearch/hermes-agent, MIT) of an
agent-writable, char-bounded curated memory file pair. F15 builds the
char-bounded store with atomic writes; F16 wires a session-start
snapshot into the system prompt (preserves prefix cache); F17 exposes
memory.{add,replace,remove,read} through the controller dispatch so
both the agent loop and skills can use it.
Sits between TinyHumans synthesis (volatile) and personal_index.db
(raw events) — fills the "deliberately curated facts" gap that
neither covered.
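A minimal sketch of the F15 store shape, under hypothetical names (CuratedFile, add) — the in-tree API may differ:

```rust
use std::{fs, io, path::PathBuf};

// Hypothetical sketch: appends are rejected past the char budget, and writes
// go through temp-file + rename so a crash never leaves MEMORY.md half-written.
struct CuratedFile {
    path: PathBuf,
    char_limit: usize, // e.g. 2200 for MEMORY.md, 1375 for USER.md (Hermes limits)
}

impl CuratedFile {
    fn add(&self, entry: &str) -> io::Result<()> {
        let current = match fs::read_to_string(&self.path) {
            Ok(s) => s,
            Err(e) if e.kind() == io::ErrorKind::NotFound => String::new(),
            Err(e) => return Err(e), // don't rebuild from empty on transient I/O errors
        };
        let next = format!("{current}- {entry}\n");
        if next.chars().count() > self.char_limit {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("char limit {} exceeded", self.char_limit),
            ));
        }
        // Atomic write: temp file in the same directory, then rename over the original.
        let tmp = self.path.with_extension("md.tmp");
        fs::write(&tmp, &next)?;
        fs::rename(&tmp, &self.path)
    }
}
```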
Adds src/openhuman/life_capture/ module tree (stubs for embedder, index, migrations, quote_strip, redact, rpc, schemas, types, plus tests/) per Foundation Plan F1. Wires the module in src/openhuman/mod.rs. Adds sqlite-vec to deps and httpmock as a dev-dep. Reuses existing rusqlite/regex/once_cell/async-trait/tempfile already in tree (Option C — avoids sqlx/rusqlite libsqlite3-sys conflict).
…egex
Previously the country-code group required at least one digit (\+?\d{1,3}),
so '(415) 555-0123' (no leading country code) wouldn't match. Wrapped the
whole prefix in a non-capturing optional group so parenthesized area codes
match without a leading country code.
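For illustration, a hedged sketch of the structural change — the pattern below is not the in-tree PHONE regex, only the same shape of fix:

```rust
use regex::Regex;

fn main() {
    // The fix: the country-code prefix becomes one non-capturing OPTIONAL group
    // `(?:\+?\d{1,3}[ .-]?)?`, so a bare parenthesized area code still matches.
    let phone = Regex::new(r"(?:\+?\d{1,3}[ .-]?)?\(?\d{3}\)?[ .-]\d{3}[ .-]?\d{4}").unwrap();
    assert!(phone.is_match("(415) 555-0123")); // previously missed: no country code
    assert!(phone.is_match("+1 (415) 555-0123")); // still matches with one
}
```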
…te, char-bounded)
… table
Loads sqlite_vec via sqlite3_auto_extension exactly once per process so every new connection picks up the vec0 module. PersonalIndex wraps a single rusqlite Connection in Arc<Mutex<_>> for async sharing; open() runs migrations and sets WAL+foreign_keys, open_in_memory() is for tests. vec_version() round-trip and a 1536-dim INSERT + MATCH query both verified.
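For reference, the once-per-process registration pattern (a sketch following the sqlite-vec crate's documented rusqlite usage; the in-tree helper is ensure_vec_extension_registered):

```rust
use std::sync::Once;

static VEC_INIT: Once = Once::new();

/// Register sqlite-vec as an auto-extension exactly once per process, so every
/// rusqlite Connection opened afterwards sees the vec0 module automatically.
pub fn ensure_vec_registered() {
    VEC_INIT.call_once(|| unsafe {
        libsqlite3_sys::sqlite3_auto_extension(Some(std::mem::transmute(
            sqlite_vec::sqlite3_vec_init as *const (),
        )));
    });
}
```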
…_id) dedupe
Adds IndexWriter with upsert (ON CONFLICT(source, external_id) DO UPDATE — FTS stays in sync via triggers) and upsert_vector (DELETE + INSERT, since vec0 has no ON CONFLICT for virtual-table primary keys). Both wrap rusqlite work in spawn_blocking and share the PersonalIndex's Arc<Mutex<Connection>>. Exposes ensure_vec_extension_registered() crate-wide so the migrations test can register vec0 before opening its in-memory connection.
- keyword_search: FTS5 ranked by negated bm25 with snippet markers (« » …), plus the user:local ACL EXISTS clause so the same query shape works for the multi-token team v2 ACL without rewrites.
- vector_search: sqlite-vec MATCH with a `k = ?` clause (vec0 KNN requirement) and ORDER BY distance; score = 1/(1 + distance) so callers can blend it with keyword bm25 on the same monotonic scale.
A shared ItemRow + into_hit() helper lets both queries reuse the same row shape; rusqlite query_map closures hand-build it because rusqlite has no sqlx FromRow equivalent.
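A sketch of the two query shapes; table and column names (items, items_fts, item_vectors, acl) are illustrative assumptions, not the in-tree schema:

```rust
// Keyword leg: FTS5 ranked by bm25 — lower is better, so negate it for a
// higher-is-better score; snippet() produces the « » … markers.
const KW_SQL: &str = "
    SELECT i.id, -bm25(items_fts) AS score,
           snippet(items_fts, 1, '«', '»', '…', 12) AS snip
    FROM items_fts
    JOIN items i ON i.rowid = items_fts.rowid
    WHERE items_fts MATCH ?1
      AND EXISTS (SELECT 1 FROM acl a
                  WHERE a.item_id = i.id AND a.principal = 'user:local')
    ORDER BY bm25(items_fts) LIMIT ?2";

// Vector leg: vec0 KNN requires the `k = ?` constraint in the WHERE clause.
const VEC_SQL: &str = "
    SELECT item_id, distance
    FROM item_vectors
    WHERE embedding MATCH ?1 AND k = ?2
    ORDER BY distance";

// score = 1.0 / (1.0 + distance) maps distance into (0, 1], monotone in rank,
// so callers can blend it with the negated-bm25 keyword score.
```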
Pulls oversampled candidates (k*3 min 20) from both keyword and vector legs, normalises each independently to [0,1], then re-ranks with 0.55 * vec_norm + 0.35 * kw_norm + 0.10 * exp(-age_days/30). Documents present in only one leg get 0 for the missing signal but still compete on the others; only items with neither signal are dropped. Same-vector twin items now break their tie by recency.
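A minimal sketch of that blend (a later commit on this branch reworks the fusion around RRF, so treat the weights as this commit's formula):

```rust
/// `vec_norm` / `kw_norm` are each leg's scores min-max normalised to [0, 1];
/// None means the document only surfaced on the other leg.
fn fused_score(vec_norm: Option<f64>, kw_norm: Option<f64>, age_days: f64) -> f64 {
    let recency = (-age_days / 30.0).exp(); // 1.0 today, ~0.37 at 30 days
    0.55 * vec_norm.unwrap_or(0.0) + 0.35 * kw_norm.unwrap_or(0.0) + 0.10 * recency
}
```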
…rch)
Adds two controllers exposed via the registry:
- life_capture.get_stats — total items, per-source counts, last-ingest ts.
- life_capture.search — hybrid (vector + keyword + recency) search with embed-then-rank, optional k (default 10, capped at 100).
Runtime state (PersonalIndex + dyn Embedder) lives in life_capture::runtime behind a tokio OnceCell because handlers are stateless fn(Map<String,Value>) and have no per-call context object. F14 will call runtime::init() at app startup; until then handlers return a structured 'not initialised' error so the failure mode is loud, not silent. Schemas registered in core/all.rs alongside cron.
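The OnceCell pattern in sketch form (Runtime here is a stand-in for the real PersonalIndex + Arc<dyn Embedder> pair):

```rust
use tokio::sync::OnceCell;

struct Runtime; // stand-in: PersonalIndex + Arc<dyn Embedder>

static RUNTIME: OnceCell<Runtime> = OnceCell::const_new();

// Called once at startup (F14); handlers never see a per-call context object.
pub fn init(rt: Runtime) -> Result<(), String> {
    RUNTIME.set(rt).map_err(|_| "runtime already initialised".into())
}

// Handlers call this; before init() runs, the error is loud, not silent.
pub fn get() -> Result<&'static Runtime, String> {
    RUNTIME
        .get()
        .ok_or_else(|| "life_capture runtime not initialised".into())
}
```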
… → search)
FakeEmbedder hashes input bytes into a deterministic sparse 1536-dim vector so the same text round-trips identically through vec0 — keeps the e2e test hermetic, no network call. Verifies that:
- quote-strip drops the 'On … wrote:' block before indexing,
- redact masks the email address before indexing,
- the upserted item resurfaces via hybrid_search with both transformations preserved on the returned text.
After Config::load_or_init succeeds, open the personal_index.db at config.workspace_dir and register the life_capture runtime. Embedder is env-gated:
- OPENHUMAN_EMBEDDINGS_KEY > OPENAI_API_KEY
- OPENHUMAN_EMBEDDINGS_URL (default: https://api.openai.com/v1)
- OPENHUMAN_EMBEDDINGS_MODEL (default: text-embedding-3-small)
If no key is set we still open the index file (so Plan #2's ingestion worker can write to it) but skip runtime registration — controllers then return the 'not initialised' error from runtime::get() instead of silently calling a misconfigured embedder. Config-schema integration deferred to a follow-up; env-driven keeps F14 non-invasive while we land the rest of the foundation.
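A sketch of the env gate (variable names and defaults are from this commit; the code shape is illustrative — note the .filter(), which treats a blank var as unset, per a later review finding on this branch):

```rust
fn embedder_config() -> (Option<String>, String, String) {
    let key = std::env::var("OPENHUMAN_EMBEDDINGS_KEY")
        .ok()
        .filter(|s| !s.is_empty())
        .or_else(|| std::env::var("OPENAI_API_KEY").ok().filter(|s| !s.is_empty()));
    let url = std::env::var("OPENHUMAN_EMBEDDINGS_URL")
        .unwrap_or_else(|_| "https://api.openai.com/v1".into());
    let model = std::env::var("OPENHUMAN_EMBEDDINGS_MODEL")
        .unwrap_or_else(|_| "text-embedding-3-small".into());
    (key, url, model)
}
```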
…ection
snapshot_pair(memory, user) returns a MemorySnapshot containing plain strings — no reference back to the stores — so taking a snapshot at session start and reusing it across turns gives a stable system-prompt prefix and lets the LLM prefix-cache hit on every subsequent turn. Plan #1 also calls for wiring this into chat.rs' OpenClaw context loader, but openhuman assembles agent prompts per-agent under src/openhuman/agent/agents/*/prompt.rs (not the plan's stale src-tauri/src/commands/chat.rs path); the prompt-builder integration is deferred to Plan #2 alongside the agent-context refactor.
… controllers
Adds four controllers under the memory_curated namespace (the 'memory' namespace is already owned by the long-term memory subsystem):
- memory_curated.read — read MEMORY.md or USER.md
- memory_curated.add — append entry, char-bounded
- memory_curated.replace — substring replace, char-bounded
- memory_curated.remove — drop entries containing a needle
The 'file' input is a typed enum (memory|user) so adapters reject anything else at validation time. Runtime state lives in curated_memory::runtime behind a tokio OnceCell, mirroring life_capture's pattern: startup in jsonrpc.rs constructs both stores at <workspace>/memories/ with Hermes' char limits (memory: 2200, user: 1375) and registers the runtime. Handlers return 'not initialised' until init runs, so failure is loud.
…ings
Findings from Codex + Gemini second-opinion review:
1. IndexWriter::upsert orphaned vectors on re-ingest (Codex). On (source, external_id) conflict the row's id was kept but the caller's fresh UUID was used for upsert_vector — the vector wrote under an id no row joined to. Fix: explicit SELECT-then-UPDATE-or-INSERT in the same transaction; mutates caller's Item.id to the canonical id (so the next upsert_vector lands on the right row) and orphan-deletes any vector already written under the wrong id. Signature change: upsert(&[Item]) -> upsert(&mut [Item]).
2. upsert_vector DELETE+INSERT was outside a transaction (Gemini) — a failed INSERT permanently lost the item's vector. Now wrapped in a tx.
3. Runtime over-gated get_stats on the embedder (Codex). Split runtime into separate INDEX + EMBEDDER OnceCells: get_stats only requires the index, search requires both. Index initialises whenever the workspace dir is reachable; embedder is env-gated as before.
4. Startup race (both reviewers): runtime init lived inside the post-serve tokio::spawn block, so axum::serve was already accepting before the OnceCells were set. Hoisted both bootstraps (curated_memory and life_capture) into helpers called inline before axum::serve.
5. runtime::get error message lied (Codex) — said 'set embeddings.api_key in config' but startup actually reads env vars. Fixed text.
Bonus: rename controller namespace memory_curated -> curated_memory (Codex preference; nothing depends on it yet, so renamed before clients do).
Adds regression test for #1 (reingest_with_fresh_uuid_keeps_vector_findable). 4696 lib tests pass.
…emory namespace rename Pre-push hook auto-applied cargo fmt across the foundation files; also fixed the schemas.rs docstring still referencing the old memory_curated namespace.
Critical:
- curated_memory/store.rs: propagate read errors instead of `unwrap_or_default`, which could rewrite MEMORY.md / USER.md from an empty baseline on transient I/O or permission failures. Reject empty needle in replace/remove.
- curated_memory/rpc.rs: belt-and-suspenders empty-needle guard at the RPC boundary (remove with "" deletes every entry).
- life_capture/index.rs: upsert_vector rejects orphan ids (items row missing), which would have inserted vectors that never join in vector_search.
- life_capture/embedder.rs: validate response length matches input, indices are contiguous 0..n, and every vector matches dim() — prevents silent misalignment or wrong-dimensional vectors flowing into the 1536-wide sqlite-vec table. Added 30s request + 10s connect timeouts.
Retrieval hardening:
- life_capture/index.rs: keyword_search now escapes FTS5 operators by tokenizing on whitespace and wrapping each token as a quoted literal (see the sketch below). Prevents errors or unintended matches from stray quotes, AND/OR/NEAR, column specifiers in user input.
- life_capture/index.rs: hybrid_search applies q.sources / q.since / q.until via post-filter (single consistent pass across keyword+vector fusion).
- life_capture/rpc.rs: handle_search validates embedder dim and response length against the fixed 1536-wide index column — clear RPC error instead of a cryptic sqlite-vec failure when the embeddings model is swapped.
Docs:
- docs/event-bus.md: remove erroneous `.await` on register_native_global (it is sync; the handler closure is async).
- life-capture design spec: soften PII redaction claim — only regex is implemented today; light NER is flagged as a future enhancement.
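The FTS5 sanitiser in sketch form — the in-tree helper is named fts5_quote (per a later commit on this branch); the body below is an assumption, not the shipped implementation:

```rust
/// Treat user input as literals, never FTS5 syntax: wrap each whitespace
/// token in double quotes, escaping embedded quotes by doubling them.
fn fts5_quote(input: &str) -> String {
    input
        .split_whitespace()
        .map(|tok| format!("\"{}\"", tok.replace('"', "\"\"")))
        .collect::<Vec<_>>()
        .join(" ")
}
// fts5_quote(r#"AND col:x "near""#) == r#""AND" "col:x" """near""""#
```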
Rendered prompts now surface runtime curated-memory writes:
- `PromptContext` gains `curated_snapshot: Option<&MemorySnapshot>`.
- `UserFilesSection` prefers the snapshot over the workspace-file loader when one is attached, and injects `USER.md` alongside `MEMORY.md` using a byte-compatible `inject_snapshot_content` helper.
- `Session` carries `Option<Arc<MemorySnapshot>>`, populated by `ensure_curated_snapshot` on the first turn from `curated_memory::runtime::get()`. Reused across turns so prompt bytes stay frozen (KV-cache prefix contract) while mid-session `curated_memory.add/replace/remove` writes land on the NEXT session.
- `ParentExecutionContext` inherits the snapshot so sub-agents render identical `MEMORY.md`/`USER.md` blocks as the parent.
- Legacy workspace-file fallback preserved for embeds that don't initialise the curated-memory runtime (pure unit tests).
Loads a 32-item corpus across gmail/calendar/imessage/slack into an in-memory PersonalIndex and runs 12 queries through keyword / vector / hybrid paths, asserting must_contain / must_not_contain within per-query top-K prefixes. The vector leg uses a deterministic FNV-1a feature-hashing embedder (1536 dims, L2-normalized) so the test is offline and reproducible; real embedder swaps behind one call. Fixture reserves `relevant` and `pending` fields so recall@k / MRR / nDCG bolt onto the same JSON later. q-src-01 is marked pending: hybrid_search currently ignores Query.sources/since/until. Flip to false once filtering lands.
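The deterministic embedder in sketch form (tokenisation details are assumptions):

```rust
/// FNV-1a feature hashing: hash each whitespace token, bucket into `dim`
/// slots, then L2-normalise. Same text always yields the same vector, so
/// the eval runs offline and is reproducible.
fn hash_embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0f32; dim];
    for token in text.split_whitespace() {
        let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
        for b in token.as_bytes() {
            h ^= u64::from(*b);
            h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
        v[(h % dim as u64) as usize] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
    v
}
```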
Previously every read grabbed the same `Arc<Mutex<Connection>>` as the writer, so `IndexReader` calls serialised behind in-flight ingests even though WAL would allow parallel readers. Split `PersonalIndex` into:
- `writer: Arc<Mutex<Connection>>` — unchanged single-writer model.
- `reader_pool: Option<Arc<r2d2::Pool<SqliteConnectionManager>>>` — four-connection read pool on file-backed indexes, built alongside the writer after migrations run. Each pooled connection gets `query_only=ON` as a belt-and-suspenders guard; sqlite-vec is loaded automatically via the process-wide auto-extension.
`IndexReader` routes `keyword_search` / `vector_search` through a new `with_read_conn` helper that picks the pool when present and falls back to the writer lock otherwise. In-memory handles keep the single-connection layout — shared-cache URIs buy nothing at test-fixture scale and would force every test to grow a unique name.
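Pool construction in sketch form, assuming the r2d2_sqlite manager (pool size 4 per this commit):

```rust
use r2d2_sqlite::SqliteConnectionManager;
use std::path::Path;

// Each pooled connection gets query_only=ON; vec0 arrives automatically via
// the process-wide auto-extension registered before migrations run.
fn build_reader_pool(db_path: &Path) -> Result<r2d2::Pool<SqliteConnectionManager>, r2d2::Error> {
    let manager = SqliteConnectionManager::file(db_path)
        .with_init(|c| c.pragma_update(None, "query_only", "ON"));
    r2d2::Pool::builder().max_size(4).build(manager)
}
```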
…ndings
- Move bootstrap_curated_memory + bootstrap_life_capture out of src/core/jsonrpc.rs into their respective domain runtime modules (curated_memory::runtime::bootstrap, life_capture::runtime::bootstrap). jsonrpc.rs now calls the thin orchestration entrypoints instead of owning domain construction details, per the CLAUDE.md controller-registry separation rule. (addresses @coderabbitai on src/core/jsonrpc.rs:670)
- Add tracing debug! at entry/exit of embed_batch in HostedEmbedder, logging text count, URL, model, and returned vector count. (addresses @coderabbitai on src/openhuman/life_capture/embedder.rs:68)
- Add tracing debug!/trace! to keyword_search, vector_search, and hybrid_search: entry logs query_len/k/vec_dim, exit logs hit count, and hybrid_search emits trace! per-hit fused score components (kw_rrf, vc_rrf, recency, final score) without logging raw query text. (addresses @coderabbitai on src/openhuman/life_capture/index.rs:624)
- Add tracing debug!/warn! to handle_get_stats and handle_search: entry/exit debug! with text_len, k, total_items, hit count; warn! on all error branches (dim mismatch, embed_batch failure, empty vec, hybrid_search failure, task panic). (addresses @coderabbitai on src/openhuman/life_capture/rpc.rs:117)
…CodeRabbit items
- src/core/all.rs: add namespace_description() match arms for the life_capture and curated_memory namespaces (item 1)
- docs/superpowers/specs/2026-04-22-life-capture-layer-design.md: add 'text' language tag to the fenced code block to fix MD040 (item 2)
- AGENTS.md: fix cargo test example to use -p openhuman_core (the actual library crate name) instead of -p openhuman (item 3)
- src/openhuman/life_capture/redact.rs: tighten the PHONE regex to require a mandatory separator after the area code when no country code is present, preventing 8-digit invoice/order IDs from being redacted; add a test asserting 8-digit IDs are NOT redacted (item 4)
- src/openhuman/life_capture/migrations/0001_init.sql: drop the unused text_keyword column — FTS triggers only populate subject+text and nothing writes to text_keyword, so the column is dead weight (item 5)
- scripts/install.sh: move the SOURCE_ONLY early-exit guard to immediately after the flag is parsed (before variable definitions, color setup, function definitions, and the arg-parsing loop) so sourcing with --source-only is a true no-op with no side effects (item 8)
Items 6 (store.rs empty needle + error propagation), 7 (embedder timeout + logging + validation), 9 (upsert_vector orphan check), 11 (keyword_search FTS injection), 12 (hybrid_search filters), and 13 (rpc.rs dimension check) were already addressed in prior commits.
https://claude.ai/code/session_01BD65GFgzQtMcdemXNeBjwv
… 10)
ItemRow::into_hit previously coerced malformed DB rows to silent defaults (uuid::Uuid::nil(), Source::Gmail, chrono default timestamp), masking data corruption. Change the return type to anyhow::Result<Hit> and propagate errors:
- Unknown source string -> Err with context showing the bad value
- Malformed UUID in items.id -> Err with context
- Out-of-range unix timestamp -> Err with message
Both callsites (keyword_search, vector_search) now collect with .collect::<anyhow::Result<Vec<_>>>()? so a single bad row surfaces as an RPC error rather than a silently wrong Hit.
Items 9 (upsert_vector orphan guard), 11 (FTS injection via fts5_quote), 12 (hybrid_search filter application), and 13 (rpc.rs dimension-mismatch check) were already implemented in prior commits on this branch.
https://claude.ai/code/session_01BD65GFgzQtMcdemXNeBjwv
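The new shape in sketch form (field set reduced; names other than into_hit are stand-ins):

```rust
use anyhow::Context;

struct ItemRowSketch { id: String }
struct HitSketch { id: uuid::Uuid }

impl ItemRowSketch {
    /// Malformed rows now propagate instead of defaulting to Uuid::nil() etc.
    fn into_hit(self) -> anyhow::Result<HitSketch> {
        let id = uuid::Uuid::parse_str(&self.id)
            .with_context(|| format!("malformed UUID in items.id: {:?}", self.id))?;
        Ok(HitSketch { id })
    }
}
// Callsites: rows.into_iter().map(ItemRowSketch::into_hit)
//                .collect::<anyhow::Result<Vec<_>>>()?
```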
cargo fmt pedantry on life_capture/{index,redact}.rs (no behavior change).
Move install.sh --source-only early-exit back to *after* function
definitions. The earlier relocation (b0048c2) returned before
resolve_asset_url was defined, breaking scripts/test_install.sh on the
macOS smoke job with "command not found".
Transport file shouldn't own domain-specific construction details. Adds core::all::bootstrap_controller_runtimes(ws) that holds the domain list (curated_memory + life_capture today) and is called once from jsonrpc.rs before axum::serve. Ordering invariant preserved. Addresses CodeRabbit PR tinyhumansai#817 src/core/jsonrpc.rs:670.
Rust's `regex` crate doesn't support look-around; clippy's `invalid_regex` deny-lint blocked the build. Mandatory separators in the phone pattern already guard against mid-number matches (e.g. 8-digit invoice IDs), and `redact()` runs CC before PHONE so long digit runs are replaced first.
- curated_memory/store: seed new store from legacy root-level MEMORY.md on first
upgrade, truncated to char_limit. Prevents user memory loss across the file
layout change.
- life_capture/index: skip keyword_search early when sanitized query is empty
to avoid SQLite empty-MATCH errors / accidental full scans.
- life_capture/rpc: wrap search response as {"hits": [...]} for forward-compat
with paging/metadata fields.
- install.sh: verify asset URL is reachable (with DRY_RUN respected) before
attempting download so failures surface earlier.
- test_install.sh: stop swallowing source-only error output.
…ation)
Resolves conflicts after main absorbed:
- tinyhumansai#835 inline-test extraction to sibling _tests.rs files
- tinyhumansai#803 memory namespaces (Memory trait gained `namespace` arg + `namespace_summaries`)
- tinyhumansai#839 memory_tree retrieval RPC handlers
- install.sh resolver / smoke test changes
Conflict resolutions:
- src/openhuman/agent/prompts/mod.rs + mod_tests.rs: kept the branch's snapshot_pair production code (UserFilesSection now prefers the session-frozen MemorySnapshot over workspace files), then applied tinyhumansai#835's extraction by moving the inline mod tests block to the mod_tests.rs sibling. Tests retain the `curated_snapshot: None` additions from 1976ba4.
- src/openhuman/agent/harness/subagent_runner/ops.rs + ops_tests.rs: same pattern — kept the branch's session-snapshot wiring, extracted inline tests to the ops_tests.rs sibling.
- src/core/all.rs: kept both webview_apis (main) and life_capture + curated_memory (branch) controller registrations.
- scripts/install.sh: kept the branch's clearer "Resolved URL was empty…" diag and the test_install.sh resolver-unit-test job in installer-smoke.yml.
Compatibility patches:
- ops_tests.rs NoopMemory updated to the new Memory trait (namespace arg on store/get/list/forget, RecallOpts on recall, added namespace_summaries).
- help/prompt.rs test fixture gets curated_snapshot: None.
cargo build --tests passes locally on darwin-aarch64.
Adds graded relevance labels to the 10 in-scope queries in
`tests/fixtures/life_capture/corpus.json` (q-src-01 stays pending,
q-neg-01 has no relevant items by design) and computes recall@1/3/5
+ MRR per query, mean over the suite, and per-kind breakdowns.
Output runs under `--nocapture`:
[retrieval_eval] per-query metrics (10q):
id kind R@1 R@3 R@5 MRR
q-kw-01 keyword 0.50 0.50 0.50 1.00
...
MEAN 0.77 0.91 0.93 1.00
mean-keyword 0.75 0.88 0.88 1.00
mean-semantic 0.57 0.87 0.93 1.00
mean-mixed 1.00 1.00 1.00 1.00
This gives us a number that any retrieval change has to move. The
fixture already reserved a `relevant: [{ext_id, grade}]` field, so
no schema change — grades are stored but not yet used (binary
relevance only). Switching to nDCG later requires no fixture rewrite.
`assert_*` smoke checks are unchanged. Empty `relevant` => the query
is silently excluded from recall/MRR aggregation.
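For reference, the two metrics in sketch form (function names are assumptions; binary relevance, as stated above):

```rust
use std::collections::HashSet;

/// recall@k = |relevant ∩ top-k| / |relevant|. Callers skip queries whose
/// `relevant` set is empty, matching the aggregation rule above.
fn recall_at_k(ranked: &[String], relevant: &HashSet<String>, k: usize) -> f64 {
    let hits = ranked.iter().take(k).filter(|id| relevant.contains(*id)).count();
    hits as f64 / relevant.len() as f64
}

/// Reciprocal rank of the first relevant hit; 0.0 when none is ranked.
/// MRR is the mean of this over the query suite.
fn reciprocal_rank(ranked: &[String], relevant: &HashSet<String>) -> f64 {
    ranked
        .iter()
        .position(|id| relevant.contains(id))
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}
```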
📝 Walkthrough
This PR introduces two new domains—
Changes
Sequence Diagrams
sequenceDiagram
participant User as User/App
participant Runtime as Life Capture<br/>Runtime
participant Index as Personal<br/>Index
participant Embedder as Embedder<br/>(OpenAI)
participant DB as SQLite +<br/>sqlite-vec
User->>Runtime: bootstrap(workspace_dir)
Runtime->>Index: open(db_path)
Index->>DB: migrate() + init schema
Index->>DB: register vec0 extension
Runtime->>Embedder: new(api_key, model)
Runtime->>Runtime: store singletons
User->>Runtime: search("what was that?", k=10)
Runtime->>Embedder: embed_batch(["what was that?"])
Embedder-->>Runtime: [1536D vector]
Runtime->>Index: hybrid_search(text, vector, k=10)
Index->>DB: keyword_search (FTS5)
Index->>DB: vector_search (sqlite-vec distance)
Index->>Index: RRF + recency ranking
Index-->>Runtime: [Hit{score,snippet,item}]
Runtime-->>User: {hits: [...]}
sequenceDiagram
participant Agent as Agent<br/>Session
participant Turn as Turn<br/>Runtime
participant CuratedMem as Curated<br/>Memory Runtime
participant Store as Memory<br/>Store
participant PromptCtx as Prompt<br/>Context
Agent->>Agent: build() → new Agent
Agent->>Turn: first turn setup
Turn->>CuratedMem: get() runtime singleton
CuratedMem->>Store: snapshot_pair()
Store-->>CuratedMem: MemorySnapshot{memory,user}
Turn->>Turn: store snapshot in agent
Turn->>PromptCtx: build context
PromptCtx->>PromptCtx: curated_snapshot = Some(snapshot)
PromptCtx->>Turn: ready for prompt rendering
Turn->>Turn: render system prompt
Turn-->>Agent: exec turn with frozen snapshot
alt Sub-agent Call
Turn->>Agent: spawn_subagent()
Agent->>Agent: ParentExecutionContext{curated_snapshot}
Agent-->>Turn: sub-agent inherits snapshot
end
sequenceDiagram
participant Runtime as Curated<br/>Memory Runtime
participant RPC as RPC<br/>Handler
participant Store as Memory<br/>Store
participant FS as Filesystem
Runtime->>Runtime: bootstrap(workspace_dir)
Runtime->>Store: open(memories/memory)
Store->>FS: create/migrate MEMORY.md
Store-->>Runtime: MemoryStore(memory)
Runtime->>Store: open(memories/user)
Store->>FS: create/migrate USER.md
Store-->>Runtime: MemoryStore(user)
Client->>RPC: handle_add(file="memory", entry="...")
RPC->>Store: add(entry)
Store->>FS: append (atomic temp+rename)
Store-->>RPC: Ok(())
RPC-->>Client: {file, ok: true}
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/agent/debug/mod.rs (1)
381-396: ⚠️ Potential issue | 🟡 Minor — Verify the dump still matches production when integrations_agent is spawned as a subagent with inherited parent context.
The module doc and DumpedPrompt::text comment claim the dump is byte-identical to what the LLM sees on turn 1. However, integrations_agent is never a turn-1 agent in production — it's always spawned as a typed subagent via spawn_subagent(agent_id="integrations_agent", toolkit=…) by the orchestrator or other agents. In that subagent scenario (per subagent_runner/ops.rs:533), the child inherits parent.curated_snapshot.as_deref() and renders byte-identical MEMORY.md/USER.md blocks per the ParentExecutionContext contract.
The debug dump creates a fresh Agent::from_config_for_agent(…) without parent context and hardcodes curated_snapshot: None, so it shows a baseline prompt that never occurs in production. If the goal is to reflect what a live spawned integrations_agent actually receives, either:
- Accept an optional parent_context: Option<&ParentExecutionContext> parameter to render_integrations_agent and thread parent.curated_snapshot through, or
- Update the module doc to clarify that the dump shows the base prompt before parent-context inheritance (distinct from the production subagent runtime).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/debug/mod.rs` around lines 381 - 396, The debug dump builds a PromptContext with curated_snapshot: None which doesn't match production for integrations_agent spawned as a subagent; modify render_integrations_agent (or the function that calls Agent::from_config_for_agent) to accept an Option<&ParentExecutionContext> parent_context and, when present, set PromptContext.curated_snapshot = parent_context.curated_snapshot.as_deref(), ensuring you thread the parent_context through to where PromptContext is constructed (referencing PromptContext, render_integrations_agent, Agent::from_config_for_agent, and curated_snapshot); alternatively, if you prefer not to change the API, update the module doc and DumpedPrompt::text comment to explicitly state the dump is the base prompt before parent-context inheritance.
🧹 Nitpick comments (16)
src/openhuman/life_capture/redact.rs (1)
34-45: Best-effort CC pattern can match ISBN-13 / similar structured IDs.
\b(?:\d[ \-]?){12,18}\d\b will redact e.g. an ISBN-13 like 978-3-16-148410-0 or a 14-digit GTIN as <CC>. That's typically fine for a personal-indexing redactor (false-positive PII redaction is preferred over leaking), but worth a one-line note in the doc-comment so future readers don't try to "fix" it as a bug.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/redact.rs` around lines 34 - 45, Add a one-line clarifying comment to the redact function's doc-comment explaining that the CC regex (static CC) is intentionally broad and may match ISBN-13/GTIN or other long numeric identifiers (e.g., "978-3-16-148410-0"), and that false-positive redaction is preferred to leaking PII; update the comment near the CC declaration or above pub fn redact to mention this so future readers won't treat those matches as a bug.
src/openhuman/agent/agents/morning_briefing/prompt.rs (1)
64-64: LGTM. Same mechanical curated_snapshot: None test-fixture update.
Side note (optional, applies repo-wide, not blocking): the six agents (archivist, tool_maker, trigger_triage, morning_briefing, critic, integrations_agent's ctx_with) now share an identical PromptContext test fixture body modulo agent_id. A small test_support helper that returns a baseline PromptContext would collapse all of these and make the next field addition a one-line change instead of N. Not for this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agents/morning_briefing/prompt.rs` at line 64, The test fixture update simply sets curated_snapshot: None in the PromptContext instance used by the morning_briefing agent; keep that change but also consider (non-blocking) extracting a shared test helper function (e.g., a test_support::baseline_prompt_context) that returns a baseline PromptContext to avoid duplicated fixtures across agents (archivist, tool_maker, trigger_triage, morning_briefing, critic, integrations_agent::ctx_with); for this PR just ensure the curated_snapshot: None assignment is applied to the PromptContext used in morning_briefing's tests and leave the refactor to a separate change.
src/core/jsonrpc.rs (1)
667-675: Include the underlying error in the bootstrap-failure log.
Err(_) discards the error; when this branch ever fires (corrupt config, IO error, schema migration), an operator only sees the generic message. Surface {err} so the cause is greppable in logs alongside the existing [core] prefix you already use elsewhere on this path.
♻️ Proposed diff
-match crate::openhuman::config::Config::load_or_init().await {
-    Ok(config) => {
-        crate::core::all::bootstrap_controller_runtimes(&config.workspace_dir).await;
-    }
-    Err(_) => log::warn!(
-        "[core] config load failed during runtime bootstrap; \
-         controllers with pre-serve runtimes will return 'not initialised'"
-    ),
-}
+match crate::openhuman::config::Config::load_or_init().await {
+    Ok(config) => {
+        crate::core::all::bootstrap_controller_runtimes(&config.workspace_dir).await;
+    }
+    Err(err) => log::warn!(
+        "[core] config load failed during runtime bootstrap ({err}); \
+         controllers with pre-serve runtimes will return 'not initialised'"
+    ),
+}
Based on learnings: "Add substantial, development-oriented logs on new/changed flows for end-to-end debugging; use structured, grep-friendly context".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/jsonrpc.rs` around lines 667 - 675, The Err arm of the async match over crate::openhuman::config::Config::load_or_init() currently discards the error; change Err(_) to capture the error (e.g., Err(err)) and include that error in the log message emitted from the failing branch so the operator can see the underlying cause. Update the log::warn! call that follows the failed load so it appends the captured error (using a readable formatter such as {:?} or {}) while preserving the existing "[core]" context and keep bootstrap_controller_runtimes(&config.workspace_dir).await called only on Ok(config).
scripts/test_install.sh (1)
23-27: Optional: assert specific exit code (3) for missing platform.
resolve_asset_url is documented to return exit code 3 specifically on missing platform (vs 2 parse error / 4 unreachable). Asserting only "non-zero" lets a future regression where a parse error masquerades as a missing-platform failure slip through.
♻️ Tighten the assertion
-# Also test a missing platform produces exit code 3
-if resolve_asset_url "$FIXTURE" "linux" "aarch64" 2>/dev/null; then
-  echo "FAIL: expected non-zero exit for missing platform linux-aarch64"
-  exit 1
-fi
+# Missing platform should produce the documented exit code 3.
+set +e
+resolve_asset_url "$FIXTURE" "linux" "aarch64" 2>/dev/null
+rc=$?
+set -e
+if [[ "$rc" -ne 3 ]]; then
+  echo "FAIL: expected exit code 3 for missing platform linux-aarch64, got $rc"
+  exit 1
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test_install.sh` around lines 23 - 27, The test should assert the exact exit code 3 from resolve_asset_url for a missing platform instead of just non-zero; run resolve_asset_url "$FIXTURE" "linux" "aarch64", capture its exit status (e.g., ret=$?), and fail the test unless ret == 3, emitting a clear message if it differs; reference the existing test invocation of resolve_asset_url and the FIXTURE variable to locate where to replace the generic non-zero check with this exact-status assertion.
src/openhuman/agent/prompts/mod.rs (1)
949-980: Optional: extract shared truncation helper.
inject_snapshot_content and inject_workspace_file_capped (lines 1011-1029) now share the exact char-bounded truncation + header + truncation-marker logic. Worth pulling that into a small inject_labeled_block(prompt, label, content, max_chars) helper so the two callers can't drift on truncation semantics.
♻️ Sketch
+fn inject_labeled_block(prompt: &mut String, label: &str, content: &str, max_chars: usize) {
+    let trimmed = content.trim();
+    if trimmed.is_empty() {
+        return;
+    }
+    let _ = writeln!(prompt, "### {label}\n");
+    let truncated = if trimmed.chars().count() > max_chars {
+        trimmed
+            .char_indices()
+            .nth(max_chars)
+            .map(|(idx, _)| &trimmed[..idx])
+            .unwrap_or(trimmed)
+    } else {
+        trimmed
+    };
+    prompt.push_str(truncated);
+    if truncated.len() < trimmed.len() {
+        let _ = writeln!(
+            prompt,
+            "\n\n[... truncated at {max_chars} chars — use `read` for full file]\n"
+        );
+    } else {
+        prompt.push_str("\n\n");
+    }
+}
inject_snapshot_content becomes a thin wrapper; inject_workspace_file_capped calls it after the file read.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/prompts/mod.rs` around lines 949 - 980, inject_snapshot_content and inject_workspace_file_capped duplicate the same char-bounded truncation, header and truncation-marker logic; extract that shared logic into a small helper function (e.g., inject_labeled_block(prompt: &mut String, label: &str, content: &str, max_chars: usize)) and have both inject_snapshot_content and inject_workspace_file_capped call it. The helper should preserve current behavior: skip empty/whitespace content, write the "### {label}\n" header, truncate by Unicode character count using char_indices().nth(max_chars) to slice safely, append the truncated text, and emit the same "\n\n[... truncated at {max_chars} chars — use `read` for full file]\n" marker when truncated (otherwise append "\n\n"). Ensure existing callers pass the same max_chars and that string trimming and newline behavior remain identical.
tests/life_capture_retrieval_eval.rs (1)
308-343: Minor: spell out the recall_at_1 cap in the doc comment.
For queries where relevant.len() > 1, recall_at_1 is bounded above by 1/|relevant|, so the MEAN R@1 mechanically can't reach 1.0 even when the top hit is always a relevant doc (which is what the baseline MRR=1.00 already implies). Worth a sentence on QueryMetrics so future readers don't read R@1=0.77 as "23% of top-1 results are wrong" when it can also mean "queries average ~1.3 relevant docs and we only inspect one slot."
This is purely a clarity nit — the math is correct.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/life_capture_retrieval_eval.rs` around lines 308 - 343, Add a one-sentence clarification to the QueryMetrics struct doc comment next to the recall_at_1 field (or the struct-level comment) stating that for queries with more than one relevant document recall_at_1 is bounded above by 1/|relevant| (so mean R@1 cannot reach 1.0 even if the top hit is always relevant), and optionally note that MRR=1.00 implies top-hit relevance while R@1 is constrained by the average number of relevant docs; update the comment near QueryMetrics and the recall_at_1 reference accordingly.
scripts/install.sh (1)
229-249: Optional: fold LATEST_VERSION extraction into resolve_asset_url.
resolve_from_latest_json now invokes python3 twice on the same latest.json — once via resolve_asset_url and once at lines 245-249 to fish out version. A single python helper that prints version\nurl (and parses both in one json.load) avoids the double parse and the second open-file path.
Not blocking — the second call is best-effort with || true.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/install.sh` around lines 229 - 249, The script currently calls python3 twice on the same latest.json: once via resolve_asset_url and again to set LATEST_VERSION; change resolve_asset_url to parse latest.json once and return both the asset URL and the version (e.g. output "url|version" or two newline-separated values), then update the caller so ASSET_URL/ASSET_NAME are derived from resolve_asset_url's first value and LATEST_VERSION from its second; update resolve_asset_url's callers and error handling to preserve the existing exit codes/behavior.
src/openhuman/life_capture/migrations.rs (1)
30-47: Consider using unchecked_transaction() instead of manual transaction handling, for consistency.
rusqlite::Connection::unchecked_transaction() is already used throughout the codebase and provides implicit rollback on Drop. This simplifies the code and avoids the closure pattern while maintaining the same atomicity guarantees.
♻️ Suggested change
-// Run the migration SQL and record it atomically.
-conn.execute_batch("BEGIN")?;
-let result = (|| -> Result<()> {
-    conn.execute_batch(sql)?;
-    conn.execute(
-        "INSERT INTO _life_capture_migrations(name, applied_at) \
-         VALUES (?1, CAST(strftime('%s','now') AS INTEGER))",
-        rusqlite::params![name],
-    )?;
-    Ok(())
-})();
-match result {
-    Ok(()) => conn.execute_batch("COMMIT")?,
-    Err(e) => {
-        let _ = conn.execute_batch("ROLLBACK");
-        return Err(e);
-    }
-}
+let tx = conn.unchecked_transaction()?;
+tx.execute_batch(sql)?;
+tx.execute(
+    "INSERT INTO _life_capture_migrations(name, applied_at) \
+     VALUES (?1, CAST(strftime('%s','now') AS INTEGER))",
+    rusqlite::params![name],
+)?;
+tx.commit()?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/migrations.rs` around lines 30 - 47, Replace the manual BEGIN/COMMIT/ROLLBACK and closure pattern with rusqlite's unchecked_transaction() to mirror the rest of the codebase: start an unchecked transaction via conn.unchecked_transaction(), run the migration SQL and the INSERT into _life_capture_migrations using the transaction handle (instead of conn.execute_batch/conn.execute), and then call tx.commit()? on success so Drop-based rollback handles failures; update references to the closure and explicit ROLLBACK/COMMIT logic around conn.execute_batch and the inner INSERT to use the transaction API instead.
src/openhuman/life_capture/rpc.rs (1)
78-83: Optional: simplify &Arc<dyn Embedder> to &dyn Embedder.
handle_search doesn't need to clone the Arc, only call methods on it, so the borrow is sufficient and slightly more idiomatic. Callers would change from &embedder (where embedder: Arc<dyn Embedder>) to embedder.as_ref() or &*embedder.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/rpc.rs` around lines 78 - 83, Change the handle_search signature to take a borrowed trait object instead of a reference to an Arc: replace the parameter type "&Arc<dyn Embedder>" with "&dyn Embedder" in the function signature (pub async fn handle_search(..., embedder: &dyn Embedder, ...)). Update all call sites that currently pass "&embedder" where embedder: Arc<dyn Embedder> to pass "embedder.as_ref()" or "&*embedder" (or simply "embedder" if they already have &Arc) so the function receives a &dyn Embedder; no other logic needs to change inside handle_search since it only calls methods on the embedder trait.
src/openhuman/curated_memory/runtime.rs (1)
34-65: Recommended: extract char limits to named constants.
2200 and 1375 are unexplained magic numbers right at the boundary where stores are opened. A pair of named constants would document intent (which file gets which budget and why), and would also be reusable from tests/migrations.
♻️ Proposed refactor
+const MEMORY_CHAR_LIMIT: usize = 2200;
+const USER_CHAR_LIMIT: usize = 1375;
+
 pub async fn bootstrap(workspace_dir: &std::path::Path) {
     let mem_dir = workspace_dir.join("memories");
     let memory_store = crate::openhuman::curated_memory::MemoryStore::open(
         &mem_dir,
         crate::openhuman::curated_memory::MemoryFile::Memory,
-        2200,
+        MEMORY_CHAR_LIMIT,
     );
     let user_store = crate::openhuman::curated_memory::MemoryStore::open(
         &mem_dir,
         crate::openhuman::curated_memory::MemoryFile::User,
-        1375,
+        USER_CHAR_LIMIT,
     );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/curated_memory/runtime.rs` around lines 34 - 65, The bootstrap function uses unexplained magic numbers 2200 and 1375 when calling MemoryStore::open for MemoryFile::Memory and MemoryFile::User; extract these into clearly named constants (e.g., MEMORY_CHAR_LIMIT, USER_CHAR_LIMIT) near the top of the module and replace the numeric literals in bootstrap so the intent and reuse from tests/migrations are clear; update any references to MemoryStore::open in this file to use the new constants and ensure they are public if tests/migrations outside the module need access.
src/openhuman/curated_memory/store.rs (2)
79-93: Minor: replace error drops the actual limit value, unlike add.
add() reports format!("char limit {} exceeded", self.char_limit), but replace() returns the bare string "char limit". The asymmetry makes the error harder to act on (no number, no verb). Same fix would apply to a future remove() if it ever gains a limit check.
♻️ Proposed change
-if next.chars().count() > self.char_limit {
-    return Err(std::io::Error::new(std::io::ErrorKind::Other, "char limit"));
-}
+if next.chars().count() > self.char_limit {
+    return Err(std::io::Error::new(
+        std::io::ErrorKind::Other,
+        format!("char limit {} exceeded", self.char_limit),
+    ));
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/curated_memory/store.rs` around lines 79 - 93, The replace() method's error message for exceeding char_limit is terse; update the Err returned in replace() so it matches add() by including the limit value and a verb (e.g. format!("char limit {} exceeded", self.char_limit)); locate the Err created after the next.chars().count() > self.char_limit check in replace() and replace the bare "char limit" string with a formatted message using self.char_limit (and apply the same pattern to any future remove() limit checks).
14-56: Optional: MemoryStore::open does sync std::fs I/O from async callers.
bootstrap(...) in curated_memory/runtime.rs is an async fn, but it calls this sync open(...), which performs create_dir_all, read_to_string, and write directly on the runtime thread. It only runs at startup and the files are tiny, so impact is minimal — but switching to tokio::fs would keep the module consistent with the async I/O used in read/add/replace/remove and avoid surprising future callers who invoke open later in a request path. Also, the migration silently drops read_to_string errors via .ok(); a log::warn! on Err would help diagnose seed-migration failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/curated_memory/store.rs` around lines 14 - 56, MemoryStore::open is doing blocking std::fs I/O from async callers (e.g., curated_memory::runtime::bootstrap) and also swallows read_to_string errors; change open to be async (pub async fn open(...)->std::io::Result<Self>) and replace blocking calls with tokio::fs equivalents (tokio::fs::create_dir_all, tokio::fs::read_to_string, tokio::fs::write) awaiting each call, keep the same migration/truncation logic for seed, and update any callers (notably bootstrap) to await the new async open; additionally, when the legacy read_to_string returns Err, emit log::warn! including the error instead of silently dropping it, while preserving the rest of the flow and existing symbols (open, bootstrap, and the read/add/replace/remove async API consistency).
src/openhuman/life_capture/schemas.rs (1)
165-173: Minor: as usize cast can overflow on 32-bit, though the clamp downstream masks it.
read_optional_u64(...).unwrap_or(10) as usize truncates rather than rejects values larger than usize::MAX. The downstream k.clamp(1, 100) in rpc::handle_search makes this safe today, but it would be clearer to validate at the controller boundary so behavior is independent of downstream clamping.
♻️ Optional defensive validation
-let k = read_optional_u64(&params, "k")?.unwrap_or(10) as usize;
+let k_u64 = read_optional_u64(&params, "k")?.unwrap_or(10);
+let k = usize::try_from(k_u64).unwrap_or(usize::MAX).min(100).max(1);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/schemas.rs` around lines 165 - 173, In handle_search, the expression read_optional_u64(...).unwrap_or(10) as usize can overflow on 32-bit targets; validate the optional u64 before casting to usize instead of relying on rpc::handle_search's clamp. Change the logic in handle_search (where read_optional_u64 is used) to: obtain the Option<u64>, use unwrap_or(10) to get a u64, then convert via try_into() (or check value <= usize::MAX) and return an Err string if it doesn't fit; pass the safely converted usize into rpc::handle_search so the controller enforces valid bounds independent of rpc::handle_search's k.clamp(1,100).
src/openhuman/life_capture/embedder.rs (1)
22-26: Optional: don't silently fall back to a timeout-less client.
unwrap_or_else(|_| reqwest::Client::new()) produces a client with no timeouts if the configured builder ever fails. That defeats the purpose of the timeouts and can manifest as request-thread hangs. Prefer fail-fast at startup:
♻️ Proposed change
 let http = reqwest::Client::builder()
     .timeout(std::time::Duration::from_secs(30))
     .connect_timeout(std::time::Duration::from_secs(10))
     .build()
-    .unwrap_or_else(|_| reqwest::Client::new());
+    .expect("[life_capture] failed to build embedder HTTP client");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/embedder.rs` around lines 22 - 26, The current code builds the reqwest client into `http` using `reqwest::Client::builder()` but silently falls back to `reqwest::Client::new()` via `unwrap_or_else(|_| ...)`, which yields a client without the configured timeouts; change this to fail-fast instead: remove the silent fallback on `.build()` and propagate or panic on error (e.g., replace `unwrap_or_else(|_| reqwest::Client::new())` with an explicit `expect`/`?`-style handling) so that `http` is never created without the specified timeouts and startup fails clearly if the builder cannot construct the client.
src/openhuman/life_capture/index.rs (2)
169-178: Silent fallback to empty source on serde failure can corrupt lookups.
If serde_json::to_value(item.source) returns Err or a non-string value, source becomes "" and the subsequent SELECT … WHERE source = ?1 AND external_id = ?2 matches an unintended row class instead of failing loudly. Same shape for author_json/metadata_json: a serialize failure silently writes the literals "null"/"{}". For a Source enum with #[serde(rename_all = …)] this is currently a no-op, but it's a quiet footgun the day someone adds a non-unit variant.
♻️ Surface the error instead of swallowing it
-let source = serde_json::to_value(item.source)
-    .ok()
-    .and_then(|v| v.as_str().map(str::to_string))
-    .unwrap_or_default();
-let author_json = item
-    .author
-    .as_ref()
-    .map(|a| serde_json::to_string(a).unwrap_or_else(|_| "null".into()));
-let metadata_json = serde_json::to_string(&item.metadata)
-    .unwrap_or_else(|_| "{}".into());
+let source = match serde_json::to_value(item.source)
+    .context("serialize Item::source")?
+{
+    serde_json::Value::String(s) => s,
+    other => anyhow::bail!("Source did not serialize to a JSON string: {other:?}"),
+};
+let author_json = item
+    .author
+    .as_ref()
+    .map(serde_json::to_string)
+    .transpose()
+    .context("serialize Item::author")?;
+let metadata_json = serde_json::to_string(&item.metadata)
+    .context("serialize Item::metadata")?;
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/life_capture/index.rs` around lines 169 - 178, The current code silently replaces serialization failures for source, author_json, and metadata_json (serde_json::to_value(item.source).ok().and_then(...).unwrap_or_default() and the unwrap_or_else fallbacks) with empty/"null"/"{}" which can corrupt lookups (the SELECT WHERE source = ?1 ... may match the wrong row); instead change the code to propagate/return a serialization error when serde_json::to_value(item.source) or serde_json::to_string(...) for item.author/item.metadata fails (e.g., use the ? operator or map_err to convert into the function's Result error type) so the operation fails fast and no silent defaults are written. Ensure the variables named source, author_json, and metadata_json are produced only after successful serialization and that callers handle the Result accordingly.
571-616: Hybrid post-filtering can under-fill results when sources/since/until are restrictive.
oversample = (q.k * 3).max(20) pulls ≤ 2*oversample unique candidates into the fusion map; apply_query_filters then drops anything outside the requested sources/time window before truncating to q.k. With a narrow filter (e.g., sources=[Slack] while most candidates are Gmail), the caller can receive far fewer than q.k hits even when plenty of qualifying rows exist further down the rank list.
Either push these filters into the SQL of keyword_search/vector_search (so oversampling is over the filtered candidate space), or grow oversample adaptively when filters are non-empty. SQL pushdown is the cleaner option and removes a Rust-side scan over hits you'll discard.
🤖 Prompt for AI Agents
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AGENTS.md`:
- Line 459: The README/guide currently instructs running "cargo test -p
openhuman_core" which is incorrect because openhuman_core is a lib target, not a
package; update the command to "cargo test -p openhuman" (or remove -p entirely
and use just "cargo test" if you prefer running tests for the whole workspace)
so the -p flag references the actual package name ("openhuman") instead of the
lib name ("openhuman_core"); change the occurrence of cargo test -p
openhuman_core to cargo test -p openhuman in the document text.
In `@Cargo.toml`:
- Around line 58-61: The init closure used to initialize the SQLite pool only
sets PRAGMA via c.pragma_update (the closure that configures connections) but
never loads the sqlite-vec extension; add a call to sqlite_vec::load(&conn)?
inside that same init closure (where c.pragma_update(None, "query_only", "ON")
is invoked) so each new connection has the extension registered (use the
existing conn variable and propagate the error as needed); this ensures
sqlite_vec::load is called for r2d2_sqlite/rusqlite pooled connections before
they are returned.
In `@scripts/install.sh`:
- Around line 6-14: The top-level "set -euo pipefail" leaks into the caller when
this script is sourced; change the install guard so that either (a) do not apply
"set -euo pipefail" until after you check SOURCE_ONLY (the --source-only loop)
and only enable it when SOURCE_ONLY is not set, or (b) save the caller's shell
options (e.g., via "old_opts=$(set +o)" or "old_status=$(set -o)" style) before
applying "set -euo pipefail" and restore them before returning; update the code
around the SOURCE_ONLY check and the install-body entry points (references: the
SOURCE_ONLY variable, the --source-only argument handling loop, and the
top-level set -euo pipefail) so sourcing with --source-only does not mutate the
parent shell options.
In `@src/openhuman/agent/harness/subagent_runner/ops_tests.rs`:
- Around line 211-214: The function signature for make_parent is currently split
across multiple lines causing rustfmt failure; collapse the signature to a
single line so rustfmt accepts it — update the declaration of fn make_parent to
place provider: Arc<dyn Provider>, tools: Vec<Box<dyn Tool>> and the return type
-> ParentExecutionContext all on one line (keeping the existing body unchanged)
and ensure the symbols make_parent, Provider, Tool, and ParentExecutionContext
are used exactly as in the file.
- Around line 123-125: The multi-line use declaration and the multi-line
function signature cause rustfmt failures; change the multi-line use of
crate::openhuman::providers to a single-line import (use
crate::openhuman::providers::{ChatRequest as PChatRequest, ChatResponse,
Provider, ToolCall};) and collapse the make_parent function signature (the
make_parent definition spanning lines 211–214) into a single line signature to
match rustfmt expectations; update only the formatting, keeping identifiers and
types unchanged.
In `@src/openhuman/curated_memory/rpc.rs`:
- Around line 31-39: The handle_add function must reject empty entries: before
calling pick(&file, rt) and store.add(&entry).await, check that entry is not
empty (e.g., entry.is_empty() or entry.trim().is_empty() per project convention)
and return an Err(String) with a clear message (consistent with other handlers
like remove/replace, e.g., format!("add: empty entry") or "entry must be
non-empty") instead of appending a stray separator; only call store.add when the
entry is non-empty.
In `@src/openhuman/life_capture/embedder.rs`:
- Around line 12-34: HostedEmbedder currently hard-codes dim() to 1536 while
new() accepts any model string, causing mismatch with INDEX_VECTOR_DIM; update
HostedEmbedder to carry the embedding dimension as configuration (add a dim:
usize field and expose a dim() -> usize method), change HostedEmbedder::new(...)
to accept a dim parameter (or a Model→dim resolver in runtime.rs that maps
OPENHUMAN_EMBEDDINGS_MODEL to the correct dimension), and update sites that
construct HostedEmbedder (runtime.rs) and any checks in rpc::handle_search to
use the new dim field so the pre-flight comparison against INDEX_VECTOR_DIM is
accurate; ensure embed_batch callers still work with the actual returned
dimension.
In `@src/openhuman/life_capture/index.rs`:
- Around line 1-336: The file is too large and should be split into focused
modules: keep the public types and process-wide init in index/mod.rs
(PersonalIndex, ensure_vec_extension_registered, ReaderPool, READER_POOL_SIZE),
move writer logic into index/writer.rs (IndexWriter, its upsert and
upsert_vector, and any writer-side helpers like Source/serde_json fallbacks),
move read/query logic into index/reader.rs (IndexReader, ItemRow,
with_read_conn, fts5_quote, apply_query_filters and reader helpers), and
relocate each #[cfg(test)] mod next to the code it tests; update mod
declarations and use statements accordingly so all referenced symbols
(PersonalIndex, IndexWriter, IndexReader, ItemRow, with_read_conn, fts5_quote,
apply_query_filters, VEC_REGISTERED/ensure_vec_extension_registered) remain
accessible.
In `@src/openhuman/life_capture/runtime.rs`:
- Around line 96-106: The current env lookup treats an empty
OPENHUMAN_EMBEDDINGS_KEY as a valid value and thus never falls back to
OPENAI_API_KEY; update the retrieval so empty strings are treated as unset: call
std::env::var("OPENHUMAN_EMBEDDINGS_KEY") and std::env::var("OPENAI_API_KEY")
but filter out empty results (e.g. map to Option and .filter(|s| !s.is_empty())
or convert empty Ok("") into None) before applying the precedence logic that
assigns to the local api_key and the subsequent let Some(api_key) = api_key else
{ ... } block so a blank OPENHUMAN_EMBEDDINGS_KEY will correctly fall back to
OPENAI_API_KEY.
In `@tests/life_capture_retrieval_eval.rs`:
- Around line 197-226: run_query currently requests only q.k hits so R@5
collapses to R@q.k for queries with k<5; change the search call to request a
fetch_k = q.k.max(5) (use fetch_k when calling reader.keyword_search,
vector_search and hybrid_search) and still return hit_ext_ids(&hits) (leave any
budget-aware assertions like must_contain_in_top using q.k unchanged). Apply the
same change in the other occurrence identified around lines 318-343 (use fetch_k
= q.k.max(5) for retrievals, but keep downstream checks that rely on q.k).
---
Outside diff comments:
In `@src/openhuman/agent/debug/mod.rs`:
- Around line 381-396: The debug dump builds a PromptContext with
curated_snapshot: None which doesn't match production for integrations_agent
spawned as a subagent; modify render_integrations_agent (or the function that
calls Agent::from_config_for_agent) to accept an Option<&ParentExecutionContext>
parent_context and, when present, set PromptContext.curated_snapshot =
parent_context.curated_snapshot.as_deref(), ensuring you thread the
parent_context through to where PromptContext is constructed (referencing
PromptContext, render_integrations_agent, Agent::from_config_for_agent, and
curated_snapshot); alternatively, if you prefer not to change the API, update
the module doc and DumpedPrompt::text comment to explicitly state the dump is
the base prompt before parent-context inheritance.
---
Nitpick comments:
In `@scripts/install.sh`:
- Around line 229-249: The script currently calls python3 twice on the same
latest.json: once via resolve_asset_url and again to set LATEST_VERSION; change
resolve_asset_url to parse latest.json once and return both the asset URL and
the version (e.g. output "url|version" or two newline-separated values), then
update the caller so ASSET_URL/ASSET_NAME are derived from resolve_asset_url's
first value and LATEST_VERSION from its second; update resolve_asset_url's
callers and error handling to preserve the existing exit codes/behavior.
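A minimal sketch of the parse-once shape (the JSON field layout and the LATEST_JSON/OS/ARCH variables here are assumptions; adapt to the fixture's real schema and the script's existing error handling):

# Sketch only: assumes latest.json looks like
# {"version": "...", "assets": {"<os>-<arch>": {"url": "..."}}}.
resolve_asset_url() {
  local manifest="$1" os="$2" arch="$3"
  python3 - "$manifest" "$os" "$arch" <<'PY'
import json, sys
manifest, os_name, arch = sys.argv[1:4]
data = json.load(open(manifest))
asset = data.get("assets", {}).get(f"{os_name}-{arch}")
if asset is None:
    sys.exit(3)  # keep the existing missing-platform exit code
print(asset["url"])      # line 1: asset URL
print(data["version"])   # line 2: version
PY
}

# Caller: one parse, two values.
resolved="$(resolve_asset_url "$LATEST_JSON" "$OS" "$ARCH")" || exit $?
ASSET_URL="$(printf '%s\n' "$resolved" | sed -n 1p)"
LATEST_VERSION="$(printf '%s\n' "$resolved" | sed -n 2p)"
ASSET_NAME="${ASSET_URL##*/}"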
In `@scripts/test_install.sh`:
- Around line 23-27: The test should assert the exact exit code 3 from
resolve_asset_url for a missing platform instead of just non-zero; run
resolve_asset_url "$FIXTURE" "linux" "aarch64", capture its exit status (e.g.,
ret=$?), and fail the test unless ret == 3, emitting a clear message if it
differs; reference the existing test invocation of resolve_asset_url and the
FIXTURE variable to locate where to replace the generic non-zero check with this
exact-status assertion.
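A sketch of the exact-status check (assumes exit code 3 is the resolver's missing-platform code, per the comment above):

# Temporarily drop -e so a non-zero status can be captured instead of aborting.
set +e
resolve_asset_url "$FIXTURE" "linux" "aarch64" >/dev/null 2>&1
ret=$?
set -e
if [[ "$ret" -ne 3 ]]; then
  echo "FAIL: expected exit code 3 for missing platform, got $ret" >&2
  exit 1
fi
echo "ok: missing platform exits 3"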
In `@src/core/jsonrpc.rs`:
- Around line 667-675: The Err arm of the async match over
crate::openhuman::config::Config::load_or_init() currently discards the error;
change Err(_) to capture the error (e.g., Err(err)) and include that error in
the log message emitted from the failing branch so the operator can see the
underlying cause. Update the log::warn! call that follows the failed load so it
appends the captured error (using a readable formatter such as {:?} or {}) while
preserving the existing "[core]" context and keep
bootstrap_controller_runtimes(&config.workspace_dir).await called only on
Ok(config).
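Roughly the shape being asked for (a sketch, assuming the match sits in the async context described above):

match crate::openhuman::config::Config::load_or_init() {
    Ok(config) => {
        bootstrap_controller_runtimes(&config.workspace_dir).await;
    }
    Err(err) => {
        // Capture and surface the cause instead of discarding it.
        log::warn!("[core] config load failed, skipping controller bootstrap: {err:?}");
    }
}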
In `@src/openhuman/agent/agents/morning_briefing/prompt.rs`:
- Line 64: The test fixture update simply sets curated_snapshot: None in the
PromptContext instance used by the morning_briefing agent; keep that change but
also consider (non-blocking) extracting a shared test helper function (e.g., a
test_support::baseline_prompt_context) that returns a baseline PromptContext to
avoid duplicated fixtures across agents (archivist, tool_maker, trigger_triage,
morning_briefing, critic, integrations_agent::ctx_with); for this PR just ensure
the curated_snapshot: None assignment is applied to the PromptContext used in
morning_briefing's tests and leave the refactor to a separate change.
In `@src/openhuman/agent/prompts/mod.rs`:
- Around line 949-980: inject_snapshot_content and inject_workspace_file_capped
duplicate the same char-bounded truncation, header and truncation-marker logic;
extract that shared logic into a small helper function (e.g.,
inject_labeled_block(prompt: &mut String, label: &str, content: &str, max_chars:
usize)) and have both inject_snapshot_content and inject_workspace_file_capped
call it. The helper should preserve current behavior: skip empty/whitespace
content, write the "### {label}\n" header, truncate by Unicode character count
using char_indices().nth(max_chars) to slice safely, append the truncated text,
and emit the same "\n\n[... truncated at {max_chars} chars — use `read` for full
file]\n" marker when truncated (otherwise append "\n\n"). Ensure existing
callers pass the same max_chars and that string trimming and newline behavior
remain identical.
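A sketch of the helper under the stated behavior (the function name comes from the comment; the exact header and marker strings should match the current call sites):

fn inject_labeled_block(prompt: &mut String, label: &str, content: &str, max_chars: usize) {
    // Skip empty/whitespace content entirely.
    if content.trim().is_empty() {
        return;
    }
    prompt.push_str(&format!("### {label}\n"));
    // char_indices().nth(max_chars) yields the byte offset of the first char
    // past the cap, so the slice below is always on a char boundary.
    match content.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => {
            prompt.push_str(&content[..byte_idx]);
            prompt.push_str(&format!(
                "\n\n[... truncated at {max_chars} chars — use `read` for full file]\n"
            ));
        }
        None => {
            prompt.push_str(content);
            prompt.push_str("\n\n");
        }
    }
}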
In `@src/openhuman/curated_memory/runtime.rs`:
- Around line 34-65: The bootstrap function uses unexplained magic numbers 2200
and 1375 when calling MemoryStore::open for MemoryFile::Memory and
MemoryFile::User; extract these into clearly named constants (e.g.,
MEMORY_CHAR_LIMIT, USER_CHAR_LIMIT) near the top of the module and replace the
numeric literals in bootstrap so the intent and reuse from tests/migrations are
clear; update any references to MemoryStore::open in this file to use the new
constants and ensure they are public if tests/migrations outside the module need
access.
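Something like (constant names from the comment; visibility depends on whether tests outside the module need them):

/// Char budget for MemoryFile::Memory (was the bare 2200 literal).
pub(crate) const MEMORY_CHAR_LIMIT: usize = 2200;
/// Char budget for MemoryFile::User (was the bare 1375 literal).
pub(crate) const USER_CHAR_LIMIT: usize = 1375;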
In `@src/openhuman/curated_memory/store.rs`:
- Around line 79-93: The replace() method's error message for exceeding
char_limit is terse; update the Err returned in replace() so it matches add() by
including the limit value and a verb (e.g. format!("char limit {} exceeded",
self.char_limit)); locate the Err created after the next.chars().count() >
self.char_limit check in replace() and replace the bare "char limit" string with
a formatted message using self.char_limit (and apply the same pattern to any
future remove() limit checks).
- Around line 14-56: MemoryStore::open is doing blocking std::fs I/O from async
callers (e.g., curated_memory::runtime::bootstrap) and also swallows
read_to_string errors; change open to be async (pub async fn
open(...)->std::io::Result<Self>) and replace blocking calls with tokio::fs
equivalents (tokio::fs::create_dir_all, tokio::fs::read_to_string,
tokio::fs::write) awaiting each call, keep the same migration/truncation logic
for seed, and update any callers (notably bootstrap) to await the new async
open; additionally, when the legacy read_to_string returns Err, emit log::warn!
including the error instead of silently dropping it, while preserving the rest
of the flow and existing symbols (open, bootstrap, and the
read/add/replace/remove async API consistency).
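A sketch of the async shape (field names and the exact seed-migration handling are assumptions; the point is tokio::fs plus a logged, non-fatal legacy-read error):

use std::path::PathBuf;

pub struct MemoryStore {
    path: PathBuf,
    char_limit: usize,
}

impl MemoryStore {
    pub async fn open(path: PathBuf, char_limit: usize) -> std::io::Result<Self> {
        if let Some(dir) = path.parent() {
            tokio::fs::create_dir_all(dir).await?;
        }
        match tokio::fs::read_to_string(&path).await {
            Ok(_seed) => {
                // existing migration/truncation of the seed stays here
            }
            Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
                tokio::fs::write(&path, "").await?;
            }
            Err(err) => {
                // Previously swallowed; now surfaced without aborting open.
                log::warn!("curated_memory: failed to read {}: {err}", path.display());
            }
        }
        Ok(Self { path, char_limit })
    }
}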
In `@src/openhuman/life_capture/embedder.rs`:
- Around line 22-26: The current code builds the reqwest client into `http`
using `reqwest::Client::builder()` but silently falls back to
`reqwest::Client::new()` via `unwrap_or_else(|_| ...)`, which yields a client
without the configured timeouts; change this to fail-fast instead: remove the
silent fallback on `.build()` and propagate or panic on error (e.g., replace
`unwrap_or_else(|_| reqwest::Client::new())` with an explicit `expect`/`?`-style
handling) so that `http` is never created without the specified timeouts and
startup fails clearly if the builder cannot construct the client.
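i.e., something like:

let http = reqwest::Client::builder()
    .timeout(std::time::Duration::from_secs(30))
    .connect_timeout(std::time::Duration::from_secs(10))
    .build()
    // Fail fast: never fall back to a client without the configured timeouts.
    .expect("failed to build reqwest client with timeouts");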
In `@src/openhuman/life_capture/index.rs`:
- Around line 169-178: The current code silently replaces serialization failures
for source, author_json, and metadata_json
(serde_json::to_value(item.source).ok().and_then(...).unwrap_or_default() and
the unwrap_or_else fallbacks) with empty/"null"/"{}" which can corrupt lookups
(the SELECT WHERE source = ?1 ... may match the wrong row); instead change the
code to propagate/return a serialization error when
serde_json::to_value(item.source) or serde_json::to_string(...) for
item.author/item.metadata fails (e.g., use the ? operator or map_err to convert
into the function's Result error type) so the operation fails fast and no silent
defaults are written. Ensure the variables named source, author_json, and
metadata_json are produced only after successful serialization and that callers
handle the Result accordingly.
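A sketch of the fail-fast variant inside the upsert loop (assumes the enclosing closure already returns anyhow::Result, as upsert's does):

let source = serde_json::to_value(item.source)
    .context("serialize item.source")?
    .as_str()
    .map(str::to_string)
    .ok_or_else(|| anyhow::anyhow!("item.source did not serialize to a string"))?;
let author_json = item
    .author
    .as_ref()
    .map(|a| serde_json::to_string(a).context("serialize item.author"))
    .transpose()?;
let metadata_json =
    serde_json::to_string(&item.metadata).context("serialize item.metadata")?;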
- Around line 571-616: Hybrid search currently oversamples unfiltered candidates
then drops many via apply_query_filters, causing under-filled results when
q.sources/q.since/q.until are restrictive; update the search to push those
filters into the candidate-side queries or adapt oversample. Specifically,
change keyword_search and vector_search to accept the query filters (e.g., pass
q or q.sources/q.since/q.until) and apply them in their SQL WHERE clauses so
oversample is applied to the filtered corpus; if you prefer the interim fix,
detect non-empty q.sources or time bounds in hybrid_search and increase
oversample (e.g., multiply by a factor or use an adaptive loop to grow
oversample until you have ≥q.k filtered hits) instead of only using (q.k *
3).max(20); adjust the call sites where keyword_search/vector_search are invoked
and remove redundant Rust-side scanning that discards filtered rows.
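For the interim (adaptive-oversample) option, the loop could look roughly like this (fetch_candidates and MAX_OVERSAMPLE are placeholders for the existing keyword/vector retrieval and a sanity cap):

let mut oversample = (q.k * 3).max(20);
let hits = loop {
    // Pull oversample candidates from the unfiltered keyword + vector blend.
    let candidates = fetch_candidates(oversample).await?;
    let exhausted = candidates.len() < oversample;
    let filtered = apply_query_filters(candidates, q);
    if filtered.len() >= q.k || exhausted || oversample >= MAX_OVERSAMPLE {
        break filtered;
    }
    // Restrictive q.sources/q.since/q.until discarded too much; widen and retry.
    oversample *= 2;
};
Ok(hits.into_iter().take(q.k).collect())

Pushing the filters into the SQL WHERE clauses remains the better long-term fix, since this loop re-runs both searches on every widening.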
In `@src/openhuman/life_capture/migrations.rs`:
- Around line 30-47: Replace the manual BEGIN/COMMIT/ROLLBACK and closure
pattern with rusqlite's unchecked_transaction() to mirror the rest of the
codebase: start an unchecked transaction via conn.unchecked_transaction(), run
the migration SQL and the INSERT into _life_capture_migrations using the
transaction handle (instead of conn.execute_batch/conn.execute), and then call
tx.commit()? on success so Drop-based rollback handles failures; update
references to the closure and explicit ROLLBACK/COMMIT logic around
conn.execute_batch and the inner INSERT to use the transaction API instead.
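i.e., roughly (migration_sql and the tracking-table column are stand-ins for the existing per-migration values):

let tx = conn.unchecked_transaction()?;
tx.execute_batch(migration_sql)?;
tx.execute(
    "INSERT INTO _life_capture_migrations(name) VALUES (?1)",
    rusqlite::params![name],
)?;
// Commit on success; if any `?` above bails, Drop rolls the transaction back.
tx.commit()?;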
In `@src/openhuman/life_capture/redact.rs`:
- Around line 34-45: Add a one-line clarifying comment to the redact function's
doc-comment explaining that the CC regex (static CC) is intentionally broad and
may match ISBN-13/GTIN or other long numeric identifiers (e.g.,
"978-3-16-148410-0"), and that false-positive redaction is preferred to leaking
PII; update the comment near the CC declaration or above pub fn redact to
mention this so future readers won't treat those matches as a bug.
In `@src/openhuman/life_capture/rpc.rs`:
- Around line 78-83: Change the handle_search signature to take a borrowed trait
object instead of a reference to an Arc: replace the parameter type "&Arc<dyn
Embedder>" with "&dyn Embedder" in the function signature (pub async fn
handle_search(..., embedder: &dyn Embedder, ...)). Update all call sites that
currently pass "&embedder" where embedder: Arc<dyn Embedder> to pass
"embedder.as_ref()" or "&*embedder" (or simply "embedder" if they already have
&Arc) so the function receives a &dyn Embedder; no other logic needs to change
inside handle_search since it only calls methods on the embedder trait.
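Sketch of the signature and a call site (other parameters elided; the return type is assumed to stay whatever it is today):

pub async fn handle_search(/* ..., */ embedder: &dyn Embedder /* , ... */) -> Result<Value, String> {
    // body unchanged: only Embedder trait methods are invoked
    todo!()
}

// At a call site holding embedder: Arc<dyn Embedder>:
// handle_search(/* ..., */ embedder.as_ref() /* , ... */).await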
In `@src/openhuman/life_capture/schemas.rs`:
- Around line 165-173: In handle_search, the expression
read_optional_u64(...).unwrap_or(10) as usize can overflow on 32-bit targets;
validate the optional u64 before casting to usize instead of relying on
rpc::handle_search's clamp. Change the logic in handle_search (where
read_optional_u64 is used) to: obtain the Option<u64>, use unwrap_or(10) to get
a u64, then convert via try_into() (or check value <= usize::MAX) and return an
Err string if it doesn't fit; pass the safely converted usize into
rpc::handle_search so the controller enforces valid bounds independent of
rpc::handle_search's k.clamp(1,100).
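i.e., something like (read_optional_u64's exact signature is assumed from its use here):

let k_u64 = read_optional_u64(&params, "k")?.unwrap_or(10);
let k: usize = k_u64
    .try_into()
    .map_err(|_| format!("k={k_u64} does not fit in usize on this target"))?;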
In `@tests/life_capture_retrieval_eval.rs`:
- Around line 308-343: Add a one-sentence clarification to the QueryMetrics
struct doc comment next to the recall_at_1 field (or the struct-level comment)
stating that for queries with more than one relevant document recall_at_1 is
bounded above by 1/|relevant| (so mean R@1 cannot reach 1.0 even if the top hit
is always relevant), and optionally note that MRR=1.00 implies top-hit relevance
while R@1 is constrained by the average number of relevant docs; update the
comment near QueryMetrics and the recall_at_1 reference accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dbaedfd5-9e5a-4435-bb3d-54cffc304752
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (65)
.github/workflows/installer-smoke.yml
AGENTS.md
Cargo.toml
docs/superpowers/plans/2026-04-22-life-capture-01-foundation.md
docs/superpowers/plans/2026-04-22-track1-ship-pipeline.md
docs/superpowers/specs/2026-04-22-life-capture-layer-design.md
scripts/fixtures/latest.json
scripts/install.sh
scripts/test_install.sh
src/core/all.rs
src/core/jsonrpc.rs
src/openhuman/agent/agents/archivist/prompt.rs
src/openhuman/agent/agents/code_executor/prompt.rs
src/openhuman/agent/agents/critic/prompt.rs
src/openhuman/agent/agents/help/prompt.rs
src/openhuman/agent/agents/integrations_agent/prompt.rs
src/openhuman/agent/agents/loader.rs
src/openhuman/agent/agents/morning_briefing/prompt.rs
src/openhuman/agent/agents/orchestrator/prompt.rs
src/openhuman/agent/agents/planner/prompt.rs
src/openhuman/agent/agents/researcher/prompt.rs
src/openhuman/agent/agents/summarizer/prompt.rs
src/openhuman/agent/agents/tool_maker/prompt.rs
src/openhuman/agent/agents/tools_agent/prompt.rs
src/openhuman/agent/agents/trigger_reactor/prompt.rs
src/openhuman/agent/agents/trigger_triage/prompt.rs
src/openhuman/agent/agents/welcome/prompt.rs
src/openhuman/agent/debug/mod.rs
src/openhuman/agent/harness/fork_context.rs
src/openhuman/agent/harness/session/builder.rs
src/openhuman/agent/harness/session/runtime.rs
src/openhuman/agent/harness/session/turn.rs
src/openhuman/agent/harness/session/types.rs
src/openhuman/agent/harness/subagent_runner/ops.rs
src/openhuman/agent/harness/subagent_runner/ops_tests.rs
src/openhuman/agent/prompts/mod.rs
src/openhuman/agent/prompts/mod_tests.rs
src/openhuman/agent/prompts/types.rs
src/openhuman/agent/triage/escalation.rs
src/openhuman/agent/triage/evaluator.rs
src/openhuman/curated_memory/mod.rs
src/openhuman/curated_memory/rpc.rs
src/openhuman/curated_memory/runtime.rs
src/openhuman/curated_memory/schemas.rs
src/openhuman/curated_memory/store.rs
src/openhuman/curated_memory/types.rs
src/openhuman/learning/prompt_sections.rs
src/openhuman/life_capture/embedder.rs
src/openhuman/life_capture/index.rs
src/openhuman/life_capture/migrations.rs
src/openhuman/life_capture/migrations/0001_init.sql
src/openhuman/life_capture/migrations/0002_vec.sql
src/openhuman/life_capture/mod.rs
src/openhuman/life_capture/quote_strip.rs
src/openhuman/life_capture/redact.rs
src/openhuman/life_capture/rpc.rs
src/openhuman/life_capture/runtime.rs
src/openhuman/life_capture/schemas.rs
src/openhuman/life_capture/tests/e2e.rs
src/openhuman/life_capture/tests/mod.rs
src/openhuman/life_capture/types.rs
src/openhuman/mod.rs
tests/agent_harness_public.rs
tests/fixtures/life_capture/corpus.json
tests/life_capture_retrieval_eval.rs
1. **Specify against the current codebase** — Ground the design in **existing** domains, controller/registry patterns, and JSON-RPC naming (`openhuman.<namespace>_<function>`). Reuse or extend documented flows in [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) and sibling guides; avoid parallel architectures.
- 2. **Implement in Rust** — Add domain logic under `src/openhuman/<domain>/`, wire **schemas + registered handlers** into the shared registry, and land **unit tests** in the crate (`cargo test -p openhuman`, focused modules) until the feature is correct in isolation.
+ 2. **Implement in Rust** — Add domain logic under `src/openhuman/<domain>/`, wire **schemas + registered handlers** into the shared registry, and land **unit tests** in the crate (`cargo test -p openhuman_core`, focused modules) until the feature is correct in isolation.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm whether any Cargo.toml in the repo declares a package named `openhuman_core`,
# vs. just the lib target name. Also list all workspace members and package names.
fd -t f 'Cargo.toml' --hidden --exclude target | while read -r f; do
echo "=== $f ==="
awk '
/^\[package\]/ {p=1; next}
/^\[/ {p=0}
p && /^[[:space:]]*name[[:space:]]*=/ {print "package.name:", $0}
' "$f"
awk '
/^\[lib\]/ {l=1; next}
/^\[/ {l=0}
l && /^[[:space:]]*name[[:space:]]*=/ {print "lib.name:", $0}
' "$f"
awk '
/^\[workspace\]/ {w=1; next}
/^\[/ {w=0}
w {print "workspace:", $0}
' "$f"
done

Repository: tinyhumansai/openhuman
Length of output: 247
Fix the incorrect cargo test command. Line 459 shows cargo test -p openhuman_core, but the repository only has packages named openhuman and OpenHuman. The -p flag requires a package name, not a lib target name. openhuman_core is the lib name within the openhuman package, so the command will fail with "package ID specification did not match any packages". Change to cargo test -p openhuman.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@AGENTS.md` at line 459, The README/guide currently instructs running "cargo
test -p openhuman_core" which is incorrect because openhuman_core is a lib
target, not a package; update the command to "cargo test -p openhuman" (or
remove -p entirely and use just "cargo test" if you prefer running tests for the
whole workspace) so the -p flag references the actual package name ("openhuman")
instead of the lib name ("openhuman_core"); change the occurrence of cargo test
-p openhuman_core to cargo test -p openhuman in the document text.
rusqlite = { version = "0.37", features = ["bundled"] }
r2d2 = "0.8"
r2d2_sqlite = "0.31"
sqlite-vec = "0.1"
🧩 Analysis chain
🌐 Web query:
What rusqlite version does r2d2_sqlite 0.31.0 depend on?
💡 Result:
r2d2_sqlite 0.31.0 depends on rusqlite ^0.37.
Citations:
🏁 Script executed:
#!/bin/bash
# Confirm only one rusqlite version ends up in the lock graph and that
# it matches across rusqlite + r2d2_sqlite re-export.
if [ -f Cargo.lock ]; then
echo "=== rusqlite versions in Cargo.lock ==="
awk '/^\[\[package\]\]/{p=0} /name = "rusqlite"/{p=1} p' Cargo.lock | head -40
echo "=== r2d2_sqlite version in Cargo.lock ==="
awk '/^\[\[package\]\]/{p=0} /name = "r2d2_sqlite"/{p=1} p' Cargo.lock | head -10
fi
# Check call-site for extension loading hooks (load_extension / sqlite_vec::load).
rg -nP --type=rust -C2 '\b(load_extension|sqlite_vec::load|SqliteConnectionManager|with_init)\b'

Repository: tinyhumansai/openhuman
Length of output: 2079
🏁 Script executed:
#!/bin/bash
# Get full init closure context (more lines)
sed -n '85,105p' src/openhuman/life_capture/index.rs
# Check if sqlite_vec::load is called anywhere
rg -n 'sqlite_vec::load' --type=rust
# Check feature flags in Cargo.toml for bundled and load_extension
rg -n 'features|load_extension|loadable_extension' Cargo.toml

Repository: tinyhumansai/openhuman
Length of output: 4159
🌐 Web query:
Does sqlite-vec require explicit sqlite_vec::load() call or does it auto-register when loaded as an extension?
💡 Result:
sqlite-vec requires an explicit load call to register its functions and virtual tables. It does not auto-register when loaded as an extension. For direct SQLite CLI use, execute .load ./vec0 (or the path to vec0.so/dll) before creating vec0 virtual tables. Official repo sample: .load ./vec0 followed by create virtual table ... using vec0(...). Sources 1,17.

In Python bindings (pip install sqlite-vec), call sqlite_vec.load(db) after db.enable_load_extension(True). Official Python docs explicitly state: "use the sqlite_vec.load function to load sqlite-vec SQL functions into a SQLite connection." Sources 2,10.

In Rust bindings (cargo add sqlite-vec), explicitly register via unsafe { sqlite3_auto_extension(Some(std::mem::transmute(sqlite3_vec_init as *const ()))); } using rusqlite's ffi. This uses SQLite's auto-extension mechanism but requires deliberate caller action (one-time global registration). No evidence of fully automatic registration without code. Official Rust example and docs confirm this explicit step. Sources 23,24.

SQLite extensions generally require explicit loading via .load, load_extension, or equivalent API calls; they do not auto-register on mere presence. sqlite-vec follows this, with language-specific wrappers (like sqlite_vec.load) providing convenience but still requiring invocation. Source 3. No sources indicate auto-registration without explicit load/register call.
Citations:
- 1: https://github.com/asg017/sqlite-vec
- 2: https://github.com/asg017/sqlite-vec/blob/main/site/using/python.md
- 3: https://sqlite.org/loadext.html
- 4: https://dev.to/aairom/embedded-intelligence-how-sqlite-vec-delivers-fast-local-vector-search-for-ai-3dpb
- 5: https://github.com/asg017/sqlite-vec/blob/main/tests/test-loadable.py
- 6: https://github.com/asg017/sqlite-vec/blob/main/examples/simple-rust/demo.rs
- 7: https://alexgarcia.xyz/sqlite-vec/rust.html
🌐 Web query:
Does rusqlite bundled feature support load_extension for dynamically loading SQLite extensions?
💡 Result:
Yes, the rusqlite "bundled" feature supports load_extension for dynamically loading SQLite extensions.
Citations:
- 1: https://docs.rs/crate/rusqlite/0.26.1/features
- 2: https://docs.rs/crate/rusqlite/latest
- 3: https://stackoverflow.com/questions/72987497/bus-error-on-usage-of-rusqlite-with-spatialite-extension
- 4: https://docs.rs/rusqlite/latest/rusqlite/struct.LoadExtensionGuard.html
- 5: https://lib.rs/crates/rusqlite/features
- 6: https://github.com/rusqlite/rusqlite/blob/master/examples/load_extension.rs
The sqlite-vec extension is not being loaded; vector queries will fail at runtime.
r2d2_sqlite = 0.31 correctly pairs with rusqlite = 0.37 (verified), and the bundled feature supports load_extension. However, the init closure at lines 89–91 only calls c.pragma_update(None, "query_only", "ON") and does not call sqlite_vec::load(&conn). The comment suggesting the extension "registers on the connection at open time" is incorrect—sqlite-vec requires an explicit load call and will not auto-register. Any attempt to query vector tables will fail at runtime because the extension is not available on any connection from the pool. Add sqlite_vec::load(&conn)?; to the init closure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Cargo.toml` around lines 58 - 61, The init closure used to initialize the
SQLite pool only sets PRAGMA via c.pragma_update (the closure that configures
connections) but never loads the sqlite-vec extension; add a call to
sqlite_vec::load(&conn)? inside that same init closure (where
c.pragma_update(None, "query_only", "ON") is invoked) so each new connection has
the extension registered (use the existing conn variable and propagate the error
as needed); this ensures sqlite_vec::load is called for r2d2_sqlite/rusqlite
pooled connections before they are returned.
set -euo pipefail

# Allow tests to source this file without executing the install flow.
SOURCE_ONLY=0
for _arg in "$@"; do
  if [[ "$_arg" == "--source-only" ]]; then
    SOURCE_ONLY=1
  fi
done
set -euo pipefail leaks to the caller when this file is sourced.
The --source-only guard at lines 8-14/361-364 only short-circuits the install body, but set -euo pipefail on line 6 already mutated the parent shell's options by the time the early return runs. A test that sources install.sh --source-only will silently inherit errexit/nounset/pipefail, which can change the behavior of subsequent test code (e.g. unset vars become fatal).
Consider either gating the set on SOURCE_ONLY or saving/restoring the prior shell options.
🛡️ Sketch of the gate
-set -euo pipefail
-
-# Allow tests to source this file without executing the install flow.
-SOURCE_ONLY=0
-for _arg in "$@"; do
- if [[ "$_arg" == "--source-only" ]]; then
- SOURCE_ONLY=1
- fi
-done
+# Allow tests to source this file without executing the install flow.
+SOURCE_ONLY=0
+for _arg in "$@"; do
+ if [[ "$_arg" == "--source-only" ]]; then
+ SOURCE_ONLY=1
+ fi
+done
+if [[ "${SOURCE_ONLY}" != "1" ]]; then
+  set -euo pipefail
+fi

Also applies to: 361-364
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/install.sh` around lines 6 - 14, The top-level "set -euo pipefail"
leaks into the caller when this script is sourced; change the install guard so
that either (a) do not apply "set -euo pipefail" until after you check
SOURCE_ONLY (the --source-only loop) and only enable it when SOURCE_ONLY is not
set, or (b) save the caller's shell options (e.g., via "old_opts=$(set +o)" or
"old_status=$(set -o)" style) before applying "set -euo pipefail" and restore
them before returning; update the code around the SOURCE_ONLY check and the
install-body entry points (references: the SOURCE_ONLY variable, the
--source-only argument handling loop, and the top-level set -euo pipefail) so
sourcing with --source-only does not mutate the parent shell options.
use crate::openhuman::providers::{
    ChatRequest as PChatRequest, ChatResponse, Provider, ToolCall,
};
cargo fmt --check fails — collapse to single-line use.
The reformatted multi-line use fits within rustfmt's line-width limit, so rustfmt collapses it back onto one line; CI Type Check is currently failing on this. The same fix is needed at the make_parent signature below (Line 211-214). As per coding guidelines: "Run cargo fmt and cargo check on all Rust code changes before merging".
🛠️ Proposed fix
-use crate::openhuman::providers::{
- ChatRequest as PChatRequest, ChatResponse, Provider, ToolCall,
-};
+use crate::openhuman::providers::{ChatRequest as PChatRequest, ChatResponse, Provider, ToolCall};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
use crate::openhuman::providers::{ChatRequest as PChatRequest, ChatResponse, Provider, ToolCall};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/harness/subagent_runner/ops_tests.rs` around lines 123 -
125, The multi-line use declaration and the multi-line function signature cause
rustfmt failures; change the multi-line use of crate::openhuman::providers to a
single-line import (use crate::openhuman::providers::{ChatRequest as
PChatRequest, ChatResponse, Provider, ToolCall};) and collapse the make_parent
function signature (the make_parent definition spanning lines 211–214) into a
single line signature to match rustfmt expectations; update only the formatting,
keeping identifiers and types unchanged.
fn make_parent(
    provider: Arc<dyn Provider>,
    tools: Vec<Box<dyn Tool>>,
) -> ParentExecutionContext {
cargo fmt --check fails — collapse make_parent signature.
Same root cause as the use block above; rustfmt fits this signature on one line.
🛠️ Proposed fix
-fn make_parent(
- provider: Arc<dyn Provider>,
- tools: Vec<Box<dyn Tool>>,
-) -> ParentExecutionContext {
+fn make_parent(provider: Arc<dyn Provider>, tools: Vec<Box<dyn Tool>>) -> ParentExecutionContext {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
fn make_parent(provider: Arc<dyn Provider>, tools: Vec<Box<dyn Tool>>) -> ParentExecutionContext {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/harness/subagent_runner/ops_tests.rs` around lines 211 -
214, The function signature for make_parent is currently split across multiple
lines causing rustfmt failure; collapse the signature to a single line so
rustfmt accepts it — update the declaration of fn make_parent to place provider:
Arc<dyn Provider>, tools: Vec<Box<dyn Tool>> and the return type ->
ParentExecutionContext all on one line (keeping the existing body unchanged) and
ensure the symbols make_parent, Provider, Tool, and ParentExecutionContext are
used exactly as in the file.
pub async fn handle_add(
    rt: &CuratedMemoryRuntime,
    file: String,
    entry: String,
) -> Result<RpcOutcome<Value>, String> {
    let store = pick(&file, rt)?;
    store.add(&entry).await.map_err(|e| format!("add: {e}"))?;
    Ok(RpcOutcome::new(json!({ "file": file, "ok": true }), vec![]))
}
Minor: handle_add should reject empty entry.
store.add("") will append ENTRY_SEP to a non-empty file, leaving a stray empty entry between separators. That entry can never be matched by remove(needle) (every non-empty needle "doesn't contain" it) and bloats the file toward char_limit. Aligns with how replace/remove already validate non-empty needle.
🛡️ Proposed validation
 pub async fn handle_add(
     rt: &CuratedMemoryRuntime,
     file: String,
     entry: String,
 ) -> Result<RpcOutcome<Value>, String> {
+    if entry.trim().is_empty() {
+        return Err("entry must not be empty".into());
+    }
     let store = pick(&file, rt)?;
     store.add(&entry).await.map_err(|e| format!("add: {e}"))?;
     Ok(RpcOutcome::new(json!({ "file": file, "ok": true }), vec![]))
 }

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
pub async fn handle_add(
    rt: &CuratedMemoryRuntime,
    file: String,
    entry: String,
) -> Result<RpcOutcome<Value>, String> {
    if entry.trim().is_empty() {
        return Err("entry must not be empty".into());
    }
    let store = pick(&file, rt)?;
    store.add(&entry).await.map_err(|e| format!("add: {e}"))?;
    Ok(RpcOutcome::new(json!({ "file": file, "ok": true }), vec![]))
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/curated_memory/rpc.rs` around lines 31 - 39, The handle_add
function must reject empty entries: before calling pick(&file, rt) and
store.add(&entry).await, check that entry is not empty (e.g., entry.is_empty()
or entry.trim().is_empty() per project convention) and return an Err(String)
with a clear message (consistent with other handlers like remove/replace, e.g.,
format!("add: empty entry") or "entry must be non-empty") instead of appending a
stray separator; only call store.add when the entry is non-empty.
#[derive(Clone)]
pub struct HostedEmbedder {
    base_url: String,
    api_key: String,
    model: String,
    http: reqwest::Client,
}

impl HostedEmbedder {
    pub fn new(base_url: String, api_key: String, model: String) -> Self {
        let http = reqwest::Client::builder()
            .timeout(std::time::Duration::from_secs(30))
            .connect_timeout(std::time::Duration::from_secs(10))
            .build()
            .unwrap_or_else(|_| reqwest::Client::new());
        Self {
            base_url,
            api_key,
            model,
            http,
        }
    }
}
Hard-coded dim() = 1536 is a footgun when model is configurable.
HostedEmbedder::new(...) accepts an arbitrary model string (and runtime.rs reads it from OPENHUMAN_EMBEDDINGS_MODEL), but dim() is hard-coded. If a user sets the env var to e.g. text-embedding-3-large (3072) or text-embedding-ada-002 (1536), the dim() answer is decoupled from reality. The pre-flight check in rpc::handle_search (embedder.dim() != INDEX_VECTOR_DIM) will silently pass even when the actual returned vector won't fit the index, and the failure will surface only mid-request with the less-actionable "embedding at index 0 has dim 3072 (expected 1536)" error from embed_batch.
Either pin the model to one whose dim is known to match the index, or carry the dim as configuration:
♻️ Proposed fix — accept dim in the constructor
 pub struct HostedEmbedder {
     base_url: String,
     api_key: String,
     model: String,
+    dim: usize,
     http: reqwest::Client,
 }

 impl HostedEmbedder {
-    pub fn new(base_url: String, api_key: String, model: String) -> Self {
+    pub fn new(base_url: String, api_key: String, model: String, dim: usize) -> Self {
         let http = reqwest::Client::builder()
             .timeout(std::time::Duration::from_secs(30))
             .connect_timeout(std::time::Duration::from_secs(10))
             .build()
             .unwrap_or_else(|_| reqwest::Client::new());
         Self {
             base_url,
             api_key,
             model,
+            dim,
             http,
         }
     }
 }
@@
-    fn dim(&self) -> usize {
-        1536
-    }
+    fn dim(&self) -> usize {
+        self.dim
+    }

runtime.rs would then need a small model→dim resolver (or a third env var).
Also applies to: 106-108
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/life_capture/embedder.rs` around lines 12 - 34, HostedEmbedder
currently hard-codes dim() to 1536 while new() accepts any model string, causing
mismatch with INDEX_VECTOR_DIM; update HostedEmbedder to carry the embedding
dimension as configuration (add a dim: usize field and expose a dim() -> usize
method), change HostedEmbedder::new(...) to accept a dim parameter (or a
Model→dim resolver in runtime.rs that maps OPENHUMAN_EMBEDDINGS_MODEL to the
correct dimension), and update sites that construct HostedEmbedder (runtime.rs)
and any checks in rpc::handle_search to use the new dim field so the pre-flight
comparison against INDEX_VECTOR_DIM is accurate; ensure embed_batch callers
still work with the actual returned dimension.
use anyhow::Context;
use once_cell::sync::OnceCell;
use r2d2_sqlite::SqliteConnectionManager;
use rusqlite::{ffi, Connection, OpenFlags, OptionalExtension};
use std::path::Path;
use std::sync::Arc;
use tokio::sync::Mutex;
use tracing::{debug, trace};

/// r2d2 pool of read-only SQLite connections. Used for file-backed indexes
/// in production so WAL actually buys us concurrent readers; in-memory test
/// indexes keep the single-connection layout since a shared-cache URI adds
/// ceremony for no throughput win at test-fixture scale.
pub(crate) type ReaderPool = r2d2::Pool<SqliteConnectionManager>;

/// Personal index — SQLite database with FTS5 + sqlite-vec virtual tables loaded.
///
/// Internally split into a dedicated writer connection (serialised behind
/// `tokio::sync::Mutex`) and an optional pool of read-only connections. Writes
/// go through the writer (single-writer SQLite model); searches use the pool
/// when present so concurrent RPC callers don't block on each other or on a
/// long-running ingest. In-memory handles leave the pool `None` — both
/// readers and writers share the single connection, matching the pre-pool
/// behaviour tests depend on.
pub struct PersonalIndex {
    /// Writer connection — always present. Also used as the sole reader when
    /// `reader_pool` is `None` (in-memory handles).
    pub writer: Arc<Mutex<Connection>>,
    /// Optional read-only connection pool. `Some` for file-backed indexes;
    /// `None` for in-memory ones so the writer's schema is visible to readers
    /// without a shared-cache URI.
    pub(crate) reader_pool: Option<Arc<ReaderPool>>,
}

static VEC_REGISTERED: OnceCell<()> = OnceCell::new();

/// Register `sqlite3_vec_init` as a SQLite auto-extension exactly once per process.
/// Every connection opened after this point loads the vec0 module automatically.
pub(crate) fn ensure_vec_extension_registered() {
    VEC_REGISTERED.get_or_init(|| unsafe {
        let init: unsafe extern "C" fn() = sqlite_vec::sqlite3_vec_init;
        let entry: unsafe extern "C" fn(
            *mut ffi::sqlite3,
            *mut *mut std::os::raw::c_char,
            *const ffi::sqlite3_api_routines,
        ) -> std::os::raw::c_int = std::mem::transmute(init as *const ());
        let rc = ffi::sqlite3_auto_extension(Some(entry));
        if rc != ffi::SQLITE_OK {
            panic!("sqlite3_auto_extension(sqlite_vec) failed: rc={rc}");
        }
    });
}

/// Default pool size — small enough to stay within SQLite's default
/// connection fan-out without heroics, large enough that a handful of
/// concurrent RPC searches don't queue up behind each other while an
/// ingest holds the writer. Override is deliberately not exposed: the
/// numbers here are only loosely tuned and are best treated as an
/// implementation detail, not a knob.
const READER_POOL_SIZE: u32 = 4;

impl PersonalIndex {
    /// Open (or create) the personal index at `path`. Loads sqlite-vec, runs
    /// migrations on the writer, then spins up a read-only r2d2 pool against
    /// the same file so searches can run concurrently with an in-flight
    /// write.
    pub async fn open(path: &Path) -> anyhow::Result<Self> {
        ensure_vec_extension_registered();
        let path_buf = path.to_path_buf();
        let (writer, pool) =
            tokio::task::spawn_blocking(move || -> anyhow::Result<(Connection, ReaderPool)> {
                // Writer: exclusive RW connection, WAL journal so readers
                // don't block on it, migrations run here.
                let writer = Connection::open(&path_buf).context("open sqlite writer")?;
                writer.pragma_update(None, "journal_mode", "WAL")?;
                writer.pragma_update(None, "foreign_keys", "ON")?;
                super::migrations::run(&writer).context("run life_capture migrations")?;

                // Reader pool: read-only connections on the same file. Each
                // pooled connection gets `query_only = 1` as a belt-and-
                // suspenders guard so a mis-routed write here fails loudly
                // instead of racing the writer. Foreign-key enforcement is
                // writer-only in SQLite, so readers skip it.
                //
                // `SQLITE_OPEN_READ_ONLY` alone would also work, but keeping
                // the connection RW + `query_only` lets sqlite-vec register
                // on the connection at open time (the extension's own init
                // writes to the sqlite_sequence table on first use).
                let manager = SqliteConnectionManager::file(&path_buf).with_init(|c| {
                    c.pragma_update(None, "query_only", "ON")?;
                    Ok(())
                });
                let pool = r2d2::Pool::builder()
                    .max_size(READER_POOL_SIZE)
                    .build(manager)
                    .context("build reader pool")?;
                Ok((writer, pool))
            })
            .await
            .context("open task panicked")??;
        Ok(Self {
            writer: Arc::new(Mutex::new(writer)),
            reader_pool: Some(Arc::new(pool)),
        })
    }

    /// Open an in-memory index (for tests). Single connection shared between
    /// reader and writer — keeps the fixture path free of shared-cache URI
    /// ceremony that buys no real concurrency at test-fixture scale.
    pub async fn open_in_memory() -> anyhow::Result<Self> {
        ensure_vec_extension_registered();
        let conn = tokio::task::spawn_blocking(|| -> anyhow::Result<Connection> {
            let conn = Connection::open_with_flags(
                ":memory:",
                OpenFlags::SQLITE_OPEN_READ_WRITE
                    | OpenFlags::SQLITE_OPEN_CREATE
                    | OpenFlags::SQLITE_OPEN_NO_MUTEX,
            )
            .context("open in-memory sqlite")?;
            conn.pragma_update(None, "foreign_keys", "ON")?;
            super::migrations::run(&conn).context("run life_capture migrations")?;
            Ok(conn)
        })
        .await
        .context("open task panicked")??;
        Ok(Self {
            writer: Arc::new(Mutex::new(conn)),
            reader_pool: None,
        })
    }
}

/// Writer for the personal index. Upserts items keyed by (source, external_id)
/// so re-ingesting the same source row updates in place; vectors are written
/// separately via `upsert_vector` since embeddings come from a remote call.
pub struct IndexWriter {
    conn: Arc<Mutex<Connection>>,
}

impl IndexWriter {
    pub fn new(idx: &PersonalIndex) -> Self {
        Self {
            conn: Arc::clone(&idx.writer),
        }
    }

    /// Upsert items by (source, external_id). FTS rows are kept in sync via
    /// the SQL triggers defined in 0001_init.sql.
    ///
    /// Mutates `items[i].id` to the canonical (existing) UUID when a row with
    /// the same (source, external_id) already exists — so callers can write the
    /// vector under the correct id without orphaning the previous one. Cascades
    /// `item_vectors` deletes for any caller-supplied id that loses the race.
    pub async fn upsert(
        &self,
        items: &mut [crate::openhuman::life_capture::types::Item],
    ) -> anyhow::Result<()> {
        if items.is_empty() {
            return Ok(());
        }
        // Clone across the blocking boundary, write canonical ids back below.
        let mut owned = items.to_vec();
        let conn = Arc::clone(&self.conn);
        let result: anyhow::Result<Vec<crate::openhuman::life_capture::types::Item>> =
            tokio::task::spawn_blocking(move || {
                let mut guard = conn.blocking_lock();
                let tx = guard.transaction().context("begin upsert tx")?;
                for item in owned.iter_mut() {
                    let source = serde_json::to_value(item.source)
                        .ok()
                        .and_then(|v| v.as_str().map(str::to_string))
                        .unwrap_or_default();
                    let author_json = item
                        .author
                        .as_ref()
                        .map(|a| serde_json::to_string(a).unwrap_or_else(|_| "null".into()));
                    let metadata_json = serde_json::to_string(&item.metadata)
                        .unwrap_or_else(|_| "{}".into());
                    let new_id = item.id.to_string();

                    // Look up the existing canonical id, if any.
                    let existing_id: Option<String> = tx
                        .query_row(
                            "SELECT id FROM items WHERE source = ?1 AND external_id = ?2",
                            rusqlite::params![source, item.external_id],
                            |row| row.get(0),
                        )
                        .optional()
                        .context("lookup existing item by (source, external_id)")?;

                    match existing_id {
                        Some(canonical) => {
                            // Existing row wins. If the caller supplied a
                            // different id, drop any vector they may have
                            // already written under that wrong id.
                            if canonical != new_id {
                                tx.execute(
                                    "DELETE FROM item_vectors WHERE item_id = ?1",
                                    rusqlite::params![new_id],
                                )
                                .context("orphan-clean stale vector by caller id")?;
                            }
                            tx.execute(
                                "UPDATE items \
                                 SET ts = ?1, author_json = ?2, subject = ?3, \
                                 text = ?4, metadata_json = ?5 \
                                 WHERE id = ?6",
                                rusqlite::params![
                                    item.ts.timestamp(),
                                    author_json,
                                    item.subject.as_deref(),
                                    item.text,
                                    metadata_json,
                                    canonical,
                                ],
                            )
                            .context("update existing item row")?;
                            // Mutate the caller's item so subsequent
                            // upsert_vector writes under the canonical id.
                            item.id = uuid::Uuid::parse_str(&canonical)
                                .context("existing items.id is not a valid uuid")?;
                        }
                        None => {
                            tx.execute(
                                "INSERT INTO items(id, source, external_id, ts, author_json, subject, text, metadata_json) \
                                 VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8)",
                                rusqlite::params![
                                    new_id,
                                    source,
                                    item.external_id,
                                    item.ts.timestamp(),
                                    author_json,
                                    item.subject.as_deref(),
                                    item.text,
                                    metadata_json,
                                ],
                            )
                            .context("insert new item row")?;
                        }
                    }
                }
                tx.commit().context("commit upsert tx")?;
                Ok(owned)
            })
            .await
            .context("upsert task panicked")?;

        let updated = result?;
        // Move canonical ids back into the caller's slice.
        for (slot, item) in items.iter_mut().zip(updated.into_iter()) {
            *slot = item;
        }
        Ok(())
    }

    /// Replace the vector for an item atomically: DELETE + INSERT inside a
    /// single transaction so a failed INSERT doesn't permanently remove the
    /// item's vector. (vec0 doesn't support ON CONFLICT on its primary key.)
    pub async fn upsert_vector(&self, item_id: &uuid::Uuid, vector: &[f32]) -> anyhow::Result<()> {
        let id = item_id.to_string();
        let v_json = serde_json::to_string(vector).context("serialize vector")?;
        let conn = Arc::clone(&self.conn);
        tokio::task::spawn_blocking(move || -> anyhow::Result<()> {
            let mut guard = conn.blocking_lock();
            let tx = guard.transaction().context("begin upsert_vector tx")?;
            let exists: i64 = tx
                .query_row(
                    "SELECT COUNT(1) FROM items WHERE id = ?1",
                    rusqlite::params![id],
                    |row| row.get(0),
                )
                .context("check item exists")?;
            if exists == 0 {
                anyhow::bail!("cannot upsert vector for unknown item {id}");
            }
            tx.execute(
                "DELETE FROM item_vectors WHERE item_id = ?1",
                rusqlite::params![id],
            )
            .context("delete prior vector")?;
            tx.execute(
                "INSERT INTO item_vectors(item_id, embedding) VALUES (?1, ?2)",
                rusqlite::params![id, v_json],
            )
            .context("insert vector")?;
            tx.commit().context("commit upsert_vector tx")?;
            Ok(())
        })
        .await
        .context("upsert_vector task panicked")?
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn vec_extension_loads_and_reports_version() {
        let idx = PersonalIndex::open_in_memory().await.expect("open");
        let conn = idx.writer.lock().await;
        let version: String = conn
            .query_row("SELECT vec_version()", [], |row| row.get(0))
            .expect("vec_version");
        assert!(
            version.starts_with('v'),
            "unexpected vec_version: {version}"
        );
    }

    #[tokio::test]
    async fn vec0_table_accepts_insert_and_returns_match() {
        let idx = PersonalIndex::open_in_memory().await.expect("open");
        let conn = idx.writer.lock().await;

        let v: Vec<f32> = (0..1536).map(|i| (i as f32) * 0.001).collect();
        let v_json = serde_json::to_string(&v).unwrap();

        conn.execute(
            "INSERT INTO item_vectors(item_id, embedding) VALUES (?1, ?2)",
            rusqlite::params!["00000000-0000-0000-0000-000000000001", v_json],
        )
        .expect("insert vec");

        let id: String = conn
            .query_row(
                "SELECT item_id FROM item_vectors \
                 WHERE embedding MATCH ?1 \
                 ORDER BY distance LIMIT 1",
                rusqlite::params![v_json],
                |row| row.get(0),
            )
            .expect("vec MATCH query");
        assert_eq!(id, "00000000-0000-0000-0000-000000000001");
    }
}
🛠️ Refactor suggestion | 🟠 Major
File exceeds the ~500-line module-size guideline (≈927 lines).
This module bundles PersonalIndex setup, the writer, the reader, FTS sanitization, query post-filtering, and four test modules in one file. Splitting reduces cognitive load and keeps each unit sharply focused.
Suggested layout under src/openhuman/life_capture/:
- `index/mod.rs` — re-exports + `PersonalIndex`, `ensure_vec_extension_registered`, `ReaderPool`, pool sizing.
- `index/writer.rs` — `IndexWriter` + writer-side helpers (the `Source` serialization, `serde_json` fallbacks).
- `index/reader.rs` — `IndexReader`, `ItemRow`, `with_read_conn`, `fts5_quote`, `apply_query_filters`.
- Existing `#[cfg(test)] mod` blocks move next to the code they exercise.
As per coding guidelines: "Rust source files should be kept to approximately ≤500 lines per file; split modules when growing beyond this size".
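A sketch of what index/mod.rs could re-export after the split so existing use paths keep compiling (module-internal visibility is a judgment call):

// src/openhuman/life_capture/index/mod.rs
mod reader;
mod writer;

pub use reader::IndexReader;
pub use writer::IndexWriter;
pub(crate) use reader::{apply_query_filters, fts5_quote, with_read_conn, ItemRow};

pub(crate) type ReaderPool = r2d2::Pool<r2d2_sqlite::SqliteConnectionManager>;
const READER_POOL_SIZE: u32 = 4;

pub struct PersonalIndex {
    /* writer + reader_pool fields unchanged */
}

pub(crate) fn ensure_vec_extension_registered() {
    /* unchanged, stays next to VEC_REGISTERED */
}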
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/life_capture/index.rs` around lines 1 - 336, The file is too
large and should be split into focused modules: keep the public types and
process-wide init in index/mod.rs (PersonalIndex,
ensure_vec_extension_registered, ReaderPool, READER_POOL_SIZE), move writer
logic into index/writer.rs (IndexWriter, its upsert and upsert_vector, and any
writer-side helpers like Source/serde_json fallbacks), move read/query logic
into index/reader.rs (IndexReader, ItemRow, with_read_conn, fts5_quote,
apply_query_filters and reader helpers), and relocate each #[cfg(test)] mod next
to the code it tests; update mod declarations and use statements accordingly so
all referenced symbols (PersonalIndex, IndexWriter, IndexReader, ItemRow,
with_read_conn, fts5_quote, apply_query_filters,
VEC_REGISTERED/ensure_vec_extension_registered) remain accessible.
let api_key = std::env::var("OPENHUMAN_EMBEDDINGS_KEY")
    .or_else(|_| std::env::var("OPENAI_API_KEY"))
    .ok();
let Some(api_key) = api_key else {
    log::info!(
        "[life_capture] embedder not configured (set OPENAI_API_KEY or \
         OPENHUMAN_EMBEDDINGS_KEY); life_capture.search will return \
         'embedder not configured', life_capture.get_stats remains available"
    );
    return;
};
Minor: empty OPENHUMAN_EMBEDDINGS_KEY shadows OPENAI_API_KEY.
std::env::var(...) returns Ok("") when the variable is set but empty, so .or_else doesn't trigger. A user with OPENHUMAN_EMBEDDINGS_KEY= (e.g. exported with no value in a shell init) and a valid OPENAI_API_KEY will hit the auth-failure path instead of falling back. Filter out empty strings to make precedence intuitive:
🛡️ Proposed change
-    let api_key = std::env::var("OPENHUMAN_EMBEDDINGS_KEY")
-        .or_else(|_| std::env::var("OPENAI_API_KEY"))
-        .ok();
+    let api_key = std::env::var("OPENHUMAN_EMBEDDINGS_KEY")
+        .ok()
+        .filter(|s| !s.is_empty())
+        .or_else(|| std::env::var("OPENAI_API_KEY").ok())
+        .filter(|s| !s.is_empty());

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
let api_key = std::env::var("OPENHUMAN_EMBEDDINGS_KEY")
    .ok()
    .filter(|s| !s.is_empty())
    .or_else(|| std::env::var("OPENAI_API_KEY").ok())
    .filter(|s| !s.is_empty());
let Some(api_key) = api_key else {
    log::info!(
        "[life_capture] embedder not configured (set OPENAI_API_KEY or \
         OPENHUMAN_EMBEDDINGS_KEY); life_capture.search will return \
         'embedder not configured', life_capture.get_stats remains available"
    );
    return;
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/life_capture/runtime.rs` around lines 96 - 106, The current env
lookup treats an empty OPENHUMAN_EMBEDDINGS_KEY as a valid value and thus never
falls back to OPENAI_API_KEY; update the retrieval so empty strings are treated
as unset: call std::env::var("OPENHUMAN_EMBEDDINGS_KEY") and
std::env::var("OPENAI_API_KEY") but filter out empty results (e.g. map to Option
and .filter(|s| !s.is_empty()) or convert empty Ok("") into None) before
applying the precedence logic that assigns to the local api_key and the
subsequent let Some(api_key) = api_key else { ... } block so a blank
OPENHUMAN_EMBEDDINGS_KEY will correctly fall back to OPENAI_API_KEY.
async fn run_query(reader: &IndexReader, q: &FixtureQuery) -> Result<Vec<String>> {
    let hits = match q.kind {
        QueryKind::Keyword => reader
            .keyword_search(&q.text, q.k)
            .await
            .with_context(|| format!("keyword_search failed for {}", q.id))?,
        QueryKind::Semantic => {
            let v = embed(&q.text);
            reader
                .vector_search(&v, q.k)
                .await
                .with_context(|| format!("vector_search failed for {}", q.id))?
        }
        QueryKind::Mixed => {
            let v = embed(&q.text);
            let query = Query {
                text: q.text.clone(),
                k: q.k,
                sources: q.sources.clone(),
                since: q.since,
                until: q.until,
            };
            reader
                .hybrid_search(&query, &v)
                .await
                .with_context(|| format!("hybrid_search failed for {}", q.id))?
        }
    };
    Ok(hit_ext_ids(&hits))
}
R@5 collapses to R@q.k when the query asks for fewer than 5 hits.
run_query only retrieves q.k results, and found_at clamps with n = k.min(hits_ext.len()). So for any fixture query with k < 5, the reported recall_at_5 is identical to recall_at_k — positions 4-5 are never inspected, even when more relevant items exist there. Today the baseline survives (R@5 > R@3 → at least some queries set k≥5), but as new queries are added with smaller k the per-kind/MEAN R@5 numbers will silently understate recall.
Decoupling the retrieval cutoff from the metric cutoff fixes this without changing the fixture:
♻️ Suggested fix in run_query
 async fn run_query(reader: &IndexReader, q: &FixtureQuery) -> Result<Vec<String>> {
+    // Always retrieve enough to score the largest reported R@k, regardless of q.k.
+    let retrieve_k = q.k.max(5);
     let hits = match q.kind {
         QueryKind::Keyword => reader
-            .keyword_search(&q.text, q.k)
+            .keyword_search(&q.text, retrieve_k)
             .await
             .with_context(|| format!("keyword_search failed for {}", q.id))?,
         QueryKind::Semantic => {
             let v = embed(&q.text);
             reader
-                .vector_search(&v, q.k)
+                .vector_search(&v, retrieve_k)
                 .await
                 .with_context(|| format!("vector_search failed for {}", q.id))?
         }
         QueryKind::Mixed => {
             let v = embed(&q.text);
             let query = Query {
                 text: q.text.clone(),
-                k: q.k,
+                k: retrieve_k,
                 sources: q.sources.clone(),
                 since: q.since,
                 until: q.until,
             };
             reader
                 .hybrid_search(&query, &v)
                 .await
                 .with_context(|| format!("hybrid_search failed for {}", q.id))?
         }
     };
     Ok(hit_ext_ids(&hits))
 }

Note that this changes what must_contain_in_top and must_not_contain_in_top see — they should keep using q.k (already do, via q.must_contain_in_top.unwrap_or(q.k)), so the assertions remain query-budget-faithful while the metrics get an honest top-5.
Also applies to: 318-343
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/life_capture_retrieval_eval.rs` around lines 197 - 226, run_query
currently requests only q.k hits so R@5 collapses to R@q.k for queries with k<5;
change the search call to request a fetch_k = q.k.max(5) (use fetch_k when
calling reader.keyword_search, vector_search and hybrid_search) and still return
hit_ext_ids(&hits) (leave any budget-aware assertions like must_contain_in_top
using q.k unchanged). Apply the same change in the other occurrence identified
around lines 318-343 (use fetch_k = q.k.max(5) for retrievals, but keep
downstream checks that rely on q.k).
Summary
Stacks on #817. Adds graded relevance labels + recall@k/MRR aggregator on top of the eval scaffolding that #817 introduced.
- `tests/fixtures/life_capture/corpus.json` with `relevant: [{ext_id, grade}]` arrays. (q-src-01 stays pending, q-neg-01 is purely negative-assertion.)
- `score_query` + `print_metrics_summary` to `tests/life_capture_retrieval_eval.rs`. Per-query + mean recall@1/3/5 + MRR, plus per-kind breakdown.
- `assert_*` smoke checks unchanged.

Baseline (from this run)
Numbers are deterministic (toy embedder is feature-hashing, no model). Now any retrieval change has to move them — regressions become a CI signal.
Note on diff size
This PR currently shows ~all of #817's commits because it's stacked. Once #817 merges to main, this PR collapses to just the 2 files actually changed here.
Test plan
`cargo test --test life_capture_retrieval_eval -- --nocapture` — 4 passed, metrics print

Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests