test(foundation): coverage matrix + strategy + gap-fill batch (#773) by oxoxDev · Pull Request #980 · tinyhumansai/openhuman

oxoxDev · 2026-04-27T11:55:16Z

Summary

Make test coverage measurable: every leaf of the canonical product feature list (sections 0-13, ~200 features) now has an explicit row in docs/TEST-COVERAGE-MATRIX.md with status (covered / partial / missing / not driver-automatable) and a path to its test source.
Codify how we test: docs/TESTING-STRATEGY.md defines the layers (Rust unit, Rust integration, Vitest unit, WDIO E2E, manual smoke), a "where does my test go?" decision tree, and the failure-path requirement.
Ship 8 new tests against the highest-value gaps surfaced by the matrix: 5 WDIO E2E specs, 2 Vitest units, 1 Rust integration test — moving 13 feature leaves from ❌ → ✅.
Track the rest of the epic via 7 sub-issues ([Test Infra] Tighten PR template + add CI enforcement for tests/docs #965-[Test] Release-cut manual smoke checklist for OS-level features #971) linked as native sub-issues of Enforce unit tests, feature tests, and documentation for core foundation #773.

Problem

Repo already had 447 Rust unit-test modules, 96 Vitest files, and 28 WDIO specs, but there was no canonical mapping from product features to tests. "Tests + docs by default" was an aspiration in the PR template, not a contract — gaps weren't visible, and test placement decisions were ad-hoc.

The provided 200-leaf feature list defines the surface that needs to be made measurable. A single PR that fills every gap would be unreviewable, so this PR is the BACKBONE: matrix + strategy + a focused batch of high-value gap-fill tests. The remaining gap categories are tracked under sub-issues so contributors can pick them up independently.

Solution

Backbone artefacts

docs/TEST-COVERAGE-MATRIX.md — 200-row table with ✅ / 🟡 / ❌ / 🚫 legend, RU / RI / VU / WD / MS layer codes, and explicit follow-up issue references on every missing row.
docs/TESTING-STRATEGY.md — layer definitions, decision tree, mock policy, determinism rules, and a list of OS-level surfaces that legitimately can't be driver-automated and require manual smoke at release-cut ([Test] Release-cut manual smoke checklist for OS-level features #971).

New tests in this PR (13 leaves moved ❌ → ✅)

Layer	File	Feature IDs
WDIO E2E	`app/test/e2e/specs/whatsapp-flow.spec.ts`	10.1.2
WDIO E2E	`app/test/e2e/specs/slack-flow.spec.ts`	10.1.4
WDIO E2E	`app/test/e2e/specs/memory-roundtrip.spec.ts`	8.1.1 / 8.1.2 / 8.1.3
WDIO E2E	`app/test/e2e/specs/insights-dashboard.spec.ts`	11.1.3 / 11.2.1 / 11.2.2 / 11.2.3
WDIO E2E	`app/test/e2e/specs/autocomplete-flow.spec.ts`	5.2.1 / 5.2.2 / 5.2.3 (settings surface)
Vitest	`app/src/components/intelligence/__tests__/utils.test.ts`	11.1.2
Vitest	`app/src/features/autocomplete/__tests__/useAutocompleteSkillStatus.test.tsx`	5.2 (skill-status branches)
Rust integration	`tests/memory_roundtrip_e2e.rs`	8.1.1 / 8.1.2 / 8.1.3 (Rust contract)

Every spec/test asserts at least one failure-path or edge case in addition to the happy path, per the strategy doc.

Tracked separately (epic sub-issues of #773)

[Test Infra] Tighten PR template + add CI enforcement for tests/docs #965 — Tighten PR template + CI enforcement of checklist + matrix-update guard
[Docs] README sweep for foundational core domains #966 — Foundational core domain READMEs (agent / memory / channels / skills / cron / config / event_bus / app_state / security / accessibility / local_ai / encryption)
[Test] E2E coverage for system tools (Filesystem, Shell, Git, Browser) #967 — System-tools E2E (Filesystem / Shell / Git / Browser)
[Test] Auth multi-provider edge cases + permission re-sync #968 — Auth multi-provider linking edges + refresh-token rotation + permission re-sync
[Test] Settings dev-options + data-management coverage #969 — Settings dev-options + data-management coverage
[Test] Rewards & Progression coverage #970 — Rewards & Progression coverage
[Test] Release-cut manual smoke checklist for OS-level features #971 — Release-cut manual smoke checklist for OS-level features (TCC / Gatekeeper / DMG / code-sign)

Submission Checklist

Unit tests — 2 new Vitest suites + 1 new Rust integration test added; full local Vitest run passes (934 / 4 skipped / 1 todo); cargo test --test memory_roundtrip_e2e passes 2/2.
E2E / integration — 5 new WDIO specs added under app/test/e2e/specs/. Mirror existing patterns (mock backend, triggerAuthDeepLinkBypass, waitForApp/waitForWebView). Mac2 cleanly skipped where the rail / sidebar mapping is not yet exposed to Appium helpers — Linux CI is the source of truth. Local pnpm Vitest + cargo gates green; WDIO build was attempted via Docker locally but the amd64-only CI image deadlocked under arm64 emulation, so verification is deferred to CI Linux runners (which run native amd64).
Doc comments — Each spec has a top-of-file comment block explaining goal, deferred follow-ups, and platform skips. Strategy + matrix docs are top-level docs.
Inline comments — Where assertions hinge on non-obvious affordances (FALLBACK provider source, status-label vocabulary, post-clear failure-path), the spec calls out the reason.

Impact

Desktop/CLI: no runtime behaviour changes — pure test + documentation surface.
CI: 5 additional WDIO specs added to the all-flows run on Linux. Each takes ~30-60s in steady state; total impact ~3-5min on the existing E2E job. New Vitest suites add <1s each. The new Rust integration test runs in pnpm test:rust (~0.04s per test).
Performance: no production code touched, no migrations, no security implications.
Compatibility: docs/TESTING-STRATEGY.md reflects the post-yarn→pnpm migration commands.

Closes: Enforce unit tests, feature tests, and documentation for core foundation #773 (epic)
Sub-issues opened in this PR: [Test Infra] Tighten PR template + add CI enforcement for tests/docs #965, [Docs] README sweep for foundational core domains #966, [Test] E2E coverage for system tools (Filesystem, Shell, Git, Browser) #967, [Test] Auth multi-provider edge cases + permission re-sync #968, [Test] Settings dev-options + data-management coverage #969, [Test] Rewards & Progression coverage #970, [Test] Release-cut manual smoke checklist for OS-level features #971 (linked as native sub-issues of Enforce unit tests, feature tests, and documentation for core foundation #773)
Follow-up PR(s)/TODOs: gap-fill PRs against each sub-issue above.

Summary by CodeRabbit

Tests
- Added unit and E2E coverage for actionable items, autocomplete settings, Insights, Slack and WhatsApp account flows, memory round-trip (JS and Rust), plus stability fixtures and deterministic ordering/assertions.
Documentation
- Added a repository testing strategy and a test-coverage matrix to standardize test placement and coverage expectations.
Chores
- Exposed the application store on the window in dev contexts to aid debugging and test helpers.

…yhumansai#773) Add `docs/TEST-COVERAGE-MATRIX.md` mapping every leaf of the canonical product feature list (sections 0-13, ~200 features) to its test sources with explicit status — covered, partial, missing, or not driver-automatable. Drives gap-fill work tracked in sub-issues tinyhumansai#965-tinyhumansai#971. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nsai#773) Add `docs/TESTING-STRATEGY.md` defining the test layers used in this repo (Rust unit, Rust integration, Vitest unit, WDIO E2E, manual smoke), with a decision tree for "where does my test go", failure-path requirement, mock policy, determinism rules, and an explicit list of surfaces that are not driver-automatable and require manual smoke at release-cut. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rip (tinyhumansai#773) Add `tests/memory_roundtrip_e2e.rs` exercising the full `doc_put` → `memory_recall_memories` → `clear_namespace` lifecycle against a per-test `OPENHUMAN_WORKSPACE` tempdir, covering happy path (8.1.1 / 8.1.2) and the failure-path edge (8.1.3) where a cleared namespace must no longer return the canary content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…i#773) Add `app/src/components/intelligence/__tests__/utils.test.ts` covering `groupItemsByTime`, `filterItems`, and `getItemStats` — including the priority-then-recency ordering inside a time bucket and the search-term-misses-everything failure path. Closes coverage gap on feature 11.1.2 (actionable item extraction). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…inyhumansai#773) Add `app/src/features/autocomplete/__tests__/useAutocompleteSkillStatus.test.tsx` covering all six branches the hook surfaces — no-status, unsupported, running (overrides stale errors), error, enabled-but-not-running, and disabled — with their corresponding status labels and CTAs. Closes coverage gap on feature 5.2.2 (debounce / skill-status surface). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

) Add `app/test/e2e/specs/whatsapp-flow.spec.ts` asserting that the Accounts page exposes WhatsApp Web in the Add Account modal with the expected label and description (10.1.2 connection happy path) and that selecting it dismisses the modal. Real WhatsApp QR-code login is out of scope and tracked in the cross-channel follow-up. Mirrors the patterns in `channels-smoke.spec.ts` / `gmail-flow.spec.ts` — mock backend via `startMockServer`, auth via `triggerAuthDeepLinkBypass`, Mac2 skipped because the Accounts rail labels are not mapped for Appium. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add `app/test/e2e/specs/slack-flow.spec.ts` asserting that the Accounts page exposes Slack as an addable provider with its label and description (10.1.4) and that selecting it dismisses the picker. Real Slack OAuth (workspace selection, scope grant, message ingest) is deferred to the cross-channel follow-up. Mac2 skipped — Accounts rail labels not mapped for Appium helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ansai#773) Add `app/test/e2e/specs/memory-roundtrip.spec.ts` driving the memory JSON-RPC API (`openhuman.memory_init` / `memory_doc_put` / `memory_recall_memories` / `memory_clear_namespace`) end-to-end through the running Tauri shell + sidecar. Asserts the canary content is recalled after store (8.1.1 + 8.1.2) and is absent after a clear (8.1.3 failure-path edge). Companion to the Rust integration test in `tests/memory_roundtrip_e2e.rs` which proves the Rust contract in isolation; this spec proves the UI ⇄ Tauri ⇄ sidecar wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add `app/test/e2e/specs/insights-dashboard.spec.ts` asserting the /intelligence route mounts and renders the Memory tab (11.2.1), accepts input on the search affordance (11.2.3), and surfaces at least one provider source filter chip (11.2.2). Backend-side memory wiring is covered by the round-trip spec; this spec is bounded to the dashboard surface so failures are easy to attribute. Mac2 skipped — Intelligence sidebar mapping not yet exposed to Appium. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add `app/test/e2e/specs/autocomplete-flow.spec.ts` asserting the /settings/autocomplete panel mounts, surfaces a status label produced by `useAutocompleteSkillStatus` (Active / Offline / Error / Unsupported), and renders the matching CTA (Enable / Manage / Retry / Details). Real inline-suggestion firing in third-party text fields requires macOS TCC permissions and is parked on the manual smoke checklist (tinyhumansai#971). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-27T11:55:32Z

📝 Walkthrough

Walkthrough

Adds many new tests (Vitest unit, WDIO E2E, Rust integration), two testing documents, a small E2E helper, and a dev-only exposure of the browser store to window.__OPENHUMAN_STORE__. Tests cover intelligence utilities, autocomplete status, insights, memory round-trips, and account integrations.

Changes

Cohort / File(s)	Summary
Intelligence unit tests `app/src/components/intelligence/__tests__/utils.test.ts`	New Vitest suite testing actionable-item factories, time-freeze behavior, grouping (Today/Yesterday/This Week/Older), ordering by priority/recency, filtering, and stats calculations.
Insights E2E `app/test/e2e/specs/insights-dashboard.spec.ts`	New WDIO smoke spec validating `/intelligence` UI: Memory presence, search input wiring, and source select rendering.
Autocomplete unit & E2E `app/src/features/autocomplete/__tests__/useAutocompleteSkillStatus.test.tsx`, `app/test/e2e/specs/autocomplete-flow.spec.ts`	Hook tests for derived autocomplete UI status under mocked runtime; E2E smoke spec navigates to settings and asserts status labels and CTAs.
Account integration E2E `app/test/e2e/specs/slack-flow.spec.ts`, `app/test/e2e/specs/whatsapp-flow.spec.ts`	New WDIO specs for Add Account modal flows: provider tile discovery, selection, modal close, and registration verification (via exposed store).
Memory round‑trip tests `app/test/e2e/specs/memory-roundtrip.spec.ts`, `tests/memory_roundtrip_e2e.rs`	E2E and Rust integration tests exercising memory JSON‑RPC lifecycle: init, put/doc_put, recall, and clear_namespace with canary verification and isolated workspace.
E2E helper `app/test/e2e/helpers/shared-flows.ts`	Adds exported helper `openAddAccountModal()` (DOM scan by aria-label, click, wait-for-text) and brings `waitForText` into scope.
Store exposure `app/src/store/index.ts`	Expose the Redux store on `window.__OPENHUMAN_STORE__` when `window` is defined (dev/test exposure only).
Testing docs `docs/TESTING-STRATEGY.md`, `docs/TEST-COVERAGE-MATRIX.md`	Adds repository testing strategy and canonical coverage matrix mapping features to test layers, with rules for determinism, mock-first policy, and update contract.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

[Test Infra] Tighten PR template + add CI enforcement for tests/docs #965 — The added TEST-COVERAGE-MATRIX and TESTING-STRATEGY plus many test additions directly support enforcing coverage and CI/test checklist objectives described in this issue.

Possibly related PRs

Fix/202: e2e logout relogin onboarding #305 — Modifies the same E2E helper module (app/test/e2e/helpers/shared-flows.ts); likely overlapping helper additions.
feat(memory): global singleton, CLI, graph extraction fixes & light storage #383 — Related changes to memory_init / MemoryClient and memory RPCs which the new memory tests exercise.
fix(e2e): overhaul all E2E specs for Linux tauri-driver #180 — Overlaps E2E infra and helper behavior that these new WDIO specs depend on.

Suggested reviewers

senamakel
graycyrus

Poem

🐰 I hopped through tests and docs today,
Canary seeds in memory play.
Mods and helpers spread the light,
Store on window—peek at night.
Hop, green check, and off we go—🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title concisely captures the main purpose: introducing a test coverage matrix, testing strategy documentation, and new gap-fill tests to address core foundation coverage gaps.
Linked Issues check	✅ Passed	The PR fully satisfies `#773` by delivering unit tests (Vitest), integration tests (Rust, WDIO E2E), comprehensive test strategy and coverage matrix documentation, and test infrastructure (window.OPENHUMAN_STORE for E2E assertions).
Out of Scope Changes check	✅ Passed	All changes are scoped to test files, test documentation, and test infrastructure (window store exposure); no production logic changes outside testing scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 16

🧹 Nitpick comments (3)

app/test/e2e/specs/memory-roundtrip.spec.ts (1)

92-93: Prefer structured recall assertions over JSON string scanning.

JSON.stringify(...).includes(...) is brittle and can hide shape regressions. Assert on the recalled memories collection fields directly.

Also applies to: 111-113

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/test/e2e/specs/memory-roundtrip.spec.ts` around lines 92 - 93, Replace
the brittle JSON string scan with structured assertions on the recall result:
inspect recallResult.result (or recallResult.result.memories) directly and
assert that the memories array exists and contains an object whose key equals
TEST_KEY or whose content includes TEST_CONTENT (e.g., find a memory where
memory.key === TEST_KEY || memory.content.includes(TEST_CONTENT)); apply the
same change for the other instance around TEST_KEY/TEST_CONTENT so tests assert
field values/array length and presence rather than using
JSON.stringify(...).includes(...).

app/test/e2e/specs/insights-dashboard.spec.ts (1)

58-109: Add targeted failure diagnostics to speed triage.

When these assertions fail, emit request logs and accessibility tree snapshots to make CI debugging actionable.

Based on learnings: Applies to app/test/e2e/specs/*.spec.ts — add failure diagnostics (request logs, dumpAccessibilityTree()).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/test/e2e/specs/insights-dashboard.spec.ts` around lines 58 - 109, Wrap
each failing test body (e.g., the "mounts the /intelligence..." test, the
"exposes a search input..." test, and the "renders at least one source filter
affordance" test) in a try/catch so that on any thrown error you first capture
diagnostics and then rethrow; in the catch call dumpAccessibilityTree() and
capture request logs (e.g., via the browser's network/performance logs or your
existing request-logging helper) and emit them via console.error or
processLogger before rethrowing the original error; update tests that use
navigateViaHash, waitForText, and the DOM-executing blocks to follow this
pattern so failures produce both request logs and dumpAccessibilityTree() output
for CI triage.

docs/TESTING-STRATEGY.md (1)

64-66: Optional: vary repeated “No …” bullet starts for readability.

Three consecutive bullets start with “No”, which reads a bit mechanical; rewording one or two would improve flow.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/TESTING-STRATEGY.md` around lines 64 - 66, Reword the three consecutive
bullets to avoid repeating "No" at the start: change "No order-dependent specs —
each spec must pass when run alone." to something like "Avoid order-dependent
specs — each spec must pass when run alone.", keep "No reliance on absolute
coordinates or animation timing." or rephrase to "Avoid relying on absolute
coordinates or animation timing.", and rewrite "No real keyboard via
`browser.keys()` on tauri-driver — synthesize via `browser.execute(...)` (see
`command-palette.spec.ts` for the pattern)." to "Simulate keyboard input instead
of using `browser.keys()` on tauri-driver — synthesize via
`browser.execute(...)` (see `command-palette.spec.ts` for the pattern)."

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/intelligence/__tests__/utils.test.ts`:
- Around line 26-27: The tests use a time-dependent helper daysAgo(n) that calls
Date.now(), causing flakey boundary failures; update the tests to capture a
single deterministic base timestamp (e.g., const base = todayNoon()) at the
start of each test and replace direct calls to daysAgo/Date.now()/new Date() by
deriving dates from that base (e.g., base minus n days) so all timestamps are
stable; update the daysAgo usages in this file (and the similar usages around
the daysAgo helper and the ranges noted) to accept or be computed from the
provided base timestamp rather than using Date.now() directly, ensuring each
test controls its own fixed time fixture.
- Around line 119-123: The test expects filterItems to match searchTerm against
sourceLabel as well, but the current implementation in
app/src/components/intelligence/utils.ts only checks title and description;
update the filterItems function so the searchTerm is compared case-insensitively
against item.sourceLabel in the same way it checks title and description (e.g.,
normalize to lower/upper and include sourceLabel in the OR conditions or
combined searchable string used for matching) so that searches like 'gmail' will
return items whose sourceLabel contains that term.

In `@app/test/e2e/specs/autocomplete-flow.spec.ts`:
- Around line 74-80: The test "renders an Enable / Manage / Retry CTA driven by
skill status" currently relies on prior test state; update this test to navigate
to '/settings/autocomplete' at the start and wait for the settings panel to
stabilize (e.g., wait for the known stable panel text or selector) before
querying with textExists('Enable'|'Manage'|'Retry'|'Details'), so the assertion
is independent; modify the test setup in that spec to call the router/navigation
and stabilization wait prior to using textExists.
- Around line 14-21: The spec comment overclaims features (inline suggestion
generation, debounce, acceptance trigger) beyond what the test asserts; either
narrow the comment or add focused assertions in autocomplete-flow.spec.ts:
update the top comment to only claim panel/status/CTA visibility if you don't
test behavior, or implement tests that exercise AutocompletePanel suggestion
flow by (1) mounting a third-party text field, mocking the runtime state to
online, (2) simulating typed input and using time control (e.g.,
cy.clock()/cy.tick()) to assert debounce behavior and that a suggestion element
appears, and (3) asserting acceptance semantics by simulating the acceptance
keystroke (Enter/Tab) and confirming the suggestion is committed; make these
checks within the existing spec (AutocompletePanel, skill-status pill, "Enable"
CTA) or remove the extra feature claims from the header.

In `@app/test/e2e/specs/insights-dashboard.spec.ts`:
- Around line 97-108: The current broad scan in the browser.execute block
(variable hasFilterChip) looks for any button containing provider names and can
false-pass; narrow the selector to the actual filter control/chips instead:
within the same browser.execute call target a more specific container or chip
semantics (e.g., a filter toolbar element, a container with a
test-id/aria-label, or elements with role="list" / role="listitem" for chips)
and check its children for a chip whose text matches one of
['gmail','slack','telegram','notion','whatsapp','discord','all sources']; update
the candidates/querySelectorAll usage and the expect(hasFilterChip).toBe(true)
assertion accordingly so the test verifies the filter-chip rendering in the
correct region (refer to the hasFilterChip variable and the browser.execute
block to locate the change).
- Around line 69-88: The test currently uses browser.execute and directly sets
target.value (inside the executed script with the variables typed, inputs,
target, target.focus, target.value), which performs raw DOM mutation; replace
this block with the project's cross-platform E2E helper that simulates real
typing (e.g., call the shared helper like typeIntoSearch or the test-utils
typing helper) so the element is focused and text is entered via key events
rather than assigning target.value; locate the browser.execute(...) usage in the
insights-dashboard.spec.ts test and swap it for the helper invocation (or use
browser keys/element.addValue equivalents) to type "roundtrip canary" into the
chosen input and trigger change events via the helper.

In `@app/test/e2e/specs/memory-roundtrip.spec.ts`:
- Around line 96-114: The test "clears a namespace and recall returns no canary
content (edge case)" currently clears without first seeding a canary, allowing a
vacuous pass; before calling openhuman.memory_clear_namespace in this it block,
use callOpenhumanRpc to store a known canary using
TEST_NAMESPACE/TEST_KEY/TEST_CONTENT (same helpers as the rest of the file),
verify the seed succeeded (expect ok true and recall includes
TEST_KEY/TEST_CONTENT via callOpenhumanRpc('openhuman.memory_recall_memories')),
then call openhuman.memory_clear_namespace, then assert recall no longer
contains TEST_KEY/TEST_CONTENT—i.e., add a seed-and-verify precondition and use
the existing callOpenhumanRpc, TEST_NAMESPACE, TEST_KEY, and TEST_CONTENT
identifiers to locate and implement the changes.

In `@app/test/e2e/specs/slack-flow.spec.ts`:
- Around line 82-100: The test "selecting Slack dismisses the picker and
registers an account" currently only asserts modal dismissal; add a second
assertion that verifies an account registration side-effect (either by asserting
the mock backend was called or by checking the account-rail/state shows the new
Slack account). After the existing pause and expect(await textExists('Add
account')).toBe(false), assert the mock HTTP or IPC stub (e.g.,
mockServer.expectCalled('registerAccount') or whatever mock helper your suite
exposes) or query the account list UI (via the same textExists helper or a
specific selector for the account-rail) to confirm a Slack account appears;
include a failure diagnostic message (e.g., stepLog or an informative expect
message) so failures show backend vs UI mismatch. Ensure you reference the same
test name and use existing helpers like textExists and stepLog when adding these
checks.
- Around line 82-100: The test "selecting Slack dismisses the picker and
registers an account" is state-coupled because it assumes the picker is already
open; modify the test to navigate to the accounts page and open the picker
before trying to click the Slack tile. Specifically, at the start of the it
block for 'selecting Slack...', call browser.url('/accounts') (or the helper
that loads that page), then perform the same steps used elsewhere to open the
account picker (e.g., click the "Add account" button or invoke the open-picker
helper), wait for the picker to be visible (use the existing textExists or a
visible element wait), and only then run the browser.execute block that finds
and clicks the Slack tile and the subsequent assertions so the test can run in
isolation.
- Around line 66-74: Replace the raw browser.execute + querySelectorAll DOM
clicks with the cross-platform element helpers from element-helpers.ts: remove
uses of browser.execute in the block that searches for 'Add app' and the similar
block at 84-95, and call the provided helper (e.g., clickByAriaLabel or
clickNativeButton / clickToggle with the aria-label 'Add app') to open the
modal, then use the helper for selecting the Slack entry (e.g., selectItemByText
or clickByTestId for the Slack option). Update the test to import the helpers
from element-helpers.ts and use those helper function names
(clickByAriaLabel/clickNativeButton/selectItemByText) instead of direct DOM
execution so the interactions go through the shared cross-platform layer.

In `@app/test/e2e/specs/whatsapp-flow.spec.ts`:
- Around line 92-94: This spec is non-isolated because it assumes the picker is
already open; update the test "selecting WhatsApp Web triggers the webview host
pane (no real OAuth)" to perform the picker-opening setup locally before calling
stepLog('clicking WhatsApp Web tile') and clicking the tile—i.e., call the
existing helper used in the suite setup (for example openChannelPicker /
openPicker or the same sequence used in the suite's beforeEach) or replicate the
UI actions that open the picker so the test can run independently.
- Around line 73-81: This spec uses raw DOM clicks via browser.execute to find
and click the "Add app" button (the anonymous execute block) which is brittle;
replace those blocks with the shared E2E helper functions used across tests
(e.g., the helper that opens the app modal and the helper that selects provider
tiles) so interactions are cross-platform and robust—locate the anonymous
browser.execute that queries 'button' and the similar block later in the file
(the second occurrence around the provider selection) and swap them to call the
existing helpers (openAppModal/openProviderTile or their project-specific
equivalents) and assert their returned results instead of manual DOM
querying/clicking.
- Around line 106-112: The test currently only checks that the "Add account"
modal is hidden (expect textExists('Add account') toBe false) but doesn't verify
the rail/webview transition or backend effects; update the whatsapp-flow.spec.ts
test to also assert the concrete post-pick outcomes: (1) UI: confirm the account
appears in the rail (e.g., assert presence of the WhatsApp account label/tooltip
or specific rail element instead of just modal absence), and (2) backend/mock:
assert the mock backend call or effect that registers the new account was
invoked (or that the app's activeAccount state changed) by checking the mock API
spy or app state variable used in these tests; reference the existing helpers
textExists and browser.pause and extend the expect assertions to cover both UI
rail/content and the mock/backend invocation.

In `@docs/TEST-COVERAGE-MATRIX.md`:
- Line 376: The matrix entry is incorrect because the test in
app/test/e2e/specs/insights-dashboard.spec.ts only mounts the route and checks
UI affordances but does not trigger or assert the analyze action; update the
spec to either perform and assert the analyze execution (simulate the user
clicking the Analyze button and assert that the analyze API/handler is called
and results appear, e.g., by stubbing/observing the analyze function or network
request in the test) or change the TEST-COVERAGE-MATRIX.md row for "Analyze
Trigger" to a lower coverage status and map it to a different test that actually
invokes analyze; locate the test by the file name insights-dashboard.spec.ts and
the test case that checks UI affordances to implement the click+assert flow or
downgrade the matrix entry accordingly.
- Around line 205-207: Update the matrix rows 5.2.1, 5.2.2, and 5.2.3 so they do
not claim full coverage (✅) based solely on
`app/test/e2e/specs/autocomplete-flow.spec.ts`; specifically, remove or change
the ✅ for any row where the only listed spec is `autocomplete-flow.spec.ts` (it
verifies settings-page status/CTA rendering, not inline suggestion
generation/debounce/acceptance trigger), and instead list the accurate test
sources (e.g., keep
`app/src/features/autocomplete/__tests__/useAutocompleteSkillStatus.test.tsx`
where it actually covers debounce/skill status, and add/remove indicators to
reflect partial or missing coverage for inline suggestion generation and
acceptance trigger in rows 5.2.1–5.2.3).

In `@docs/TESTING-STRATEGY.md`:
- Around line 21-33: The fenced decision-tree code block in TESTING-STRATEGY.md
is missing a language tag; update the opening fence for that block (the
triple-backticks that precede the decision-tree content) to include a language
like text (i.e., change ``` to ```text) so the markdown linter MD040 is
satisfied and docs CI will pass.

---

Nitpick comments:
In `@app/test/e2e/specs/insights-dashboard.spec.ts`:
- Around line 58-109: Wrap each failing test body (e.g., the "mounts the
/intelligence..." test, the "exposes a search input..." test, and the "renders
at least one source filter affordance" test) in a try/catch so that on any
thrown error you first capture diagnostics and then rethrow; in the catch call
dumpAccessibilityTree() and capture request logs (e.g., via the browser's
network/performance logs or your existing request-logging helper) and emit them
via console.error or processLogger before rethrowing the original error; update
tests that use navigateViaHash, waitForText, and the DOM-executing blocks to
follow this pattern so failures produce both request logs and
dumpAccessibilityTree() output for CI triage.

In `@app/test/e2e/specs/memory-roundtrip.spec.ts`:
- Around line 92-93: Replace the brittle JSON string scan with structured
assertions on the recall result: inspect recallResult.result (or
recallResult.result.memories) directly and assert that the memories array exists
and contains an object whose key equals TEST_KEY or whose content includes
TEST_CONTENT (e.g., find a memory where memory.key === TEST_KEY ||
memory.content.includes(TEST_CONTENT)); apply the same change for the other
instance around TEST_KEY/TEST_CONTENT so tests assert field values/array length
and presence rather than using JSON.stringify(...).includes(...).

In `@docs/TESTING-STRATEGY.md`:
- Around line 64-66: Reword the three consecutive bullets to avoid repeating
"No" at the start: change "No order-dependent specs — each spec must pass when
run alone." to something like "Avoid order-dependent specs — each spec must pass
when run alone.", keep "No reliance on absolute coordinates or animation
timing." or rephrase to "Avoid relying on absolute coordinates or animation
timing.", and rewrite "No real keyboard via `browser.keys()` on tauri-driver —
synthesize via `browser.execute(...)` (see `command-palette.spec.ts` for the
pattern)." to "Simulate keyboard input instead of using `browser.keys()` on
tauri-driver — synthesize via `browser.execute(...)` (see
`command-palette.spec.ts` for the pattern)."

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dcf2b3b7-7d49-43c9-9ae2-5bbc0c67f78e

📥 Commits

Reviewing files that changed from the base of the PR and between 69495d4 and c067695.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (10)

app/src/components/intelligence/__tests__/utils.test.ts
app/src/features/autocomplete/__tests__/useAutocompleteSkillStatus.test.tsx
app/test/e2e/specs/autocomplete-flow.spec.ts
app/test/e2e/specs/insights-dashboard.spec.ts
app/test/e2e/specs/memory-roundtrip.spec.ts
app/test/e2e/specs/slack-flow.spec.ts
app/test/e2e/specs/whatsapp-flow.spec.ts
docs/TEST-COVERAGE-MATRIX.md
docs/TESTING-STRATEGY.md
tests/memory_roundtrip_e2e.rs

coderabbitai · 2026-04-27T12:02:01Z

+    it('filters by searchTerm across title, description, and sourceLabel', () => {
+      expect(filterItems(items, { searchTerm: 'reply' }).map(i => i.id)).toEqual(['1', '3']);
+      expect(filterItems(items, { searchTerm: 'canary' }).map(i => i.id)).toEqual(['3']);
+      expect(filterItems(items, { searchTerm: 'gmail' }).map(i => i.id)).toEqual(['1', '3']);
+    });


⚠️ Potential issue | 🟠 Major

searchTerm expectation includes sourceLabel, but implementation currently does not.

Line 122 expects 'gmail' to match via sourceLabel, but app/src/components/intelligence/utils.ts (filter implementation) only searches title and description. This will fail unless filterItems is updated.

If `sourceLabel` matching is intended, patch `filterItems`

--- a/app/src/components/intelligence/utils.ts +++ b/app/src/components/intelligence/utils.ts @@ if (options.searchTerm) { const term = options.searchTerm.toLowerCase(); filtered = filtered.filter(item => item.title.toLowerCase().includes(term) || - (item.description && item.description.toLowerCase().includes(term)) + (item.description && item.description.toLowerCase().includes(term)) || + (item.sourceLabel && item.sourceLabel.toLowerCase().includes(term)) ); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/src/components/intelligence/__tests__/utils.test.ts` around lines 119 - 123, The test expects filterItems to match searchTerm against sourceLabel as well, but the current implementation in app/src/components/intelligence/utils.ts only checks title and description; update the filterItems function so the searchTerm is compared case-insensitively against item.sourceLabel in the same way it checks title and description (e.g., normalize to lower/upper and include sourceLabel in the OR conditions or combined searchable string used for matching) so that searches like 'gmail' will return items whose sourceLabel contains that term.

False positive — filterItems in app/src/components/intelligence/utils.ts does include sourceLabel in its searchTerm filter:

filtered = filtered.filter( item => item.title.toLowerCase().includes(term) || item.description?.toLowerCase().includes(term) || item.sourceLabel?.toLowerCase().includes(term) );

The Vitest case is correct as-is.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash rg -n "sourceLabel" app/src/components/intelligence/utils.ts -C 5

Length of output: 333

@oxoxDev, you're correct — confirmed. Line 100 of app/src/components/intelligence/utils.ts already includes:

item.sourceLabel?.toLowerCase().includes(term)

My comment was a false positive. The test and the implementation are consistent. Apologies for the noise!

✏️ Learnings added

Learnt from: oxoxDev Repo: tinyhumansai/openhuman PR: 980 File: app/src/components/intelligence/__tests__/utils.test.ts:131-135 Timestamp: 2026-04-27T12:17:54.782Z Learning: In `app/src/components/intelligence/utils.ts`, the `filterItems` function's `searchTerm` filter already checks `item.title`, `item.description`, **and** `item.sourceLabel` (all case-insensitively via `.toLowerCase().includes(term)`). Do not flag `sourceLabel` as missing from the filter in future reviews of this file.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: CR Repo: tinyhumansai/openhuman PR: 0 File: CLAUDE.md:0-0 Timestamp: 2026-04-26T06:17:19.282Z Learning: Applies to app/src/**/*.test.{ts,tsx} : Co-locate unit tests as `*.test.ts` / `*.test.tsx` under `app/src/**`. Prefer behavior over implementation. Use helpers in `app/src/test/`. No real network, no time flakes.

Learnt from: graycyrus Repo: tinyhumansai/openhuman PR: 888 File: app/src/pages/Conversations.tsx:785-796 Timestamp: 2026-04-24T15:11:32.371Z Learning: In `app/src/components/intelligence/ConfirmationModal.tsx` (used by `app/src/pages/Conversations.tsx` PR `#888`), unit test coverage for the `ConfirmationModal` component (render-null guard, confirm path, cancel/backdrop path, destructive styling) was intentionally deferred to a follow-up. The rationale: `Conversations.tsx` has deep Redux/Tauri dependencies not covered by the current test store, and the confirmation flow is fully delegated to `ConfirmationModal`, making it the correct isolated test target. Tests should be placed at `app/src/components/intelligence/ConfirmationModal.test.tsx` using Vitest + `testing-library/react`.

Learnt from: graycyrus Repo: tinyhumansai/openhuman PR: 772 File: app/src/pages/onboarding/steps/WelcomeStep.tsx:86-86 Timestamp: 2026-04-22T08:39:02.109Z Learning: In `app/src/pages/onboarding/steps/WelcomeStep.tsx`, the `OnboardingNextButton` label on `WelcomeStep` has always been `"Let's Start"` on `main` (not `"Continue"`). This is a pre-existing state and should not be flagged as a regression in any future PR. The E2E helper `clickFirstMatch(['Continue'])` returns `null` gracefully when no match is found, so the label mismatch is non-blocking for the test flow.

Learnt from: CR Repo: tinyhumansai/openhuman PR: 0 File: AGENTS.md:0-0 Timestamp: 2026-04-26T17:30:20.577Z Learning: Applies to app/test/e2e/specs/*.spec.ts : Assert both UI outcomes and backend/mock effects in E2E tests when relevant; add failure diagnostics (request logs, `dumpAccessibilityTree()`) for faster debugging

Learnt from: CR Repo: tinyhumansai/openhuman PR: 0 File: AGENTS.md:0-0 Timestamp: 2026-04-26T17:30:20.577Z Learning: Ship enough unit tests and coverage for behavior before building additional features on top of it — treat untested code as incomplete and do not accumulate depth on a shaky base

… merge Cargo.lock version field caught up to package version; mechanical bump generated by `cargo check` in this worktree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…flake (tinyhumansai#773) Add `vi.useFakeTimers()` + `vi.setSystemTime('2026-04-27T12:00:00Z')` so the time-bucket cases for `groupItemsByTime` cannot drift across midnight. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…via Redux (tinyhumansai#773) - Each `it` independently navigates and re-opens the Add Account modal, so the spec is runnable in isolation (was previously coupled). - Replace ad-hoc `browser.execute` tile click with the shared `clickButton('WhatsApp Web')` helper. - Replace bare modal-dismiss assertion with a Redux-backed registration check (`accounts.accounts[id].provider === 'whatsapp'`) plus a strict DOM fallback when the store global is not exposed in production builds. - Extract `openAddAccountModal()` into a local helper documenting why the "Add app" button still requires a `browser.execute` aria-label selector (no helper for aria-label-only buttons today; tracked follow-up). Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… Redux (tinyhumansai#773) Mirror the WhatsApp fixes: each `it` re-establishes its modal state independently, the tile click goes through `clickButton('Slack')`, and the registration assertion checks Redux (`provider === 'slack'`) with a strict DOM fallback when the store global is not exposed. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…vacuously (tinyhumansai#773) The clear-namespace case previously relied on the prior `it` having seeded a canary. Run in isolation, it would pass even if clear was broken (the namespace would already be empty). Seed-and-verify now happens inside the same test before clearing. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…umansai#773) The Memory tab uses `<input id="actionable-search">` and a `<select id="actionable-source">` (not provider chips). The previous assertion scanned all buttons for any provider keyword and could false-pass on unrelated controls. - Search-input case targets `#actionable-search` by id and uses a React-aware value setter so the controlled input reflects the change. - Source-filter case targets `#actionable-source` by id and verifies the canonical "All Sources" option exists. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…yhumansai#773) - Rewrite the file-level comment so the spec no longer claims to exercise inline-suggestion generation, debounce timing, or acceptance trigger. It only proves that the AutocompletePanel mounts, the status pill renders, and the matching CTA renders. - The CTA test now re-navigates to `/settings/autocomplete` so it is runnable in isolation. The matrix update in this PR also downgrades 5.2.1 / 5.2.3 to 🟡 to reflect the narrower scope. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ence (tinyhumansai#773) - Matrix: 5.2.1 inline suggestion + 5.2.3 acceptance trigger move ✅ → 🟡 to reflect that the WDIO spec only exercises the settings surface. Real inline-gen requires macOS TCC grants and lives in the manual smoke checklist (tinyhumansai#971). - Matrix: 11.1.3 analyze trigger moves ✅ → 🟡 — the insights dashboard spec mounts the route and asserts UI affordances but does not invoke the analyze handler. Full trigger flow tracked as follow-up. - Strategy: add `text` language tag to the decision-tree fenced block to satisfy markdownlint MD040. Addresses CodeRabbit review on PR tinyhumansai#980. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/test/e2e/specs/slack-flow.spec.ts`:
- Around line 115-123: The execute callback always returns a boolean today so
the `registered === undefined` fallback is dead; change the callback (the
anonymous function passed to browser.execute that references winAny and
__OPENHUMAN_STORE__) to explicitly return undefined when the global store is
missing (e.g. check `'__OPENHUMAN_STORE__' in window` or
`winAny.__OPENHUMAN_STORE__ == null` and return undefined), otherwise compute
and return the boolean as before, so the `registered === undefined` branch can
run and trigger the DOM fallback.

In `@app/test/e2e/specs/whatsapp-flow.spec.ts`:
- Around line 125-133: The browser.execute callback always returns a boolean so
the subsequent check for registered === undefined never runs; update the
callback (the function using winAny and __OPENHUMAN_STORE__) to return undefined
when __OPENHUMAN_STORE__ is missing (i.e., detect absence and return undefined
instead of false), leaving it returning true/false only when the store exists so
the existing registered === undefined branch will be reachable and the DOM
fallback will run; locate the callback referenced as browser.execute that builds
state/accounts and adjust its return values accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 96bc8041-0e72-40e1-887b-d904fc5a9bbc

📥 Commits

Reviewing files that changed from the base of the PR and between c067695 and ca51764.

⛔ Files ignored due to path filters (1)

app/src-tauri/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

app/src/components/intelligence/__tests__/utils.test.ts
app/test/e2e/specs/autocomplete-flow.spec.ts
app/test/e2e/specs/insights-dashboard.spec.ts
app/test/e2e/specs/memory-roundtrip.spec.ts
app/test/e2e/specs/slack-flow.spec.ts
app/test/e2e/specs/whatsapp-flow.spec.ts
docs/TEST-COVERAGE-MATRIX.md
docs/TESTING-STRATEGY.md

✅ Files skipped from review due to trivial changes (2)

docs/TEST-COVERAGE-MATRIX.md
app/test/e2e/specs/memory-roundtrip.spec.ts

🚧 Files skipped from review as they are similar to previous changes (3)

app/test/e2e/specs/autocomplete-flow.spec.ts
app/test/e2e/specs/insights-dashboard.spec.ts
app/src/components/intelligence/tests/utils.test.ts

coderabbitai · 2026-04-27T12:21:41Z

+    const registered = await browser.execute(() => {
+      const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } };
+      const state = winAny.__OPENHUMAN_STORE__?.getState() as
+        | { accounts?: { accounts?: Record<string, { provider?: string }> } }
+        | undefined;
+      const accounts = state?.accounts?.accounts ?? {};
+      return Object.values(accounts).some(a => a.provider === 'slack');
+    });
+    if (registered === undefined) {


⚠️ Potential issue | 🟠 Major

Store-unavailable fallback cannot execute with current return logic.

The callback on Line 115 always yields a boolean, so registered === undefined (Line 123) is dead code. In builds without __OPENHUMAN_STORE__, this returns false and fails rather than taking the DOM fallback.

Proposed fix

- const registered = await browser.execute(() => { + const registered = await browser.execute(() => { const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } }; - const state = winAny.__OPENHUMAN_STORE__?.getState() as + const store = winAny.__OPENHUMAN_STORE__; + if (!store || typeof store.getState !== 'function') { + return undefined; + } + const state = store.getState() as | { accounts?: { accounts?: Record<string, { provider?: string }> } } | undefined; const accounts = state?.accounts?.accounts ?? {}; return Object.values(accounts).some(a => a.provider === 'slack'); });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const registered = await browser.execute(() => {

const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } };

const state = winAny.__OPENHUMAN_STORE__?.getState() as

| { accounts?: { accounts?: Record<string, { provider?: string }> } }

| undefined;

const accounts = state?.accounts?.accounts ?? {};

return Object.values(accounts).some(a => a.provider === 'slack');

});

if (registered === undefined) {

const registered = await browser.execute(() => {

const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } };

const store = winAny.__OPENHUMAN_STORE__;

if (!store || typeof store.getState !== 'function') {

return undefined;

}

const state = store.getState() as

| { accounts?: { accounts?: Record<string, { provider?: string }> } }

| undefined;

const accounts = state?.accounts?.accounts ?? {};

return Object.values(accounts).some(a => a.provider === 'slack');

});

if (registered === undefined) {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/test/e2e/specs/slack-flow.spec.ts` around lines 115 - 123, The execute callback always returns a boolean today so the `registered === undefined` fallback is dead; change the callback (the anonymous function passed to browser.execute that references winAny and __OPENHUMAN_STORE__) to explicitly return undefined when the global store is missing (e.g. check `'__OPENHUMAN_STORE__' in window` or `winAny.__OPENHUMAN_STORE__ == null` and return undefined), otherwise compute and return the boolean as before, so the `registered === undefined` branch can run and trigger the DOM fallback.

coderabbitai · 2026-04-27T12:21:41Z

+    const registered = await browser.execute(() => {
+      const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } };
+      const state = winAny.__OPENHUMAN_STORE__?.getState() as
+        | { accounts?: { accounts?: Record<string, { provider?: string }> } }
+        | undefined;
+      const accounts = state?.accounts?.accounts ?? {};
+      return Object.values(accounts).some(a => a.provider === 'whatsapp');
+    });
+    if (registered === undefined) {


⚠️ Potential issue | 🟠 Major

Fallback branch is unreachable due to always-boolean return.

Line 125’s browser.execute always returns true/false, so registered === undefined on Line 133 never triggers. If __OPENHUMAN_STORE__ is missing, this currently returns false and fails instead of using the DOM fallback.

Proposed fix

- const registered = await browser.execute(() => { + const registered = await browser.execute(() => { const winAny = window as unknown as { __OPENHUMAN_STORE__?: { getState: () => unknown } }; - const state = winAny.__OPENHUMAN_STORE__?.getState() as + const store = winAny.__OPENHUMAN_STORE__; + if (!store || typeof store.getState !== 'function') { + return undefined; + } + const state = store.getState() as | { accounts?: { accounts?: Record<string, { provider?: string }> } } | undefined; const accounts = state?.accounts?.accounts ?? {}; return Object.values(accounts).some(a => a.provider === 'whatsapp'); });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/test/e2e/specs/whatsapp-flow.spec.ts` around lines 125 - 133, The browser.execute callback always returns a boolean so the subsequent check for registered === undefined never runs; update the callback (the function using winAny and __OPENHUMAN_STORE__) to return undefined when __OPENHUMAN_STORE__ is missing (i.e., detect absence and return undefined instead of false), leaving it returning true/false only when the store exists so the existing registered === undefined branch will be reachable and the DOM fallback will run; locate the callback referenced as browser.execute that builds state/accounts and adjust its return values accordingly.

graycyrus

Solid backbone PR — the matrix structure is clean, the strategy doc is well-organized, and the Vitest tests are accurate against the actual hook implementation. The memory Rust integration test is a good pattern addition. A few issues worth addressing before this merges.

The main concern is that the "Redux registration assertion" in the Slack and WhatsApp specs is dead code in CI. The __OPENHUMAN_STORE__ global is never assigned anywhere in production app code (confirmed by searching app/src/ — it's not in store/index.ts or anywhere else). The registered === undefined branch always fires in CI, so both specs reduce to expect(await textExists('Slack')).toBe(true) — which the preceding it already proved. The matrix marks 10.1.2 and 10.1.4 as fully covered (was ❌), but the registration step (the meaningful half of the second test case) is actually untested. Either expose the store on window in a test-only build, or drop the Redux fallback and replace it with a stronger DOM assertion that can actually distinguish the post-pick rail state from the pre-pick modal.

The openAddAccountModal duplication is a readability concern more than a correctness one, but given that shared-flows.ts exists for exactly this, it should go there.

Reposting with proper inline comments

graycyrus

Solid backbone PR — the matrix structure is clean, the strategy doc is well-organized, and the Vitest tests are accurate against the actual hook implementation. The memory Rust integration test is a good pattern addition. A few issues worth addressing before this merges.

The main concern is that the "Redux registration assertion" in the Slack and WhatsApp specs is dead code in CI. __OPENHUMAN_STORE__ is never assigned anywhere in production app code (confirmed by searching app/src/ — it's not in store/index.ts or anywhere else). The registered === undefined branch always fires in CI, so both specs reduce to expect(await textExists('Slack')).toBe(true) — which the preceding it already proved. The matrix marks 10.1.2 and 10.1.4 as fully covered (was missing), but the registration step (the meaningful half of the second test case) is actually untested. Either expose the store on window in a test-only build, or drop the Redux fallback and replace it with a stronger DOM assertion that can actually distinguish the post-pick rail state from the pre-pick modal.

The openAddAccountModal duplication is a readability concern more than a correctness one, but given that shared-flows.ts exists for exactly this, it should go there.

…humansai#773) Matches the `tests/live_routing_e2e.rs` pattern: `std::env::set_var` and `std::env::remove_var` became `unsafe` in Rust 1.83 because process-global env mutation can race with FFI readers. EnvVarGuard already serialises via `env_lock()`, so wrap the calls in `unsafe { ... }` and document the critical-section invariant in `// SAFETY:` comments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…inyhumansai#773) `docs/RELEASE-MANUAL-SMOKE.md` is created in tinyhumansai#971; until that lands the inline link 404s. Make both references explicit about the dependency so readers don't hit a dead link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tinyhumansai#773) WDIO specs in app/test/e2e/specs/*.spec.ts assert state-changes by reading `window.__OPENHUMAN_STORE__.getState()` (e.g. account registration after picking a provider tile). The handle was previously not assigned, so the assertion always missed and tests fell back to weak DOM-only checks. Expose the store via a stable `__OPENHUMAN_STORE__` key. The store handle contains no secrets that aren't already present in the renderer's memory; this only surfaces the existing instance under a namespaced global so specs can perform Redux-level assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ssertion (tinyhumansai#773) - Move `openAddAccountModal` into `app/test/e2e/helpers/shared-flows.ts`; Slack and WhatsApp specs now import the single shared definition instead of carrying byte-identical copies of the helper. - Replace the dead `registered === undefined` fallback in both specs with `browser.waitUntil` over the Redux selector. The previous check was unreachable: `browser.execute` returns the callback's value (always boolean), so `=== undefined` never matched and the fallback never fired. With the store exposed on `window.__OPENHUMAN_STORE__`, the wait now fails loudly if registration never lands instead of silently passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

app/test/e2e/helpers/shared-flows.ts (1)

44-47: Add richer diagnostics on modal-open failures.

Line 45/47 failures currently throw with minimal context. Include dumpAccessibilityTree() (and optionally current hash) before throwing to speed up E2E triage.

Suggested improvement

   if (!opened) {
-    throw new Error('Could not locate Add app button on /accounts');
+    const tree = await dumpAccessibilityTree();
+    throw new Error(
+      `Could not locate Add app button on /accounts. Accessibility tree:\n${tree.slice(0, 4000)}`
+    );
   }
-  await waitForText('Add account', 5_000);
+  try {
+    await waitForText('Add account', 5_000);
+  } catch (err) {
+    const tree = await dumpAccessibilityTree();
+    const msg = err instanceof Error ? err.message : String(err);
+    throw new Error(`Add account modal did not appear after click: ${msg}\n${tree.slice(0, 4000)}`);
+  }

Based on learnings: Assert both UI outcomes and backend/mock effects in E2E tests when relevant; add failure diagnostics (request logs, dumpAccessibilityTree()) for faster debugging.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/test/e2e/helpers/shared-flows.ts` around lines 44 - 47, When the
modal-open check for the Add app button (the opened variable) fails, capture
richer diagnostics before throwing: call dumpAccessibilityTree() and record the
current route/hash (e.g., via page.url() or location.hash) and include those
details in the thrown Error message so they’re available in test logs; update
the block around opened, and keep the existing waitForText('Add account', 5_000)
usage intact.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/store/index.ts`:
- Around line 75-81: Guard the global exposure of window.__OPENHUMAN_STORE__ so
it only occurs in non-production/testing builds: import the centralized flag
(e.g. IS_DEV or IS_E2E) from the app config module, wrap the existing assignment
in an if (IS_DEV || IS_E2E) check, and instead of direct assignment use
Object.defineProperty(window, "__OPENHUMAN_STORE__", { value: store, writable:
false, configurable: false, enumerable: false }) to make it read-only and
non-enumerable; keep the typeof window !== 'undefined' check and the reference
to the Redux store symbol named store unchanged.

In `@tests/memory_roundtrip_e2e.rs`:
- Around line 163-174: The test currently only unwraps the transport Result from
memory_recall_memories(recall_request()) but can still receive an in-band RPC
error envelope; before asserting the absence of CONTENT and KEY in post_blob,
add an explicit assertion that the RPC response is not an error envelope (e.g.,
assert that post has no error field or that post.error.is_none()/post.is_ok()
depending on the response type) referencing the returned value from
memory_recall_memories (the variable post and its post.value) so the
negative-content checks only run for successful RPC results.

---

Nitpick comments:
In `@app/test/e2e/helpers/shared-flows.ts`:
- Around line 44-47: When the modal-open check for the Add app button (the
opened variable) fails, capture richer diagnostics before throwing: call
dumpAccessibilityTree() and record the current route/hash (e.g., via page.url()
or location.hash) and include those details in the thrown Error message so
they’re available in test logs; update the block around opened, and keep the
existing waitForText('Add account', 5_000) usage intact.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 45295fac-4d80-4755-99e0-df196bf48b82

📥 Commits

Reviewing files that changed from the base of the PR and between ca51764 and 6f85b2e.

📒 Files selected for processing (6)

app/src/store/index.ts
app/test/e2e/helpers/shared-flows.ts
app/test/e2e/specs/slack-flow.spec.ts
app/test/e2e/specs/whatsapp-flow.spec.ts
docs/TESTING-STRATEGY.md
tests/memory_roundtrip_e2e.rs

🚧 Files skipped from review as they are similar to previous changes (2)

app/test/e2e/specs/whatsapp-flow.spec.ts
app/test/e2e/specs/slack-flow.spec.ts

coderabbitai · 2026-04-28T01:56:02Z

+// Expose the store on `window` so WDIO E2E specs can read Redux state directly
+// to assert backing-state changes (see app/test/e2e/specs/*.spec.ts). The store
+// holds no secrets that aren't already in the renderer's memory; this only
+// surfaces the existing handle under a stable, namespaced key.
+if (typeof window !== 'undefined') {
+  (window as unknown as { __OPENHUMAN_STORE__?: typeof store }).__OPENHUMAN_STORE__ = store;
+}


⚠️ Potential issue | 🟠 Major

Restrict global store exposure to test/dev builds.

Line 79 currently exposes window.__OPENHUMAN_STORE__ in all renderer runtimes, not just E2E contexts. That widens read/write access to full Redux state in packaged app sessions. Please gate this behind a non-production config flag (at minimum IS_DEV, or a dedicated E2E flag from app/src/utils/config) and define it as non-writable to reduce accidental mutation.

Suggested hardening

-// Expose the store on `window` so WDIO E2E specs can read Redux state directly +// Expose the store on `window` only in non-production contexts so WDIO E2E +// specs can read Redux state directly. // to assert backing-state changes (see app/test/e2e/specs/*.spec.ts). The store // holds no secrets that aren't already in the renderer's memory; this only // surfaces the existing handle under a stable, namespaced key. -if (typeof window !== 'undefined') { - (window as unknown as { __OPENHUMAN_STORE__?: typeof store }).__OPENHUMAN_STORE__ = store; +if (typeof window !== 'undefined' && IS_DEV) { + Object.defineProperty(window as Window & { __OPENHUMAN_STORE__?: typeof store }, '__OPENHUMAN_STORE__', { + value: store, + writable: false, + enumerable: false, + configurable: true, + }); }

As per coding guidelines: “Frontend config is centralized in app/src/utils/config.ts. Read VITE_* there and re-export — never import import.meta.env directly elsewhere.”

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/src/store/index.ts` around lines 75 - 81, Guard the global exposure of window.__OPENHUMAN_STORE__ so it only occurs in non-production/testing builds: import the centralized flag (e.g. IS_DEV or IS_E2E) from the app config module, wrap the existing assignment in an if (IS_DEV || IS_E2E) check, and instead of direct assignment use Object.defineProperty(window, "__OPENHUMAN_STORE__", { value: store, writable: false, configurable: false, enumerable: false }) to make it read-only and non-enumerable; keep the typeof window !== 'undefined' check and the reference to the Redux store symbol named store unchanged.

coderabbitai · 2026-04-28T01:56:02Z

+    let post = memory_recall_memories(recall_request())
+        .await
+        .expect("post-clear recall");
+    let post_blob = serde_json::to_string(&post.value).expect("serialise post");
+    assert!(
+        !post_blob.contains(CONTENT),
+        "canary content must be absent after clear — got {post_blob}"
+    );
+    assert!(
+        !post_blob.contains(KEY),
+        "canary key must be absent after clear — got {post_blob}"
+    );


⚠️ Potential issue | 🟠 Major

Assert recall success before negative-content checks.

On Line 163, memory_recall_memories(...).await.expect(...) only validates transport-level Result. The RPC can still return an in-band error envelope, and Line 168/Line 172 negative substring assertions would then pass falsely. Add an explicit “not error envelope” assertion before absence checks.

Suggested patch

+fn assert_recall_succeeded(serialized: &str, phase: &str) { + assert!( + !serialized.contains("memory.recall_memories_failed"), + "{phase} recall returned error envelope: {serialized}" + ); +} + @@ let recall_outcome = memory_recall_memories(recall_request()) .await .expect("memory_recall_memories rpc"); let serialised = serde_json::to_string(&recall_outcome.value).expect("serialise recall envelope"); + assert_recall_succeeded(&serialised, "happy-path"); @@ let pre = memory_recall_memories(recall_request()) .await .expect("pre-clear recall"); let pre_blob = serde_json::to_string(&pre.value).expect("serialise pre"); + assert_recall_succeeded(&pre_blob, "pre-clear"); @@ let post = memory_recall_memories(recall_request()) .await .expect("post-clear recall"); let post_blob = serde_json::to_string(&post.value).expect("serialise post"); + assert_recall_succeeded(&post_blob, "post-clear"); assert!( !post_blob.contains(CONTENT), "canary content must be absent after clear — got {post_blob}" );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/memory_roundtrip_e2e.rs` around lines 163 - 174, The test currently only unwraps the transport Result from memory_recall_memories(recall_request()) but can still receive an in-band RPC error envelope; before asserting the absence of CONTENT and KEY in post_blob, add an explicit assertion that the RPC response is not an error envelope (e.g., assert that post has no error field or that post.error.is_none()/post.is_ok() depending on the response type) referencing the returned value from memory_recall_memories (the variable post and its post.value) so the negative-content checks only run for successful RPC results.

graycyrus

All findings addressed — unsafe blocks added, dead-link annotated, Redux store exposed for E2E, and the shared helper extracted. CI green across the board. Nice work landing the backbone for the test coverage epic.

oxoxDev and others added 10 commits April 27, 2026 17:08

oxoxDev requested a review from a team April 27, 2026 11:55

coderabbitai Bot reviewed Apr 27, 2026

View reviewed changes

oxoxDev and others added 8 commits April 27, 2026 17:46

chore(deps): align app/src-tauri/Cargo.lock to v0.53.3 after upstream…

110598b

… merge Cargo.lock version field caught up to package version; mechanical bump generated by `cargo check` in this worktree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 27, 2026

View reviewed changes

graycyrus previously requested changes Apr 27, 2026

View reviewed changes

Comment thread app/test/e2e/specs/slack-flow.spec.ts

Comment thread app/test/e2e/specs/whatsapp-flow.spec.ts

Comment thread app/test/e2e/specs/slack-flow.spec.ts

Comment thread tests/memory_roundtrip_e2e.rs

Comment thread docs/TESTING-STRATEGY.md

graycyrus requested changes Apr 27, 2026

View reviewed changes

Comment thread app/test/e2e/specs/slack-flow.spec.ts Outdated

Comment thread app/test/e2e/specs/whatsapp-flow.spec.ts Outdated

Comment thread app/test/e2e/specs/slack-flow.spec.ts Outdated

Comment thread tests/memory_roundtrip_e2e.rs Outdated

Comment thread docs/TESTING-STRATEGY.md Outdated

oxoxDev and others added 4 commits April 28, 2026 07:20

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

graycyrus approved these changes Apr 28, 2026

View reviewed changes

graycyrus merged commit 4f9da9e into tinyhumansai:main Apr 28, 2026
9 checks passed

Conversation

oxoxDev commented Apr 27, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Submission Checklist

Impact

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oxoxDev Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

graycyrus left a comment

Choose a reason for hiding this comment

oxoxDev commented Apr 27, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 27, 2026 •

edited

Loading

coderabbitai Bot Apr 27, 2026 •

edited

Loading