feat(browser): PR 7 — co-pilot mode (driving, annotations, capability prompts)#307
Conversation
…arent-overlay sentinels)
…rip wash, driving pill
…s-bridge Extend ALLOWED_EVENT_KINDS with driving-state and capability-needed. driving-state events flip the browser-agent-store drivingState and start a 30 s decay timer (DRIVING_DECAY_MS) that auto-flips back to idle; successive events reset the timer; close() cancels it. capability-needed events dispatch a taos-browser:capability-prompt CustomEvent for CapabilityPromptModal to consume. Neither kind calls opts.onEvent (they use dedicated handlers). 8 new tests; 692 total.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (16)
📝 WalkthroughWalkthroughThis PR implements an end-to-end agent capability and browser driving system. It adds permission grants for agents to control browser tabs, a capability request modal, a management panel, visual indicators (annotations, banners, tints), server-side drive-session tracking, and WebSocket infrastructure to bridge agent operations with iframe event handling. ChangesAgent Capability & Driving System
Sequence Diagram(s)sequenceDiagram
participant User
participant Browser as Browser UI
participant BrowserStore as Zustand Store
participant Server as Backend
participant Iframe as iFrame
User->>Browser: Agent requests drive permission
Browser->>Browser: Dispatch CustomEvent<br/>(taos-browser:capability-prompt)
Browser->>+Browser: CapabilityPromptModal listens
Browser-->>User: Show permission dialog<br/>(page/session/always)
User->>Browser: Select "This site (always)"
Browser->>+Server: POST /api/desktop/browser/capabilities<br/>(grant with null expiry)
Server->>Server: Validate permissions<br/>Store grant
Server-->>-Browser: 200 {"granted": true}
Browser->>-Browser: Close modal, refresh grants
Note over User,Iframe: Agent begins driving
Iframe->>Server: WebSocket: Send drive op<br/>(click, type, etc.)
Server->>Server: Validate capability grant<br/>(host_pattern matches)
Server->>Server: Bump drive_session<br/>Set drivingState="driving"
Server->>Browser: WebSocket event:<br/>driving-state (driving)
Browser->>BrowserStore: Update drivingState<br/>Start 30s decay timer
Browser->>Browser: Render CoPilotBanner<br/>Green tint overlay
Server->>Iframe: Forward op to iframe
Iframe->>Iframe: Execute DOM action<br/>(click, type, scroll)
Iframe->>Browser: Post driving-state event<br/>(iframe-side confirmation)
User->>Browser: Click "Take back" button
Browser->>Server: POST revoke + DELETE session
Server->>Server: Remove grant & session
Browser->>BrowserStore: setDrivingState="idle"<br/>Cancel decay timer
Browser->>Browser: Hide banner & tint
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| if (!iso) return "Never"; | ||
| const ms = new Date(iso).getTime() - Date.now(); | ||
| if (ms < 0) return "Expired"; | ||
| const hours = Math.floor(ms / (60 * 60 * 1000)); |
There was a problem hiding this comment.
WARNING: Potential runtime error if iso is an invalid date string, new Date(iso).getTime() returns NaN, leading to NaN calculations and malformed expiry strings.
| } | ||
|
|
||
| useEffect(() => { | ||
| load(); |
There was a problem hiding this comment.
WARNING: Missing error handling; if listCapabilities or listAgents reject, the promise rejection is unhandled, potentially leaving the UI in loading state.
| return; | ||
| } | ||
| // Refresh list | ||
| const fresh = await listCapabilities(profileId); |
There was a problem hiding this comment.
WARNING: Missing error handling; if listCapabilities rejects, the promise is unhandled, potentially causing unhandled rejection.
| useBrowserAgentStore.getState().setDrivingState(windowId, tabId, agentId, "idle"); | ||
| }; | ||
|
|
||
| const handleTakeBack = async () => { |
There was a problem hiding this comment.
WARNING: Missing error handling; if revokeCapability or unpinAgent reject, the promises are unhandled, potentially leaving inconsistent state.
|
|
||
| useEffect(() => { | ||
| function handler(e: Event) { | ||
| const ce = e as CustomEvent<CapabilityPromptDetail>; |
There was a problem hiding this comment.
WARNING: Unsafe type assertion; if the event is not a CustomEvent with the expected detail type, accessing properties later could cause runtime errors.
| @@ -180,7 +198,7 @@ export function Chrome({ windowId }: ChromeProps) { | |||
| <Settings size={16} /> | |||
| </button> | |||
There was a problem hiding this comment.
WARNING: Breaking API change; SettingsPanel now requires a profileId prop, which could cause runtime errors if this component is used elsewhere without it.
Code Review SummaryStatus: 5 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)CRITICALWARNING
SUGGESTION
Other Observations (not in diff)Issues found in unchanged code that cannot receive inline comments:
Files Reviewed (35 files)
Fix these issues in Kilo Cloud Reviewed by grok-code-fast-1:optimized:free · 494,357 tokens |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
desktop/src/apps/BrowserApp/BrowserApp.tsx (1)
78-126:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRender
CapabilityPromptModalin the mobile branch as well.The modal is currently mounted only in the non-mobile path (Line 145). In mobile layout, capability-needed events have no prompt UI, so users cannot grant permissions there.
Proposed fix
if (isMobile) { return ( <div className="flex flex-col h-full bg-shell-bg overflow-hidden relative"> @@ <div className="flex items-center gap-1 px-2 py-1 bg-shell-surface border-t border-shell-border-subtle"> @@ </div> + <CapabilityPromptModal /> </div> ); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@desktop/src/apps/BrowserApp/BrowserApp.tsx` around lines 78 - 126, The mobile branch in BrowserApp (when isMobile is true) does not render the CapabilityPromptModal, so capability-needed events have no UI; update the mobile JSX path to also mount <CapabilityPromptModal ... /> (same props as used in the non-mobile branch) so the modal appears on mobile—locate the mobile return block in BrowserApp.tsx and add the CapabilityPromptModal component alongside WindowChooser/TabRenderer so it uses the same handlers/props as the desktop branch.desktop/src/apps/BrowserApp/SettingsPanel.tsx (1)
51-57:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPrevent Escape from closing both dialogs when the capabilities modal is open.
Line 53 always closes
SettingsPanel. WhencapsOpenis true, Escape also triggers the child modal close, so both layers dismiss at once.Proposed fix
useEffect(() => { const handler = (e: KeyboardEvent) => { - if (e.key === "Escape") onClose(); + if (e.key !== "Escape") return; + if (capsOpen) { + setCapsOpen(false); + return; + } + onClose(); }; window.addEventListener("keydown", handler); return () => window.removeEventListener("keydown", handler); - }, [onClose]); + }, [onClose, capsOpen]);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@desktop/src/apps/BrowserApp/SettingsPanel.tsx` around lines 51 - 57, The Escape key handler in SettingsPanel's useEffect always calls onClose, which lets Escape dismiss both the panel and a child capabilities modal; change the handler so it only calls onClose when capsOpen is false (i.e., if (e.key === "Escape" && !capsOpen) onClose()). Also add capsOpen to the useEffect dependency array so the handler sees updates to that flag; reference the useEffect block, the handler function, the capsOpen boolean, and onClose callback when making the change.
🧹 Nitpick comments (4)
desktop/src/apps/BrowserApp/AgentPresencePill.tsx (1)
152-167: 💤 Low valueConsider whether
animate-pingreads as "driving" or just looks busy.
animate-pingproduces a single outward-expanding fade, not a faster pulse — visually it can look like a notification ping rather than an intensified watching state. If the goal is "more emphatic than watching", a brighter colour with the sameanimate-pulse(or a slightly larger dot) usually communicates that more clearly. Purely cosmetic; safe to defer to design review.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@desktop/src/apps/BrowserApp/AgentPresencePill.tsx` around lines 152 - 167, The presence-dot span currently uses animate-ping for isDriving which reads like a notification ping; change it to use the same pulse animation with a brighter color and optionally a slightly larger size to convey "more emphatic than watching." Specifically, update the class array in the span (data-testid="presence-dot") so the isDriving branch uses "bg-green-300 animate-pulse" (or similar brighter bg class) and optionally "w-2.5 h-2.5" instead of animate-ping; leave the anyWatching branch as "bg-green-400 animate-pulse" and the idle branch unchanged.desktop/src/stores/browser-agent-store.test.ts (1)
408-415: 💤 Low valueTest name vs. scenario mismatch.
The case name says "multiple driving agents" but only
agent-bis driving (agent-ais set toidle). To genuinely exercise multi-driver behaviour, set both agents to"driving"and assert the result is one of the two:- s.setDrivingState("win-1", "tab-1", "agent-a", "idle"); + s.setDrivingState("win-1", "tab-1", "agent-a", "driving"); s.setDrivingState("win-1", "tab-1", "agent-b", "driving"); const result = useBrowserAgentStore.getState().isAnyDriving("win-1", "tab-1"); - expect(result).toBe("agent-b"); + expect(["agent-a", "agent-b"]).toContain(result);Otherwise the name is misleading — consider renaming to
"isAnyDriving picks the driving agent when others are idle".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@desktop/src/stores/browser-agent-store.test.ts` around lines 408 - 415, The test title and scenario disagree: update the test around useBrowserAgentStore to match intent — either rename the it() description from "isAnyDriving returns one of multiple driving agents" to "isAnyDriving picks the driving agent when others are idle" if you want to keep agent-a idle, or make both agents driving by calling s.setDrivingState("win-1","tab-1","agent-a","driving") and s.setDrivingState("win-1","tab-1","agent-b","driving") and change the assertion to accept either agent (e.g., expect(["agent-a","agent-b"]).toContain(useBrowserAgentStore.getState().isAnyDriving("win-1","tab-1"))); ensure you modify the test that uses useBrowserAgentStore, setDrivingState and isAnyDriving accordingly.tinyagentos/routes/desktop_browser/copilot_ws.py (1)
281-285: 💤 Low valueStale comment — ack handling is now in this PR.
The
# Drive ack events come in PR 7.comment predates this change; ack routing is wired below at lines 329–334. Worth dropping the parenthetical to avoid future-reader confusion.-# Allowed event kinds from iframe → server. Drive ack events come in PR 7. +# Allowed event kinds from iframe → server. 'ack' is routed back to the agent runtime.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tinyagentos/routes/desktop_browser/copilot_ws.py` around lines 281 - 285, The inline comment above _ALLOWED_EVENT_KINDS is stale; remove the parenthetical "Drive ack events come in PR 7." and update the comment to simply state that these are the allowed event kinds from iframe→server (e.g., replace the line starting with "# Allowed event kinds..." so it no longer references PR 7), since ack handling is already implemented in the copilot_ws routing (see _ALLOWED_EVENT_KINDS and the ack routing code).tests/routes/desktop_browser/test_copilot_agent_ws.py (1)
227-230: ⚡ Quick winReplace fixed sleeps with predicate/event waits to reduce flaky WS tests.
Hardcoded
sleep(0.05)makes these tests timing-sensitive under slow CI runners. A small wait helper that polls a concrete condition (e.g., mock await count) is more deterministic.♻️ Example pattern
+def _wait_until(predicate, timeout_s=1.0, interval_s=0.01): + import time + deadline = time.time() + timeout_s + while time.time() < deadline: + if predicate(): + return + time.sleep(interval_s) + raise AssertionError("condition not met before timeout") with ws_client.websocket_connect( f"/api/desktop/browser/copilot-agent?ticket={ticket}" ) as ws: ws.send_json(op_msg) - import time - time.sleep(0.05) + _wait_until(lambda: mock_iframe_ws.send_json.await_count == 1)Also applies to: 268-270, 305-307, 572-574, 626-628
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/routes/desktop_browser/test_copilot_agent_ws.py` around lines 227 - 230, Replace the fragile time.sleep(0.05) calls with a small polling wait helper (e.g., wait_for(predicate, timeout=1.0, interval=0.01)) that waits until a concrete condition is true; implement wait_for once in the test module and use it where the code currently calls time.sleep(0.05) by passing a predicate that checks the real signal (for example: lambda: my_mock.await_count >= expected or lambda: mock_ws.send.call_count > 0), and swap each time.sleep(0.05) occurrence (the literal time.sleep calls) to wait_for with an appropriate predicate and timeout.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@desktop/src/apps/BrowserApp/AgentCapabilitiesPanel.tsx`:
- Around line 33-44: Wrap the async load() with a request-version guard and
error handling: generate a local requestId (or capture current profileId) at
start of load(), run listCapabilities(profileId) and listAgents() inside
try/catch, and on success only call setGrants and setAgents if the captured
requestId/profileId still matches the latest one (or component is mounted) to
avoid stale overwrites; on error catch and handle the rejection (e.g., set an
error state or stop loading and log) so a rejected request doesn't leave the
panel stuck. Ensure the guard logic is coordinated with the existing
useEffect([profileId]) so each invocation of load() validates against the latest
profileId before mutating state.
In `@desktop/src/apps/BrowserApp/AnnotationLayer.tsx`:
- Around line 13-15: The marker ID and arrowhead fill are causing collisions and
wrong colors: replace the static MARKER_ID with a per-instance unique id (e.g.,
generate inside the AnnotationLayer component via React's useId or a uuid and
store as markerId) and update all references to use that markerId (both the
<marker id=...> and markerEnd="url(#...)" usages); also stop hardcoding the
arrowhead fill to DEFAULT_COLOR—use the annotation's color (falling back to
DEFAULT_COLOR) when rendering the <marker> so the arrowhead matches the stroke
color of the corresponding annotation. Ensure changes touch the
constants/variables around MARKER_ID, DEFAULT_COLOR and the AnnotationLayer
render logic that creates markers and sets markerEnd.
In `@desktop/src/apps/BrowserApp/CoPilotBanner.test.tsx`:
- Line 1: The test file imports vitest symbols as "import { describe, it,
expect, beforeEach, vi } from 'vitest';" but uses afterEach in the test body;
update the import to explicitly include afterEach so the file no longer relies
on globals. Locate the top-level import statement in CoPilotBanner.test.tsx and
add afterEach to the named imports from "vitest" (alongside describe, it,
expect, beforeEach, vi) so the test explicitly imports all used hooks.
In `@desktop/src/apps/BrowserApp/CoPilotBanner.tsx`:
- Around line 37-53: handleTakeBack currently aborts before flipping local
driving state if revokeCapability or unpinAgent throws; change it so the local
state is always cleared: call revokeCapability/unpinAgent inside a try/catch (or
try) and move useBrowserAgentStore.getState().setDrivingState(windowId, tabId,
agentId, "idle") into a finally block so it runs regardless of API failures;
keep removePinnedAgent and onTakeBack behavior but ensure setDrivingState
executes in finally (log errors from revoke/unpin rather than skipping the local
state flip).
In `@desktop/src/apps/BrowserApp/TabStrip.tsx`:
- Around line 104-107: Guard access to tab.pinnedAgentIds in the TabStrip
component's tabDriving logic to prevent crashes when older tab objects lack that
property: before iterating, ensure tab.pinnedAgentIds is an array (e.g.,
Array.isArray(tab.pinnedAgentIds) or default to []), then loop over it and check
s.drivingState[`${windowId}:${tab.id}:${aid}`] === "driving" as before and
return false if none match; update the useBrowserAgentStore selector that builds
tabDriving to use this safe fallback.
In `@tinyagentos/routes/desktop_browser/capability_routes.py`:
- Around line 44-50: The GrantRequest model accepts arbitrary expires_at strings
which later get ignored if malformed; update GrantRequest to validate expires_at
at grant time by either changing the field to datetime | None or adding a
pydantic `@validator` for expires_at that attempts to parse an ISO-8601 datetime
(or the expected format) and raises a validation error on failure; ensure the
validator normalizes the value (e.g., to timezone-aware UTC) and that any route
or function consuming GrantRequest (the grant handling code that reads
GrantRequest.expires_at) relies on the validated datetime type rather than raw
strings so invalid expiries are rejected upfront.
In `@tinyagentos/routes/desktop_browser/copilot_agent_ws.py`:
- Around line 41-53: The current _extract_host(msg) trusts agent-supplied
msg["host"], which must not be used for authorization; change the logic so
authorization uses an authoritative host value passed in from the server (e.g.
current_tab_host or validated_navigate_host) instead of msg["host"]. Modify
_extract_host to either accept an explicit trusted_host parameter (preferred) or
to only parse msg["url"] for display purposes (not auth), then update all
capability checks that call _extract_host (including the checks referenced
around the 91-101 region) to use the server-provided trusted host variable for
enforcement while continuing to optionally use msg.url/msg.host only for
UI/logging, not decision-making.
- Around line 85-90: The current per-op override of target_profile/target_tab
using target_profile = msg.get("profile_id", consumed.profile_id) and target_tab
= msg.get("tab_id", consumed.tab_id) bypasses the pin authorization that was
only checked for consumed.tab_id at connect time; update the message-handling
path to re-run the same pin/ownership check used at connect time whenever msg
contains a "profile_id" or "tab_id" override (i.e., detect when msg.get returns
a different value than consumed.profile_id/consumed.tab_id), and reject or
ignore the override (return error or revert to consumed values and log a
warning) unless that specific (profile, tab) is authorized for this agent;
ensure you reference the same authorization routine used at connect and apply it
to the computed target_profile/target_tab before proceeding with routing/drive
actions.
In `@tinyagentos/routes/desktop_browser/copilot.js`:
- Around line 85-95: The navigate function currently assigns untrusted args.url
directly to location.href; change it to validate and allow only http: and https:
schemes by parsing args.url (use the URL constructor with a base of
location.href to support relative URLs), check url.protocol is exactly 'http:'
or 'https:', and return an error for anything else; also wrap the URL parsing in
a try/catch to return { error: 'invalid url' } on parse failure before setting
location.href.
In `@tinyagentos/routes/desktop_browser/store.py`:
- Around line 689-707: Replace the read-then-delete loop with a single atomic
SQL delete: compute cutoff = (now -
timedelta(seconds=idle_timeout_s)).isoformat(), then run one await
self._db.execute("DELETE FROM drive_sessions WHERE last_op_at <= ?", (cutoff,))
instead of the initial SELECT and per-row DELETEs; commit, obtain the number of
deleted rows from the DELETE cursor (cursor.rowcount) and return that count.
Update references to datetime.fromisoformat/expired logic accordingly and remove
the per-row delete loop so freshly bumped sessions can't be deleted between
SELECT and DELETE.
---
Outside diff comments:
In `@desktop/src/apps/BrowserApp/BrowserApp.tsx`:
- Around line 78-126: The mobile branch in BrowserApp (when isMobile is true)
does not render the CapabilityPromptModal, so capability-needed events have no
UI; update the mobile JSX path to also mount <CapabilityPromptModal ... /> (same
props as used in the non-mobile branch) so the modal appears on mobile—locate
the mobile return block in BrowserApp.tsx and add the CapabilityPromptModal
component alongside WindowChooser/TabRenderer so it uses the same handlers/props
as the desktop branch.
In `@desktop/src/apps/BrowserApp/SettingsPanel.tsx`:
- Around line 51-57: The Escape key handler in SettingsPanel's useEffect always
calls onClose, which lets Escape dismiss both the panel and a child capabilities
modal; change the handler so it only calls onClose when capsOpen is false (i.e.,
if (e.key === "Escape" && !capsOpen) onClose()). Also add capsOpen to the
useEffect dependency array so the handler sees updates to that flag; reference
the useEffect block, the handler function, the capsOpen boolean, and onClose
callback when making the change.
---
Nitpick comments:
In `@desktop/src/apps/BrowserApp/AgentPresencePill.tsx`:
- Around line 152-167: The presence-dot span currently uses animate-ping for
isDriving which reads like a notification ping; change it to use the same pulse
animation with a brighter color and optionally a slightly larger size to convey
"more emphatic than watching." Specifically, update the class array in the span
(data-testid="presence-dot") so the isDriving branch uses "bg-green-300
animate-pulse" (or similar brighter bg class) and optionally "w-2.5 h-2.5"
instead of animate-ping; leave the anyWatching branch as "bg-green-400
animate-pulse" and the idle branch unchanged.
In `@desktop/src/stores/browser-agent-store.test.ts`:
- Around line 408-415: The test title and scenario disagree: update the test
around useBrowserAgentStore to match intent — either rename the it() description
from "isAnyDriving returns one of multiple driving agents" to "isAnyDriving
picks the driving agent when others are idle" if you want to keep agent-a idle,
or make both agents driving by calling
s.setDrivingState("win-1","tab-1","agent-a","driving") and
s.setDrivingState("win-1","tab-1","agent-b","driving") and change the assertion
to accept either agent (e.g.,
expect(["agent-a","agent-b"]).toContain(useBrowserAgentStore.getState().isAnyDriving("win-1","tab-1")));
ensure you modify the test that uses useBrowserAgentStore, setDrivingState and
isAnyDriving accordingly.
In `@tests/routes/desktop_browser/test_copilot_agent_ws.py`:
- Around line 227-230: Replace the fragile time.sleep(0.05) calls with a small
polling wait helper (e.g., wait_for(predicate, timeout=1.0, interval=0.01)) that
waits until a concrete condition is true; implement wait_for once in the test
module and use it where the code currently calls time.sleep(0.05) by passing a
predicate that checks the real signal (for example: lambda: my_mock.await_count
>= expected or lambda: mock_ws.send.call_count > 0), and swap each
time.sleep(0.05) occurrence (the literal time.sleep calls) to wait_for with an
appropriate predicate and timeout.
In `@tinyagentos/routes/desktop_browser/copilot_ws.py`:
- Around line 281-285: The inline comment above _ALLOWED_EVENT_KINDS is stale;
remove the parenthetical "Drive ack events come in PR 7." and update the comment
to simply state that these are the allowed event kinds from iframe→server (e.g.,
replace the line starting with "# Allowed event kinds..." so it no longer
references PR 7), since ack handling is already implemented in the copilot_ws
routing (see _ALLOWED_EVENT_KINDS and the ack routing code).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 94fc4a1f-5894-47c6-9dc2-71af43c5109c
📒 Files selected for processing (32)
desktop/src/apps/BrowserApp/AgentCapabilitiesPanel.test.tsxdesktop/src/apps/BrowserApp/AgentCapabilitiesPanel.tsxdesktop/src/apps/BrowserApp/AgentPresencePill.tsxdesktop/src/apps/BrowserApp/AnnotationLayer.test.tsxdesktop/src/apps/BrowserApp/AnnotationLayer.tsxdesktop/src/apps/BrowserApp/BrowserApp.tsxdesktop/src/apps/BrowserApp/CapabilityPromptModal.test.tsxdesktop/src/apps/BrowserApp/CapabilityPromptModal.tsxdesktop/src/apps/BrowserApp/Chrome.tsxdesktop/src/apps/BrowserApp/CoPilotBanner.test.tsxdesktop/src/apps/BrowserApp/CoPilotBanner.tsxdesktop/src/apps/BrowserApp/SettingsPanel.test.tsxdesktop/src/apps/BrowserApp/SettingsPanel.tsxdesktop/src/apps/BrowserApp/TabRenderer.tsxdesktop/src/apps/BrowserApp/TabStrip.tsxdesktop/src/apps/BrowserApp/agent-ws-bridge.test.tsdesktop/src/apps/BrowserApp/agent-ws-bridge.tsdesktop/src/lib/browser-capability-api.test.tsdesktop/src/lib/browser-capability-api.tsdesktop/src/stores/browser-agent-store.test.tsdesktop/src/stores/browser-agent-store.tstests/routes/desktop_browser/test_capability_routes.pytests/routes/desktop_browser/test_copilot_agent_ws.pytests/routes/desktop_browser/test_copilot_js.pytests/routes/desktop_browser/test_drive_sessions.pytinyagentos/routes/desktop_browser/__init__.pytinyagentos/routes/desktop_browser/capability_routes.pytinyagentos/routes/desktop_browser/copilot.jstinyagentos/routes/desktop_browser/copilot_agent_ws.pytinyagentos/routes/desktop_browser/copilot_ws.pytinyagentos/routes/desktop_browser/schema.pytinyagentos/routes/desktop_browser/store.py
…op pin re-check (CR critical)
| # Allow the agent to target a specific (profile, tab) per op via msg fields, | ||
| # falling back to the ticket-bound (profile, tab). For PR 7 the typical agent | ||
| # only operates on the ticket's tab; cross-tab ops are out of scope. | ||
| target_profile = msg.get("profile_id", consumed.profile_id) |
There was a problem hiding this comment.
WARNING: Missing input validation for target_profile and target_tab; agent could supply None or empty strings, potentially causing runtime errors in store operations.
| # Record authoritative current URL for this tab. Agent-side capability | ||
| # checks read from this rather than trusting agent-supplied msg["host"]. | ||
| request.app.state.copilot_hub.set_tab_url( | ||
| user_id=user_id, profile_id=profile_id, tab_id=tab_id, url=url, |
There was a problem hiding this comment.
CRITICAL: Logic bug - set_tab_url records the initial request URL instead of the final URL after redirects, causing capability checks to use incorrect host.
| # Using 0.6s so the test isn't flaky under heavy CI load (where the Python | ||
| # interpreter can pause for hundreds of ms between awaits — a tighter | ||
| # boundary would let the "fresh" session also slip past the cutoff). | ||
| await asyncio.sleep(0.6) |
There was a problem hiding this comment.
WARNING: Potential race condition in test timing; asyncio.sleep(0.6) may not guarantee expiration under heavy system load, potentially causing intermittent test failures despite the comment's attempt to address CI flakiness.
| # Now create the fresh session AFTER the sleep so its last_op_at is | ||
| # guaranteed within the timeout window when we prune. | ||
| await store.start_drive_session( | ||
| user_id="u1", profile_id="p1", tab_id="t1", agent_id="fresh-agent", |
There was a problem hiding this comment.
SUGGESTION: Edge case not handled; if start_drive_session fails for the fresh session, the test may pass incorrectly assuming prune removes only one, but missing error handling could mask issues.
|
@coderabbitai full review — please re-review at |
|
Tip For best results, initiate chat on the files or code changes.
[full_review] |
|
@coderabbitai full review — please re-review at |
|
Tip For best results, initiate chat on the files or code changes.
Results will be posted shortly. [full_review] |
Summary
The "write half" of agent integration on top of PR 6's read half. Agents can now drive (scrollTo / click / type / focus / navigate) and annotate (highlight, sticky, parent-overlay cursor + arrows). Privileged ops (drive / navigate / see_cookies) require explicit user grant via a modal that pops on first use; grants are revocable in Settings. The driving state has obvious chrome — green banner with Pause/Take back, tinted iframe, tab-strip wash, brighter pulsing presence pill.
Architecture
/api/desktop/browser/copilot-agent?ticket=…— agent runtime connects, sends ops, receives acks. Same ticket store as the iframe side._agent_connsregistry +route_op_to_iframe(server→iframe) +route_ack_to_agent(iframe→agent) +notify_capability_needed(server→iframe→parent).check_capability(host, permission); on miss → deny agent + notify iframe so the modal pops.last_op_atfor the 30s idle timeout.is_drivingis the source of truth for "is this agent currently driving on this tab".driving-statevia postMessage; agent-ws-bridge sets localdrivingState+ arms a 30s decay timer. Chrome reads from store: CoPilotBanner, iframe tint, tab-strip wash, pill pulse-faster.data-taos-annotation-idforclearto find/remove. Parent-overlay (cursor, arrows) drawn by<AnnotationLayer />SVG withpointer-events: none.Files (24 changed, ~3500 lines)
Backend:
drive_sessionsschema + 5 store methods + 3 capability wrappers,copilot_agent_ws.py,capability_routes.py(GET/POST/DELETE), CopilotHub extensions,copilot.js(5 drive ops + 5 annotation ops + driving-state forwarding).Frontend:
CapabilityPromptModal,AgentCapabilitiesPanel,AnnotationLayer,CoPilotBanner,browser-capability-api.ts. Agent store extensions:drivingState+isAnyDriving+ annotation actions. agent-ws-bridge handlesdriving-stateandcapability-neededevents with 30s decay. Chrome / TabRenderer / TabStrip / AgentPresencePill render the new chrome states.Test plan
pytest tests/routes/desktop_browser/ -q→ 346 passedcd desktop && npx vitest run→ 692 passed across 102 filescd desktop && npx tsc --noEmit→ zero errorsAtomic invariants
drive_sessions.last_op_at+is_driving) and client-side (agent-ws-bridgesetTimeout). Server is source of truth.(user, agent_id)agent connections +(user, profile, tab, agent)iframe connections in the same hub. PR 6's no-clobber design preserved.Deferred to follow-up brainstorms
Spec / plan references
Summary by CodeRabbit