From e13de67490eea8a0f4dc688fac301e1e6ff66847 Mon Sep 17 00:00:00 2001 From: Bulat Skill7 Date: Sun, 8 Mar 2026 15:43:40 +0000 Subject: [PATCH 01/11] docs: map codebase with parallel agents MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 7 documents across tech, architecture, quality, and concerns: - STACK.md (169 lines) — technology stack analysis - INTEGRATIONS.md (300 lines) — external integrations - ARCHITECTURE.md (302 lines) — system architecture - STRUCTURE.md (371 lines) — directory and module structure - CONVENTIONS.md (193 lines) — coding conventions - TESTING.md (425 lines) — test infrastructure - CONCERNS.md (206 lines) — technical debt and risks Co-Authored-By: Claude Opus 4.6 --- .planning/codebase/ARCHITECTURE.md | 302 ++++++++++++++++++++ .planning/codebase/CONCERNS.md | 206 ++++++++++++++ .planning/codebase/CONVENTIONS.md | 193 +++++++++++++ .planning/codebase/INTEGRATIONS.md | 300 ++++++++++++++++++++ .planning/codebase/STACK.md | 169 ++++++++++++ .planning/codebase/STRUCTURE.md | 371 +++++++++++++++++++++++++ .planning/codebase/TESTING.md | 425 +++++++++++++++++++++++++++++ 7 files changed, 1966 insertions(+) create mode 100644 .planning/codebase/ARCHITECTURE.md create mode 100644 .planning/codebase/CONCERNS.md create mode 100644 .planning/codebase/CONVENTIONS.md create mode 100644 .planning/codebase/INTEGRATIONS.md create mode 100644 .planning/codebase/STACK.md create mode 100644 .planning/codebase/STRUCTURE.md create mode 100644 .planning/codebase/TESTING.md diff --git a/.planning/codebase/ARCHITECTURE.md b/.planning/codebase/ARCHITECTURE.md new file mode 100644 index 0000000..501cff1 --- /dev/null +++ b/.planning/codebase/ARCHITECTURE.md @@ -0,0 +1,302 @@ +# Architecture + +**Analysis Date:** 2026-03-08 + +## Pattern Overview + +**Overall:** Headless AI Agent Engine with Event-Driven Architecture + +The CLI is a headless execution engine for AI agent skills. 
It uses a layered namespace-based architecture with an event bus for cross-component communication. The core loop is: user prompt -> session -> LLM stream -> tool execution -> response. Multiple project instances can run concurrently via `AsyncLocalStorage` context isolation. + +**Key Characteristics:** +- TypeScript namespace modules (`export namespace Foo {}`) as the primary code organization pattern +- Event bus (pub/sub) for decoupled communication between components +- `AsyncLocalStorage`-based context for per-project instance isolation +- Lazy-loaded AI provider SDKs via dynamic imports +- SQLite (bun:sqlite + Drizzle ORM) for persistent storage +- Zod schemas for all data types, used for both validation and OpenAPI generation +- Plugin system with lifecycle hooks for extensibility + +## Layers + +**CLI Layer (Entry & Commands):** +- Purpose: Parse CLI arguments, dispatch to handlers +- Location: `packages/cli/src/cli/` +- Contains: Yargs command definitions, UI helpers, bootstrap logic +- Depends on: Session, Agent, Provider, Config +- Used by: End users via `aictrl` binary +- Key files: + - `packages/cli/src/index.ts` - Main entry point, yargs setup, command registration + - `packages/cli/src/cli/cmd/run.ts` - Primary `run` command (send message, stream events) + - `packages/cli/src/cli/cmd/cmd.ts` - Command helper wrapper + - `packages/cli/src/cli/bootstrap.ts` - Instance bootstrap wrapper + - `packages/cli/src/cli/ui.ts` - Terminal output helpers + +**Project/Instance Layer:** +- Purpose: Per-project context isolation and lifecycle management +- Location: `packages/cli/src/project/` +- Contains: Project detection (git), instance context, VCS integration, bootstrap +- Depends on: Database, Global, Config +- Used by: All layers that need project-scoped state +- Key files: + - `packages/cli/src/project/instance.ts` - `Instance.provide()` creates AsyncLocalStorage context; `Instance.state()` creates per-instance memoized state + - 
`packages/cli/src/project/bootstrap.ts` - `InstanceBootstrap()` initializes plugins, LSP, file watcher, VCS, snapshots, truncation + - `packages/cli/src/project/project.ts` - Project detection from directory (git worktree resolution) + - `packages/cli/src/project/vcs.ts` - VCS integration + +**Session Layer:** +- Purpose: Manage AI chat sessions, messages, parts, and the LLM interaction loop +- Location: `packages/cli/src/session/` +- Contains: Session CRUD, message/part storage, prompt construction, LLM streaming, compaction, retry, status +- Depends on: Database, Bus, Agent, Provider, Tool, Permission, Config +- Used by: CLI commands, SDK server +- Key files: + - `packages/cli/src/session/index.ts` - Session namespace: CRUD operations, message/part management, cost calculation + - `packages/cli/src/session/prompt.ts` - `SessionPrompt.prompt()` and `SessionPrompt.command()` - orchestrates the full LLM loop + - `packages/cli/src/session/llm.ts` - `LLM.stream()` - wraps Vercel AI SDK `streamText()` with system prompts, tool config, provider transforms + - `packages/cli/src/session/processor.ts` - `SessionProcessor` - processes LLM stream events (tool calls, text, reasoning, step boundaries) + - `packages/cli/src/session/system.ts` - `SystemPrompt` - provider-specific system prompts (anthropic, gemini, codex, etc.) 
+ - `packages/cli/src/session/instruction.ts` - Instruction loading (AGENTS.md, config markdown) + - `packages/cli/src/session/compaction.ts` - Context window compaction + - `packages/cli/src/session/status.ts` - Session status tracking (idle/busy/retry) + - `packages/cli/src/session/message-v2.ts` - Message and Part type definitions + - `packages/cli/src/session/session.sql.ts` - Drizzle schema for sessions, messages, parts, todos, permissions + +**Agent Layer:** +- Purpose: Define AI agent configurations (permissions, prompts, models) +- Location: `packages/cli/src/agent/` +- Contains: Built-in agents (build, plan, general, explore, compaction, title, summary), custom agent loading from config +- Depends on: Config, Permission, Provider, Plugin, Skill +- Used by: Session layer when starting LLM interactions +- Key files: + - `packages/cli/src/agent/agent.ts` - Agent namespace: defines built-in agents, merges config overrides, agent generation via LLM + - `packages/cli/src/agent/prompt/` - Agent-specific prompt templates (.txt files) + +**Provider Layer:** +- Purpose: Abstract AI model providers (Anthropic, OpenAI, Google, etc.) 
+- Location: `packages/cli/src/provider/` +- Contains: Provider/model resolution, lazy-loaded SDK factories, custom loaders (Copilot, Bedrock), model transforms +- Depends on: Config, Auth, ModelsDev, Plugin +- Used by: Session/LLM layer +- Key files: + - `packages/cli/src/provider/provider.ts` - Provider namespace: `BUNDLED_PROVIDERS` map (20+ providers), `getLanguage()`, `getModel()`, `defaultModel()` + - `packages/cli/src/provider/transform.ts` - `ProviderTransform` - provider-specific options (max tokens, caching, structured output) + - `packages/cli/src/provider/models.ts` - `ModelsDev` - fetches model catalog from models.dev + - `packages/cli/src/provider/sdk/copilot/` - GitHub Copilot provider implementation + +**Tool Layer:** +- Purpose: Define tools available to AI agents +- Location: `packages/cli/src/tool/` +- Contains: Built-in tools (bash, read, write, edit, glob, grep, task, skill, etc.), tool registry, truncation +- Depends on: Permission, Plugin, Config, Instance +- Used by: Session/Processor when executing tool calls +- Key files: + - `packages/cli/src/tool/tool.ts` - `Tool.define()` factory, `Tool.Info` interface, `Tool.Context` type + - `packages/cli/src/tool/registry.ts` - `ToolRegistry` - loads built-in tools, custom tools from `.aictrl/tool/`, plugin tools + - `packages/cli/src/tool/bash.ts` - Shell command execution + - `packages/cli/src/tool/edit.ts` - File editing (diff-based) + - `packages/cli/src/tool/write.ts` - File writing + - `packages/cli/src/tool/read.ts` - File reading + - `packages/cli/src/tool/task.ts` - Subagent spawning + - `packages/cli/src/tool/skill.ts` - Skill invocation + - `packages/cli/src/tool/truncation.ts` - Output truncation for large tool results + +**Event Bus Layer:** +- Purpose: Decoupled pub/sub communication between components +- Location: `packages/cli/src/bus/` +- Contains: Per-instance Bus, global EventEmitter (GlobalBus), typed event definitions +- Depends on: Instance (per-instance subscriptions) +- Used 
by: All layers publish/subscribe events +- Key files: + - `packages/cli/src/bus/index.ts` - `Bus.publish()`, `Bus.subscribe()`, `Bus.subscribeAll()` - per-instance event bus + - `packages/cli/src/bus/global.ts` - `GlobalBus` - cross-instance EventEmitter, bridges events to SDK/server + - `packages/cli/src/bus/bus-event.ts` - `BusEvent.define()` - typed event definition factory + +**Permission Layer:** +- Purpose: Control tool access per agent/session with pattern-based rules +- Location: `packages/cli/src/permission/` +- Contains: Ruleset merging, wildcard pattern matching, permission requests/replies +- Depends on: Database, Config, Bus +- Used by: Session/Processor before tool execution +- Key files: + - `packages/cli/src/permission/next.ts` - `PermissionNext` - rule evaluation, `fromConfig()`, `merge()`, ask/reply flow + +**Config Layer:** +- Purpose: Load and merge configuration from multiple sources +- Location: `packages/cli/src/config/` +- Contains: Config loading (global, project, managed, inline), markdown-based config, config paths +- Depends on: Instance, Auth, Global +- Used by: All layers +- Key files: + - `packages/cli/src/config/config.ts` - `Config.get()` - merges configs from 6+ sources (well-known, global, custom, project, .aictrl dirs, inline, managed) + - `packages/cli/src/config/markdown.ts` - Markdown frontmatter parser for agents/commands/skills + - `packages/cli/src/config/paths.ts` - Config file path resolution + +**Storage Layer:** +- Purpose: SQLite database and file-based storage +- Location: `packages/cli/src/storage/` +- Contains: Database client (bun:sqlite + Drizzle), migrations, JSON-to-SQLite migration, file storage +- Depends on: Global (data directory) +- Used by: Session, Project, Control, Permission +- Key files: + - `packages/cli/src/storage/db.ts` - `Database.Client()` - lazy SQLite init with WAL mode, `Database.use()` for synchronized access + - `packages/cli/src/storage/schema.ts` - Re-exports all SQL table definitions + - 
`packages/cli/src/storage/schema.sql.ts` - Shared schema helpers (Timestamps) + - `packages/cli/src/storage/json-migration.ts` - One-time migration from JSON files to SQLite + - `packages/cli/src/storage/storage.ts` - File-based storage (legacy) + +**Plugin Layer:** +- Purpose: Extensibility via external plugins and built-in auth plugins +- Location: `packages/cli/src/plugin/` +- Contains: Plugin loading (npm packages, local files), hook trigger system, built-in Codex/Copilot auth plugins +- Depends on: Config, Instance, BunProc +- Used by: Tool registry, Session, Provider +- Key files: + - `packages/cli/src/plugin/index.ts` - `Plugin.init()`, `Plugin.trigger()`, `Plugin.list()` - load and invoke plugin hooks + +**Skill Layer:** +- Purpose: Markdown-based reusable agent skills (SKILL.md files) +- Location: `packages/cli/src/skill/` +- Contains: Skill discovery from .aictrl/skills/, .claude/skills/, .agents/skills/, global skills +- Depends on: Config, Instance, ConfigMarkdown +- Used by: Agent layer (skill directories for permissions), Tool layer (SkillTool) +- Key files: + - `packages/cli/src/skill/skill.ts` - `Skill.state()` discovers and loads SKILL.md files + - `packages/cli/src/skill/discovery.ts` - Skill file discovery logic + +**MCP Layer:** +- Purpose: Model Context Protocol client integration +- Location: `packages/cli/src/mcp/` +- Contains: MCP client connections (stdio, SSE, HTTP), tool bridging, OAuth support +- Depends on: Config, Instance +- Used by: Session layer (additional tools from MCP servers) +- Key files: + - `packages/cli/src/mcp/index.ts` - `MCP` namespace - client lifecycle, tool listing, resource access + +**ACP Layer:** +- Purpose: Agent Client Protocol server implementation +- Location: `packages/cli/src/acp/` +- Contains: ACP server, session management for ACP clients +- Depends on: SDK, Session +- Key files: + - `packages/cli/src/acp/agent.ts` - ACP agent implementation + - `packages/cli/src/acp/session.ts` - ACP session management + - 
`packages/cli/src/acp/types.ts` - ACP type definitions + +## Data Flow + +**User Prompt to AI Response:** + +1. User invokes `aictrl run "message"` -> `packages/cli/src/index.ts` parses args +2. `RunCommand` handler calls `bootstrap()` -> `Instance.provide()` creates project context +3. `InstanceBootstrap()` initializes plugins, LSP, file watcher, VCS, snapshots +4. `Session.createNext()` creates session in SQLite, publishes `Session.Event.Created` +5. `SessionPrompt.prompt()` builds user message, resolves agent + model +6. `LLM.stream()` constructs system prompts, loads tools from `ToolRegistry`, calls Vercel AI SDK `streamText()` +7. `SessionProcessor.process()` iterates the LLM stream: + - Text parts -> stored as `TextPart` in DB, published via Bus + - Tool calls -> permission check -> `tool.execute()` -> result stored as `ToolPart` + - Step boundaries -> triggers next iteration if more tool calls needed +8. On completion, `SessionStatus.set(sessionID, { type: "idle" })` published +9. `RunCommand` event loop receives `session.status` idle event, exits + +**Event Flow (Bus Architecture):** + +1. Components publish typed events via `Bus.publish(EventDef, properties)` +2. Per-instance subscriptions receive matching events +3. `GlobalBus.emit("event", payload)` bridges to cross-instance listeners (SDK server, CLI event loop) +4. 
SDK server streams events to connected clients via SSE + +**State Management:** +- Per-instance state via `Instance.state(() => initializer)` - memoized per project directory +- Global state via `Global.Path` (XDG directories) +- Database state via `Database.use(db => ...)` - synchronous SQLite access +- Context propagation via `AsyncLocalStorage` (Node.js `async_hooks`) + +## Key Abstractions + +**Instance (Project Context):** +- Purpose: Isolates state per project directory using AsyncLocalStorage +- Implementation: `packages/cli/src/project/instance.ts` +- Pattern: `Instance.provide({ directory, init, fn })` creates/reuses context, `Instance.state()` creates memoized per-instance singletons +- All state-bearing namespaces use `Instance.state()` for isolation + +**Tool.Info:** +- Purpose: Defines a tool available to AI agents +- Implementation: `packages/cli/src/tool/tool.ts` +- Pattern: `Tool.define(id, async (initCtx) => ({ description, parameters: zodSchema, execute: async (args, ctx) => result }))` +- Tools receive `Tool.Context` with sessionID, messageID, abort signal, permission asking + +**BusEvent.Definition:** +- Purpose: Typed event definitions for the pub/sub bus +- Implementation: `packages/cli/src/bus/bus-event.ts` +- Pattern: `BusEvent.define("event.type", z.object({ ... 
}))` returns typed definition for `Bus.publish()` / `Bus.subscribe()` + +**fn (Validated Functions):** +- Purpose: Runtime-validated function inputs using Zod +- Implementation: `packages/cli/src/util/fn.ts` +- Pattern: `const myFn = fn(zodSchema, async (validatedInput) => result)` - used extensively in Session namespace + +**Plugin Hooks:** +- Purpose: Named extension points for plugins to modify behavior +- Implementation: `packages/plugin/src/index.ts` (types), `packages/cli/src/plugin/index.ts` (trigger) +- Pattern: `Plugin.trigger("hook.name", inputContext, mutableOutput)` - plugins can modify the output object +- 15+ hook points: chat.message, chat.params, tool.execute.before/after, permission.ask, etc. + +## Entry Points + +**CLI Entry (`aictrl` command):** +- Location: `packages/cli/src/index.ts` +- Triggers: User runs `aictrl [args]` +- Responsibilities: Yargs setup, logging init, database migration check, command dispatch + +**Bootstrap:** +- Location: `packages/cli/src/cli/bootstrap.ts` +- Triggers: Called by commands that need a project context +- Responsibilities: Creates Instance context, runs InstanceBootstrap, executes callback + +**SDK Entry:** +- Location: `packages/sdk/src/index.ts` +- Triggers: External code creates `createOpencode()` or `createAictrlClient()` +- Responsibilities: Spawns aictrl server process, provides typed client + +**Plugin Entry:** +- Location: `packages/plugin/src/index.ts` +- Triggers: External plugins implement the Plugin interface +- Responsibilities: Defines plugin types, hook interfaces, tool definition types + +## Error Handling + +**Strategy:** Named errors with structured data, wrapped in Zod-validated types + +**Patterns:** +- `NamedError.create("ErrorName", zodSchema)` creates typed error classes (`packages/util/src/error.ts`) +- `Session.BusyError` for concurrent session access +- `NotFoundError` for missing database records +- `PermissionNext` deny results in tool call failure message to LLM +- Top-level 
`process.on("unhandledRejection")` and `process.on("uncaughtException")` handlers in `packages/cli/src/index.ts` +- `FormatError()` converts errors to user-friendly messages +- LLM retries handled by `SessionRetry` with exponential backoff + +## Cross-Cutting Concerns + +**Logging:** `Log.create({ service: "name" })` creates namespaced loggers; logs written to `~/.local/share/aictrl/log/`; configurable via `--log-level` and `--print-logs` flags + +**Validation:** Zod schemas for all data types; `fn()` wrapper for runtime input validation; tool parameter validation in `Tool.define()` + +**Authentication:** File-based auth storage (`~/.local/share/aictrl/auth.json`); supports OAuth, API key, and well-known auth types; per-provider auth via `Auth.get(providerID)` + +**Configuration Precedence (low to high):** +1. Remote `.well-known/aictrl` (org defaults) +2. Global config (`~/.config/aictrl/aictrl.json{,c}`) +3. Custom config (`AICTRL_CONFIG` env var) +4. Project config (`aictrl.json{,c}` in project root) +5. `.aictrl` directories (agents, commands, plugins, tools) +6. Inline config (`AICTRL_CONFIG_CONTENT` env var) +7. Managed config directory (enterprise, `/etc/aictrl`) + +**Feature Flags:** `packages/cli/src/flag/flag.ts` - environment variable-based flags for experimental features (`AICTRL_EXPERIMENTAL_*`, `AICTRL_ENABLE_*`, `AICTRL_DISABLE_*`) + +--- + +*Architecture analysis: 2026-03-08* diff --git a/.planning/codebase/CONCERNS.md b/.planning/codebase/CONCERNS.md new file mode 100644 index 0000000..8a3ba62 --- /dev/null +++ b/.planning/codebase/CONCERNS.md @@ -0,0 +1,206 @@ +# Codebase Concerns + +**Analysis Date:** 2026-03-08 + +## Tech Debt + +**Env API vs process.env Inconsistency:** +- Issue: The `Env` namespace (`packages/cli/src/env/index.ts`) creates a shallow copy of `process.env` for isolation, but multiple provider loaders bypass it and write to `process.env` directly because `Env.set` does not propagate to the real environment. 
+- Files: `packages/cli/src/provider/provider.ts` (lines 208-214, 414-420), `packages/cli/src/env/index.ts` +- Impact: Provider auth keys (AWS_BEARER_TOKEN_BEDROCK, AICORE_SERVICE_KEY) are set on `process.env` directly, defeating the isolation model. Parallel test runs or multi-instance scenarios could leak credentials across contexts. +- Fix approach: Either make `Env.set` write through to `process.env` (with explicit documentation that it's intentional), or refactor provider loaders to pass auth via SDK options rather than environment variables. + +**Copilot SDK Fork (2,500+ lines):** +- Issue: The project maintains a full fork of two OpenAI-compatible language model implementations inside `packages/cli/src/provider/sdk/copilot/`. These files total 2,512 lines and contain TODO comments about lost type safety. +- Files: `packages/cli/src/provider/sdk/copilot/responses/openai-responses-language-model.ts` (1,732 lines), `packages/cli/src/provider/sdk/copilot/chat/openai-compatible-chat-language-model.ts` (780 lines) +- Impact: Difficult to keep in sync with upstream AI SDK changes. The comment at line 374 says "TODO we lost type safety on Chunk, most likely due to the error schema. MUST FIX" indicating known regression. +- Fix approach: Track upstream AI SDK releases, evaluate whether custom copilot provider can be upstreamed or reduced. At minimum, restore type safety on the Chunk type. + +**Plugin System Type Safety Workarounds:** +- Issue: The plugin hook trigger function has a `@ts-expect-error` with a "try-counter: 2" comment, indicating repeated failed attempts to fix typing. Plugin config hook also uses `@ts-expect-error` for SDK v1/v2 mismatch. +- Files: `packages/cli/src/plugin/index.ts` (lines 111-113, 127-128) +- Impact: Plugin hooks execute with no compile-time type checking. Bugs in hook signatures will only surface at runtime. +- Fix approach: Migrate plugin system to SDK v2 types. 
Define a proper generic type for the trigger function that preserves input/output typing across all hook names. + +**Deprecated Config Fields Still Active:** +- Issue: Multiple config schema fields are marked `@deprecated` but still parsed and handled in runtime code, creating dual code paths. +- Files: `packages/cli/src/config/config.ts` (lines 676, 698, 1010, 1043, 1129) +- Impact: New contributors may use deprecated fields. Dual code paths increase maintenance burden and risk of behavior divergence. +- Fix approach: Add deprecation warnings at runtime when deprecated fields are used. Set a removal timeline and migrate in a major version. + +**Pervasive `as any` Type Casts:** +- Issue: Over 40 `as any` casts across the CLI source, concentrated in provider loading, session prompts, storage migration, and bus events. +- Files: `packages/cli/src/provider/provider.ts` (lines 87, 91, 926, 939), `packages/cli/src/session/prompt.ts` (lines 811, 813, 957, 959), `packages/cli/src/storage/json-migration.ts` (lines 154, 188, 314, 351, 376), `packages/cli/src/acp/agent.ts` (lines 173, 1669) +- Impact: Eliminates compile-time safety at critical boundaries (provider initialization, tool schema handling, data migration). Bugs hide until runtime. +- Fix approach: Prioritize removing `as any` from provider loading (most impactful) and session prompt tool registration. The migration file is a one-shot operation and lower priority. + +**Dead/Scrap Code:** +- Issue: `packages/cli/src/util/scrap.ts` contains dummy placeholder functions (`foo`, `bar`, `dummyFunction`, `randomHelper`) with no apparent consumers. +- Files: `packages/cli/src/util/scrap.ts` +- Impact: Confusing for contributors, pollutes the codebase. +- Fix approach: Delete the file. + +**Reasoning Token Cost Approximation:** +- Issue: Reasoning tokens are charged at the output token rate because models.dev lacks a distinct reasoning token price field. 
+- Files: `packages/cli/src/session/index.ts` (lines 858-860) +- Impact: Cost calculations are inaccurate for reasoning-heavy models (o3, Claude with extended thinking). Users see incorrect cost estimates. +- Fix approach: Add a `reasoning` cost field to the models.dev schema and consume it in the cost calculation. + +**Disabled Directory Tree in System Prompt:** +- Issue: The environment section of the system prompt contains a dead code path: `project.vcs === "git" && false` means the directory tree is never included. +- Files: `packages/cli/src/session/system.ts` (line 48) +- Impact: The directory-tree XML tag in the environment section is always empty, wasting tokens on empty markup. Either enable the feature or remove the dead code. +- Fix approach: Remove the `&& false` guard and the empty directory-tree block, or implement the tree listing properly. + +**Permission Ruleset Not Persisted:** +- Issue: The permission "always allow" ruleset is computed at runtime but never saved to disk. Commented-out code shows the intent to persist via DB insert. +- Files: `packages/cli/src/permission/next.ts` (lines 227-229) +- Impact: Users who grant "always allow" permissions lose them on restart. Permissions must be re-granted each session. +- Fix approach: Implement the DB persistence (the commented code is almost complete) and add a UI for managing saved rules. + +**Context Overflow Error Not Handled:** +- Issue: When a context overflow error occurs during processing, there is a TODO comment with no implementation. +- Files: `packages/cli/src/session/processor.ts` (line 357) +- Impact: Context overflow errors fall through to the generic retry logic rather than triggering compaction or a user-facing message. This can cause repeated failed retries on an inherently unrecoverable error. +- Fix approach: Trigger `SessionCompaction` when context overflow is detected, or surface a clear error to the user suggesting they start a new session.
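The fix approach for the context overflow item could be sketched roughly as follows. This is a minimal illustration, not the project's code: `Recovery`, `isContextOverflow`, `planRecovery`, and the message patterns are all hypothetical names and heuristics, and the real `SessionProcessor`, `SessionRetry`, and `SessionCompaction` interfaces may look quite different. The idea is only that overflow errors are classified before the generic retry path sees them.

```typescript
// Hypothetical sketch: none of these names come from the codebase.
type Recovery =
  | { kind: "compact" }
  | { kind: "retry"; delayMs: number }
  | { kind: "fail"; message: string };

// Assumed heuristic: providers word overflow errors differently, so match loosely.
const OVERFLOW_PATTERNS = [/context length/i, /context window/i, /too many tokens/i];

function isContextOverflow(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return OVERFLOW_PATTERNS.some((pattern) => pattern.test(message));
}

function planRecovery(
  error: unknown,
  attempt: number,
  alreadyCompacted: boolean,
  maxRetries = 3,
): Recovery {
  if (isContextOverflow(error)) {
    // Retrying the same oversized request cannot succeed, so compact once,
    // then give up with a clear message instead of looping.
    return alreadyCompacted
      ? { kind: "fail", message: "Context is still too large after compaction; start a new session." }
      : { kind: "compact" };
  }
  if (attempt >= maxRetries) return { kind: "fail", message: "Retries exhausted." };
  // Exponential backoff for errors that may be transient.
  return { kind: "retry", delayMs: 1000 * 2 ** attempt };
}
```

A caller would switch on `kind`: run compaction and re-issue the request for `compact`, sleep and retry for `retry`, and surface `message` to the user for `fail`.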
+ +## Security Considerations + +**Symlink Path Traversal:** +- Risk: The `Filesystem.contains` function (`packages/cli/src/util/filesystem.ts`, line 134) performs lexical-only path comparison using `path.relative()`. Symlinks inside the project directory can point to files outside it, bypassing the containment check. +- Files: `packages/cli/src/util/filesystem.ts` (line 134), `packages/cli/src/file/index.ts` (lines 499-502, 575-578), `packages/cli/src/project/instance.ts` (lines 60, 64) +- Current mitigation: The TODO comments acknowledge the issue. On Linux/macOS, the check prevents basic `../` traversal. +- Recommendations: Use `fs.realpath()` to canonicalize paths before comparison. On Windows, also normalize drive letter casing. Add test cases for symlink traversal scenarios. + +**Direct process.env Credential Mutation:** +- Risk: Provider loaders write API keys directly into `process.env` (e.g., `process.env.AWS_BEARER_TOKEN_BEDROCK = auth.key`). These persist for the lifetime of the process and could be read by child processes (bash tool, MCP servers). +- Files: `packages/cli/src/provider/provider.ts` (lines 214, 420) +- Current mitigation: None beyond the Env isolation layer, which is bypassed. +- Recommendations: Pass credentials via SDK constructor options rather than environment variables. If env vars are required by the SDK, set them only for the duration of the SDK call and unset afterward. + +**MCP Server Process Tree Cleanup:** +- Risk: MCP server cleanup accesses `(client.transport as any)?.pid` to kill descendant processes. The `as any` cast means this may silently fail if the transport changes shape, leaving orphaned processes. +- Files: `packages/cli/src/mcp/index.ts` (lines 226-232) +- Current mitigation: Wrapped in try/catch, but failure is silent. +- Recommendations: Type the transport properly or add logging when pid is unavailable. Consider using process groups for reliable cleanup. 
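The `fs.realpath()` recommendation from the Symlink Path Traversal item could look something like the sketch below. It is an illustration under stated assumptions, not the project's `Filesystem.contains`: `resolveExisting` and `containsReal` are hypothetical names, and the fallback for not-yet-existing paths is one possible design choice.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: canonicalize a path even if its tail does not exist yet.
function resolveExisting(target: string): string {
  let current = path.resolve(target);
  const tail: string[] = [];
  for (;;) {
    try {
      // realpathSync follows symlinks; it throws for paths that do not exist.
      return path.join(fs.realpathSync(current), ...tail);
    } catch {
      const parent = path.dirname(current);
      if (parent === current) return path.join(current, ...tail);
      // Walk up to the nearest existing ancestor and re-append the remainder.
      tail.unshift(path.basename(current));
      current = parent;
    }
  }
}

// Hypothetical replacement for a lexical-only containment check.
function containsReal(root: string, target: string): boolean {
  const rel = path.relative(resolveExisting(root), resolveExisting(target));
  if (rel === "") return true; // target is the root itself
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  // An absolute result means a different drive on Windows.
  return !escapes && !path.isAbsolute(rel);
}
```

With a symlink `project/link` pointing at a sibling directory, a lexical check on `project/link/secret.txt` yields the relative path `link/secret.txt` and passes, whereas the canonicalized comparison sees `../outside/secret.txt` and rejects it. The sketch does not handle dangling symlinks or races between the check and the later file operation.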
+ +## Performance Bottlenecks + +**Synchronous File Operations in Hot Paths:** +- Problem: Several modules use synchronous filesystem APIs (`readFileSync`, `existsSync`, `statSync`, `readdirSync`) in paths that could be async. +- Files: `packages/cli/src/storage/db.ts` (lines 54, 61, 63 - migration loading), `packages/cli/src/util/filesystem.ts` (lines 13, 18, 24-25), `packages/cli/src/config/config.ts` (lines 185, 299), `packages/cli/src/patch/index.ts` (line 315) +- Cause: Historical convenience. The DB migration loader reads all migration files synchronously at startup. +- Improvement path: For `Filesystem.exists/isDir/stat` in the utility module, these are deliberately sync for simplicity. For `storage/db.ts`, consider reading migrations asynchronously during startup. For `patch/index.ts`, the `readFileSync` at line 315 is in a hot path during file editing and should be async. + +**models.dev Fetch on Module Load:** +- Problem: `packages/cli/src/provider/models.ts` (lines 124-132) fires a `ModelsDev.refresh()` fetch at module import time (top-level side effect) and sets up a 1-hour `setInterval`. This blocks or delays startup if the network is slow. +- Files: `packages/cli/src/provider/models.ts` (lines 124-132) +- Cause: Eager refresh to ensure fresh model data. +- Improvement path: The fetch is fire-and-forget (not awaited), but network timeouts (10s) can still impact startup perception. Consider deferring the initial fetch until after the first user interaction, or reducing the timeout for the initial fetch. + +**Levenshtein Distance in Edit Tool:** +- Problem: The edit tool uses a full Levenshtein distance computation with O(n*m) time and space for fuzzy matching during file edits. Large files with large search blocks could be slow. +- Files: `packages/cli/src/tool/edit.ts` (lines 165-179, used by `BlockAnchorReplacer` at line 227) +- Cause: Intentional fallback matching strategy for when exact matches fail. 
+- Improvement path: Add early exit when distance exceeds threshold. Consider using a bounded Levenshtein or switching to a faster similarity metric for large inputs. + +## Fragile Areas + +**Provider Transform Layer:** +- Files: `packages/cli/src/provider/transform.ts` (955 lines) +- Why fragile: This file contains model-specific branching logic based on string matching of model IDs (e.g., `id.includes("deepseek")`, `id.includes("grok")`, `id.includes("claude")`). Every new model or provider requires adding more string-match branches. The reasoning effort mapping alone spans 200+ lines of conditional logic. +- Safe modification: When adding a new model, add a test case in `packages/cli/test/provider/transform.test.ts` first. Follow the existing pattern of checking model ID substrings. +- Test coverage: Well-tested at 2,353 lines of tests, but new model combinations can still produce unexpected interactions. + +**Session Prompt Assembly:** +- Files: `packages/cli/src/session/prompt.ts` (1,983 lines) +- Why fragile: Assembles the full prompt from system prompts, tools, agents, permissions, structured output, and subtask routing. Multiple `as any` casts at tool registration boundaries. The TODO at line 351 acknowledges that "invoke tool" logic is not centralized. +- Safe modification: Use `packages/cli/test/session/prompt.test.ts` for any changes. Be careful with tool ID typing -- the `as any` casts exist because tool IDs are dynamically generated strings that don't match the type system's expectations. +- Test coverage: Has dedicated tests but the interaction between structured output, subtasks, and regular tools is complex. + +**GitHub Actions Integration:** +- Files: `packages/cli/src/cli/cmd/github.ts` (1,631 lines) +- Why fragile: Large file handling GitHub Actions workflow (issue triggers, PR creation, branch management, OIDC auth, comment reactions). Heavy use of `console.log` for CI output. 
Multiple `nothrow()` git commands where failures are silently ignored. +- Safe modification: Test changes against actual GitHub Actions workflows. The `nothrow()` calls mean git failures (network issues, permission problems) are swallowed silently. +- Test coverage: Limited -- `packages/cli/test/cli/github-action.test.ts` and `packages/cli/test/cli/github-remote.test.ts` exist but the file is 1,631 lines. + +**ACP Agent Interface:** +- Files: `packages/cli/src/acp/agent.ts` (1,742 lines) +- Why fragile: One of the largest source files. Handles the ACP (Agent Client Protocol) with event subscription, message handling, permission forwarding, and model resolution. Uses `(event as any)?.payload` for event deserialization (line 173). +- Safe modification: Changes to event handling should be accompanied by updates to `packages/cli/test/acp/agent-interface.test.ts`. +- Test coverage: `packages/cli/test/acp/agent-interface.test.ts` exists but is only 1 file for 1,742 lines of code. + +## Scaling Limits + +**In-Memory Lock Map:** +- Current capacity: The `Lock` namespace (`packages/cli/src/util/lock.ts`) uses an in-memory `Map` for reader-writer locks. +- Limit: If many concurrent file operations target distinct paths, the map grows unbounded during a session. Locks are cleaned up when released, but under sustained high concurrency the map could become large. +- Scaling path: The current design is appropriate for single-process use. If multi-process coordination is ever needed, consider file-based locks or a lock server. + +**SQLite Single-Writer:** +- Current capacity: SQLite with WAL mode supports concurrent reads but serializes writes. `busy_timeout` is set to 5000ms. +- Limit: Under heavy write load (many sessions saving simultaneously, especially during migration), writers can block for up to 5 seconds before failing. +- Files: `packages/cli/src/storage/db.ts` (line 80) +- Scaling path: Current setup is appropriate for a CLI tool.
If server mode grows to handle many concurrent sessions, consider batching writes or moving to a client-server database. + +## Dependencies at Risk + +**Bun Runtime Workarounds:** +- Risk: Two separate workarounds for Bun bugs: `--no-cache` flag forced when proxied/CI due to [bun#19936](https://github.com/oven-sh/bun/issues/19936), and `timeout: false` on fetch due to [bun#16682](https://github.com/oven-sh/bun/issues/16682). +- Files: `packages/cli/src/config/config.ts` (line 271), `packages/cli/src/bun/index.ts` (line 92), `packages/cli/src/provider/provider.ts` (line 1097) +- Impact: Performance degradation from disabled caching. The timeout workaround could mask actual hanging requests. +- Migration plan: Monitor Bun issue tracker. Remove workarounds when upstream fixes are released. Add integration tests that verify the workarounds are still needed. + +**Copilot Rate Limits:** +- Risk: The Copilot messages API integration is partially disabled due to rate limits. Comment says "TODO: re-enable once messages api has higher rate limits." +- Files: `packages/cli/src/plugin/copilot.ts` (lines 43-44) +- Impact: Copilot users are forced onto a less capable API path. The "hacky-ness" comment suggests the current workaround is not production-quality. +- Migration plan: Monitor GitHub Copilot API rate limit changes. Re-enable messages API when limits are raised. + +**models.dev External Dependency:** +- Risk: Model metadata is fetched from `https://models.dev/api.json` at startup and every hour. If models.dev goes down, new installations without a bundled snapshot would have no model data. +- Files: `packages/cli/src/provider/models.ts` (lines 84-99, 106-121, 124-132) +- Impact: Complete failure to list available models if both the cache file and bundled snapshot are missing and the network request fails. +- Migration plan: The bundled snapshot (`models-snapshot.ts`) is a fallback. Ensure the snapshot is always up-to-date at build time. 
Consider adding a longer cache TTL or a stale-while-revalidate strategy. + +## Test Coverage Gaps + +**MCP Integration:** +- What's not tested: MCP server lifecycle management, OAuth flow, process tree cleanup, tool discovery. +- Files: `packages/cli/src/mcp/index.ts` (961 lines), `packages/cli/src/mcp/auth.ts`, `packages/cli/src/mcp/oauth-provider.ts` +- Risk: MCP is a complex subsystem managing external server processes. Silent failures in cleanup (orphaned processes) or auth (token refresh) would go unnoticed. +- Priority: High -- MCP servers are long-lived processes that interact with the OS process table. + +**LSP Server Management:** +- What's not tested: LSP server downloading, installation, process lifecycle, auto-detection of language-specific servers. +- Files: `packages/cli/src/lsp/server.ts` (2,057 lines -- largest source file), `packages/cli/src/lsp/index.ts` (485 lines) +- Risk: LSP server installation downloads binaries from GitHub Releases (e.g., zls, clangd) using `as any` on response JSON. Download failures, architecture mismatches, and permission issues are all untested. +- Priority: High -- this file downloads and executes arbitrary binaries. + +**Worktree Management:** +- What's not tested: Git worktree creation, cleanup, sandbox registration. +- Files: `packages/cli/src/worktree/index.ts` (643 lines) +- Risk: Worktree operations involve git commands that modify the repository state. Failures could leave orphaned worktrees or corrupt the git index. +- Priority: Medium -- worktrees are used for sandboxed execution. + +**Storage Migration (JSON to SQLite):** +- What's not tested: The full end-to-end migration from JSON storage to SQLite, including edge cases like missing fields, corrupt JSON, and interrupted migrations. +- Files: `packages/cli/src/storage/json-migration.ts` (403 lines), `packages/cli/src/storage/storage.ts` (217 lines) +- Risk: Data loss during migration if edge cases are hit. 
The migration uses `any[]` arrays for batch inserts, bypassing type checking. +- Priority: Medium -- migration is a one-time operation but data loss is severe. Test file exists (`packages/cli/test/storage/json-migration.test.ts`, 846 lines) which is good, but uses `as any` extensively. + +**Provider Custom Loaders:** +- What's not tested: Most of the 15+ custom provider loaders (AWS Bedrock, Azure, Vertex AI, SAP AI Core, Cloudflare, etc.) are defined in `packages/cli/src/provider/provider.ts` but only a subset have test coverage. +- Files: `packages/cli/src/provider/provider.ts` (lines 98-540) +- Risk: Auth flows, environment variable handling, and SDK initialization for specific providers could break without detection. The Amazon Bedrock test exists but is only 446 lines for a complex auth flow. +- Priority: Medium -- broken provider loading surfaces as user-facing errors. + +**Snapshot System:** +- What's not tested: Snapshot creation, pruning, diff generation, and garbage collection via git. +- Files: `packages/cli/src/snapshot/index.ts` (159 lines) +- Risk: The snapshot system runs `git gc` with pruning and creates a separate git directory. Test file exists (`packages/cli/test/snapshot/snapshot.test.ts`, 1,180 lines) which provides good coverage. +- Priority: Low -- well-tested. 
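
One gap listed above is the use of `any[]` arrays for batch inserts in the JSON-to-SQLite migration. A possible way to narrow that risk is to convert each untrusted JSON value into a typed row before it reaches the insert, and to collect rejects instead of throwing. The sketch below is only an illustration under assumptions: the `SessionRow` shape, `toSessionRow`, and `parseBatch` are hypothetical and are not taken from `json-migration.ts`; the real schema has more columns.

```typescript
// Hypothetical row shape for illustration; the real session table has more columns.
type SessionRow = { id: string; project_id: string; time_created: number }

type Parsed = { rows: SessionRow[]; rejected: { index: number; reason: string }[] }

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Convert one untrusted JSON value into a typed row, or return the reason it failed.
function toSessionRow(value: unknown): SessionRow | string {
  if (!isRecord(value)) return "not an object"
  const { id, project_id, time_created } = value
  if (typeof id !== "string" || id.length === 0) return "missing id"
  if (typeof project_id !== "string") return "missing project_id"
  if (typeof time_created !== "number" || !Number.isFinite(time_created)) return "invalid time_created"
  return { id, project_id, time_created }
}

// Parse raw JSON file contents into a typed batch. Rejects are collected rather
// than thrown, so one corrupt file does not abort the whole migration.
function parseBatch(files: string[]): Parsed {
  const result: Parsed = { rows: [], rejected: [] }
  files.forEach((text, index) => {
    let value: unknown
    try {
      value = JSON.parse(text)
    } catch {
      result.rejected.push({ index, reason: "corrupt JSON" })
      return
    }
    const row = toSessionRow(value)
    if (typeof row === "string") result.rejected.push({ index, reason: row })
    else result.rows.push(row)
  })
  return result
}
```

With this shape, the batch handed to the database layer is `SessionRow[]` rather than `any[]`, and the `rejected` list gives tests a direct way to assert on missing fields and corrupt JSON.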
+ +--- + +*Concerns audit: 2026-03-08* diff --git a/.planning/codebase/CONVENTIONS.md b/.planning/codebase/CONVENTIONS.md new file mode 100644 index 0000000..16b61e8 --- /dev/null +++ b/.planning/codebase/CONVENTIONS.md @@ -0,0 +1,193 @@ +# Coding Conventions + +**Analysis Date:** 2026-03-08 + +## Naming Patterns + +**Files:** +- Use `kebab-case` for filenames: `message-v2.ts`, `bus-event.ts`, `json-migration.ts` +- SQL schema files use `.sql.ts` suffix: `session.sql.ts`, `project.sql.ts`, `schema.sql.ts`, `control.sql.ts` +- Module entry points use `index.ts` inside directories: `packages/cli/src/bus/index.ts`, `packages/cli/src/session/index.ts` +- Text prompt files use `.txt` extension: `packages/cli/src/tool/bash.txt`, `packages/cli/src/agent/prompt/title.txt` +- Test files use `.test.ts` suffix: `bash.test.ts`, `session.test.ts` + +**Exports / Modules:** +- Use `PascalCase` namespaces for all major modules. Every domain module is a TypeScript `namespace`: + - `Session`, `Bus`, `Config`, `Provider`, `File`, `Agent`, `Tool`, `Database`, `Instance`, `MCP`, `Plugin`, `Pty`, `Vcs`, `Worktree`, `Shell`, `Global`, `Installation`, `Filesystem`, `Log`, `Ripgrep`, `Identifier`, `Control`, `Storage` +- Prefer `export namespace Foo {}` over `export class Foo {}` or scattered exports +- Namespace members are exported directly: `Session.create()`, `Bus.publish()`, `Config.get()` + +**Functions:** +- Use `camelCase` for all functions: `fromRow()`, `toRow()`, `createDefaultTitle()`, `containsPath()` +- Factory functions prefixed with `create`: `createModel()`, `createAictrlClient()`, `createOpencodeServer()` +- Boolean-returning functions prefixed with `is`: `isDefaultTitle()`, `isOverflow()`, `isGpt5OrLater()` + +**Variables:** +- Use `camelCase` for local variables: `eventReceived`, `receivedInfo`, `transportCalls` +- Use `UPPER_SNAKE_CASE` for constants: `MAX_METADATA_LENGTH`, `DEFAULT_TIMEOUT`, `OVERFLOW_PATTERNS`, `BUNDLED_PROVIDERS` +- Use `PascalCase` for Zod schemas 
that double as types: `Session.Info`, `File.Content`, `Agent.Info` + +**Types:** +- Zod-inferred types share the name with their schema constant (dual export pattern): + ```typescript + export const Info = z.object({ ... }) + export type Info = z.infer<typeof Info> + ``` +- Interface names use `PascalCase`: `Context`, `Options`, `Metadata` +- Type aliases use `PascalCase`: `ParsedStreamError`, `ParsedAPICallError`, `SessionRow` + +**Database Columns:** +- Use `snake_case` for all database column names: `project_id`, `time_created`, `share_url`, `summary_additions` +- Table names are `PascalCase` + `Table` suffix: `SessionTable`, `MessageTable`, `PartTable`, `ProjectTable` + +## Code Style + +**Formatting:** +- Prettier (version 3.6.2) configured in root `package.json` +- `semi: false` (no semicolons) +- `printWidth: 120` +- No `.prettierrc` file; config is inline in `package.json` + +**Linting:** +- No ESLint or Biome configured. Prettier is the sole formatting tool. +- Pre-push hook runs `bun typecheck` (not pre-commit): `.husky/pre-push` + +**TypeScript:** +- Strict mode via `@tsconfig/bun` base: `packages/cli/tsconfig.json` +- `noUncheckedIndexedAccess: false` (explicitly disabled in CLI package) +- JSX preserve mode with SolidJS: `"jsx": "preserve"`, `"jsxImportSource": "@opentui/solid"` +- Custom conditions: `"customConditions": ["browser"]` +- ESM modules throughout: `"type": "module"` in all package.json files + +## Import Organization + +**Order:** +1. Workspace packages: `@aictrl/util/error`, `@aictrl/sdk/v2`, `@aictrl/plugin` +2. External npm packages and Node built-ins: `zod`, `remeda`, `ai`, `path`, `fs` +3. Path-aliased internal imports: `@/bus`, `@/util/log`, `@/project/instance` +4. 
Relative imports: `../util/log`, `../../src/tool/bash` + +**Path Aliases:** +- `@/*` maps to `./src/*` (CLI package): `import { Bus } from "@/bus"` +- `@tui/*` maps to `./src/cli/cmd/tui/*` (TUI components) +- Both aliases configured in `packages/cli/tsconfig.json` + +**Import Style:** +- Use `import z from "zod"` (default import for Zod) +- Use `import type` for type-only imports: `import type { LanguageModelV2 } from "@openrouter/ai-sdk-provider"` +- Imports from workspace packages use subpath exports: `@aictrl/util/error`, `@aictrl/util/slug`, `@aictrl/util/fn` +- Relative imports mix with alias imports in the same file (both `@/` and `../` are used) + +## Error Handling + +**Patterns:** +- Custom typed errors via `NamedError.create()` factory from `@aictrl/util/error`: + ```typescript + export const NotFoundError = NamedError.create( + "NotFoundError", + z.object({ message: z.string() }), + ) + ``` +- Error checking via `.isInstance()` static method: `Config.JsonError.isInstance(input)` +- `FormatError()` function in `packages/cli/src/cli/error.ts` formats known error types for user display +- Unknown errors handled in catch blocks with type narrowing: `if (e instanceof Error)`, `if (e instanceof NamedError)` +- API errors parsed via dedicated `ProviderError` namespace: `packages/cli/src/provider/error.ts` +- Provider overflow detection uses regex pattern matching against error messages +- Use `Error` with `cause` option for wrapping: `throw new Error("message", { cause: error })` + +**Throw patterns:** +- Throw `Error` directly for tool validation failures +- Throw `NamedError` subclasses for domain-specific errors +- Catch blocks use `try {} catch {}` (no variable) for optional parsing: `try { JSON.parse(...) 
} catch {}` + +## Logging + +**Framework:** Custom `Log` namespace at `packages/cli/src/util/log.ts` + +**Patterns:** +- Create scoped loggers with `Log.create({ service: "name" })`: + ```typescript + const log = Log.create({ service: "session" }) + log.info("publishing", { type: def.type }) + ``` +- Default logger: `Log.Default.error("fatal", data)` +- Log levels: `DEBUG`, `INFO`, `WARN`, `ERROR` +- Logs write to file by default, stderr only in dev/print mode +- Initialize logging early: `Log.init({ print: false, dev: true, level: "DEBUG" })` +- Log cleanup keeps last 5 log files + +## Comments + +**When to Comment:** +- Use comments sparingly; code is the documentation +- Use `//` inline comments for non-obvious logic: `// Windows can keep SQLite WAL handles alive until GC finalizers run` +- Use `TODO:` for known improvements: `// TODO: we may wanna rename this tool so it works better on other shells` + +**JSDoc/TSDoc:** +- Minimal JSDoc usage (~142 occurrences across 38 files, mostly in copilot SDK adapter code) +- JSDoc used primarily in plugin interface definitions (`packages/cli/src/plugin/index.ts`) for hook descriptions +- Source code in `packages/cli/src/` rarely uses JSDoc; prefer self-documenting code + +## Function Design + +**Size:** Functions are generally small (10-30 lines). Complex functions split into helper functions within the same namespace. 
+ +**Parameters:** +- Use object parameters for functions with 3+ arguments: + ```typescript + export async function provide<R>(input: { directory: string; init?: () => Promise<any>; fn: () => R }) + ``` +- Zod schemas define tool parameters inline + +**Return Values:** +- Functions return plain objects, not class instances +- Async functions return `Promise<T>` (never callbacks) +- Use `iife()` helper from `packages/cli/src/util/iife.ts` for inline async computation +- Use `lazy()` helper from `packages/cli/src/util/lazy.ts` for deferred initialization + +## Module Design + +**Exports:** +- Each major module is a namespace (`export namespace Session {}`) +- Namespace contains both runtime code and type definitions +- Zod schemas serve as both runtime validators and type sources (dual pattern) +- Bus events defined via `BusEvent.define()` with Zod schemas for type-safe event payloads + +**Barrel Files:** +- No barrel files (`index.ts` re-exporting from siblings). Each `index.ts` is the module itself. +- SDK package uses standard re-exports: `export * from "./client.js"` + +**State Management:** +- Module-scoped state via `Instance.state()`: creates per-project-instance state with automatic cleanup +- `AsyncLocalStorage`-based context via `Context.create()` from `packages/cli/src/util/context.ts` +- Global singletons for cross-instance state: `GlobalBus`, `Database` + +## Zod Usage + +**Schema Definition:** +- Always use `.meta({ ref: "Name" })` on top-level schemas for OpenAPI/JSON Schema generation: + ```typescript + export const Info = z.object({ ... 
}).meta({ ref: "Session" }) + ``` +- Use `z.enum()` for string unions: `z.enum(["added", "deleted", "modified"])` +- Use `z.discriminatedUnion()` for tagged unions +- Identifier schemas use prefix validation: `Identifier.schema("session")` + +**Validation:** +- Tool parameters validated via Zod `.parse()` in `Tool.define()` wrapper +- Config validated via Zod schemas with detailed error reporting + +## Async Patterns + +**Conventions:** +- Use `await using` for disposable resources (TC39 Explicit Resource Management): + ```typescript + await using tmp = await tmpdir({ git: true }) + ``` +- Use `AbortSignal` for cancellation: `abort: AbortSignal.any([])` +- Use `Promise.allSettled()` for concurrent operations that may fail independently +- Use `Bun.sleep()` instead of `setTimeout` for simple delays in Bun context + +--- + +*Convention analysis: 2026-03-08* diff --git a/.planning/codebase/INTEGRATIONS.md b/.planning/codebase/INTEGRATIONS.md new file mode 100644 index 0000000..d6685c1 --- /dev/null +++ b/.planning/codebase/INTEGRATIONS.md @@ -0,0 +1,300 @@ +# External Integrations + +**Analysis Date:** 2026-03-08 + +## APIs & External Services + +**AI/LLM Providers (via Vercel AI SDK):** + +All providers are lazy-loaded via dynamic imports in `packages/cli/src/provider/provider.ts` (`BUNDLED_PROVIDERS` map). Model metadata is fetched from models.dev. 
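
The lazy-loading approach described above can be sketched roughly as follows. This is a minimal illustration and not the code in `provider.ts`: `LOADERS`, `loadProvider`, `ProviderModule`, and the `fake-*` ids are hypothetical names, and the loaders are stand-ins for real dynamic `import()` calls so the sketch runs without any SDK packages installed.

```typescript
// Hypothetical shape of a loaded provider module; real SDK modules export more.
type ProviderModule = { name: string }

// A loader is a thunk, so nothing is imported until it is called.
type Loader = () => Promise<ProviderModule>

// Stand-in for a BUNDLED_PROVIDERS-style map. In a real build each entry would
// wrap a dynamic import of an SDK package; these fakes only count invocations.
let loadCount = 0
const LOADERS: Record<string, Loader> = {
  "fake-anthropic": async () => {
    loadCount++
    return { name: "fake-anthropic" }
  },
  "fake-openai": async () => {
    loadCount++
    return { name: "fake-openai" }
  },
}

// Cache the in-flight promise rather than the resolved value, so concurrent
// callers share one load instead of each starting their own.
const cache = new Map<string, Promise<ProviderModule>>()

function loadProvider(id: string): Promise<ProviderModule> {
  const cached = cache.get(id)
  if (cached) return cached
  const loader = LOADERS[id]
  if (!loader) return Promise.reject(new Error(`unknown provider: ${id}`))
  const pending = loader()
  cache.set(id, pending)
  return pending
}
```

Calling `loadProvider("fake-anthropic")` twice returns the same promise and runs the loader once. One design caveat of this sketch: a failed load stays cached as a rejected promise, so a fuller version would probably evict the entry on rejection to allow a retry.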
+ +- Anthropic (Claude) - `@ai-sdk/anthropic` + - Auth: `ANTHROPIC_API_KEY` env var or OAuth via plugin + - Custom headers: `anthropic-beta` for Claude Code features, interleaved thinking, fine-grained tool streaming +- OpenAI - `@ai-sdk/openai` + - Auth: `OPENAI_API_KEY` + - Uses Responses API for newer models +- Google Gemini - `@ai-sdk/google` + - Auth: `GOOGLE_API_KEY` / `GOOGLE_GENERATIVE_AI_API_KEY` +- Google Vertex AI - `@ai-sdk/google-vertex` + - Auth: `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` env vars, plus application default credentials + - Custom URL template with `${GOOGLE_VERTEX_PROJECT}`, `${GOOGLE_VERTEX_LOCATION}` variables +- Amazon Bedrock - `@ai-sdk/amazon-bedrock` + - Auth: `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` or credentials provider chain +- Azure OpenAI - `@ai-sdk/azure` + - Auth: `AZURE_API_KEY`, `AZURE_RESOURCE_NAME` + - Supports both Responses and Chat completion APIs via `useCompletionUrls` option +- Azure Cognitive Services - Custom loader in `packages/cli/src/provider/provider.ts` + - Auth: `AZURE_COGNITIVE_SERVICES_RESOURCE_NAME` +- xAI (Grok) - `@ai-sdk/xai` + - Auth: `XAI_API_KEY` +- Groq - `@ai-sdk/groq` + - Auth: `GROQ_API_KEY` +- Mistral - `@ai-sdk/mistral` + - Auth: `MISTRAL_API_KEY` +- OpenRouter - `@openrouter/ai-sdk-provider` (patched) + - Auth: `OPENROUTER_API_KEY` +- Cerebras - `@ai-sdk/cerebras` + - Auth: `CEREBRAS_API_KEY` +- Cohere - `@ai-sdk/cohere` + - Auth: `COHERE_API_KEY` +- DeepInfra - `@ai-sdk/deepinfra` + - Auth: `DEEPINFRA_API_KEY` +- Together AI - `@ai-sdk/togetherai` + - Auth: `TOGETHER_AI_API_KEY` +- Perplexity - `@ai-sdk/perplexity` + - Auth: `PERPLEXITY_API_KEY` +- Vercel AI - `@ai-sdk/vercel` + - Auth: via Vercel AI Gateway +- GitLab AI - `@gitlab/gitlab-ai-provider` + - Auth: GitLab token +- Cloudflare AI Gateway - `ai-gateway-provider` + - Auth: Cloudflare credentials +- Vercel AI Gateway - `@ai-sdk/gateway` + - Auth: Vercel token + +**GitHub Copilot:** +- Custom 
OpenAI-compatible provider SDK in `packages/cli/src/provider/sdk/copilot/` +- Full Chat Completions and Responses API implementations +- OAuth device code flow for authentication (`packages/cli/src/plugin/copilot.ts`) + - Device code URL: `https://github.com/login/device/code` + - Access token URL: `https://github.com/login/oauth/access_token` + - Client ID: `Ov23li8tweQw6odWQebz` +- Enterprise Copilot support via custom base URL (`copilot-api.{enterprise-domain}`) + +**OpenAI Codex:** +- Auth plugin in `packages/cli/src/plugin/codex.ts` + +**models.dev:** +- Model metadata registry at `https://models.dev/api.json` + - Fetched at runtime or from bundled snapshot (`packages/cli/src/provider/models-snapshot.ts`, auto-generated at build) + - Cached locally at `~/.cache/aictrl/models.json` + - Override URL: `AICTRL_MODELS_URL` env var + - Override file: `AICTRL_MODELS_PATH` env var + - Implementation: `packages/cli/src/provider/models.ts` + +**Exa Web Search:** +- MCP-based web search via `https://mcp.exa.ai/mcp` + - Implementation: `packages/cli/src/tool/websearch.ts` + - Auth: `EXA_API_KEY` env var + - Feature flag: `AICTRL_ENABLE_EXA` or `AICTRL_EXPERIMENTAL` + +## Data Storage + +**Local Database (CLI):** +- SQLite via Bun's native `bun:sqlite` driver + - Location: `~/.local/share/aictrl/aictrl.db` + - ORM: Drizzle ORM (`drizzle-orm/bun-sqlite`) + - Client: `packages/cli/src/storage/db.ts` + - Schema: `packages/cli/src/storage/schema.ts` (re-exports from domain modules) + - Migrations: `packages/cli/migration/` (5 SQL migrations) + - PRAGMA settings: WAL mode, NORMAL sync, 5s busy timeout, 64MB cache, foreign keys ON + - Tables: `ControlAccountTable`, `SessionTable`, `MessageTable`, `PartTable`, `TodoTable`, `PermissionTable`, `SessionShareTable`, `ProjectTable` + +**JSON File Storage (Legacy/Hybrid):** +- JSON files in `~/.local/share/aictrl/storage/` + - Implementation: `packages/cli/src/storage/storage.ts` + - Migration from JSON to SQLite: 
`packages/cli/src/storage/json-migration.ts` + - Uses file-level read/write locking + +**Auth Storage:** +- Local JSON file at `~/.local/share/aictrl/auth.json` + - Implementation: `packages/cli/src/auth/index.ts` + - Stored with `0o600` permissions + - Supports: OAuth (access/refresh tokens with expiry), API keys, well-known auth + +**Cloud Database (Console - not CLI):** +- PlanetScale (MySQL) - `infra/console.ts` + - Organization: `anomalyco`, Database: `aictrl` + - Per-stage branches (production branch or dev branches) + - ORM: Drizzle Kit for schema management + - Connection via PlanetScale password credentials + +**Cloud File Storage (Console - not CLI):** +- Cloudflare R2 buckets + - `Bucket` - API storage (`infra/app.ts`) + - `ZenData`, `ZenDataNew` - Console data (`infra/console.ts`) + - `EnterpriseStorage` - Enterprise/Teams data (`infra/enterprise.ts`) + +**Caching:** +- Local filesystem cache at `~/.cache/aictrl/` + - Model data cache + - Version-tagged cache invalidation (current: version 21) + - No external cache service + +## Authentication & Identity + +**CLI Provider Auth:** +- File-based auth store (`~/.local/share/aictrl/auth.json`) +- Three auth types: OAuth (refresh/access tokens), API keys, well-known endpoints +- Plugin-based auth hooks (`packages/cli/src/provider/auth.ts`, `packages/cli/src/plugin/index.ts`) +- Built-in auth plugins: GitHub Copilot OAuth, OpenAI Codex +- External plugins loaded from npm at runtime via `Config.plugin` + +**Console Auth (not CLI):** +- OpenAuth (`@openauthjs/openauth`) on Cloudflare Workers + - Implementation: `packages/console/function/src/auth.ts` + - Cloudflare KV for auth session storage + - GitHub OAuth: `GITHUB_CLIENT_ID_CONSOLE`, `GITHUB_CLIENT_SECRET_CONSOLE` + - Google OAuth: `GOOGLE_CLIENT_ID` + - Deployed at: `auth.{domain}` + +**Enterprise/Control Auth:** +- OAuth refresh token flow in `packages/cli/src/control/index.ts` +- Refresh endpoint: `{url}/oauth/token` +- Stored in SQLite 
`ControlAccountTable` + +## Monitoring & Observability + +**Error Tracking:** +- No external error tracking service +- Structured logging to local files via `packages/cli/src/util/log.ts` +- `NamedError` pattern from `@aictrl/util/error` for typed errors + +**Logs:** +- Local file-based logging at `~/.local/share/aictrl/log/` +- Structured JSON log entries with service tags +- Log levels: DEBUG, INFO, WARN, ERROR +- Console/cloud: Honeycomb via `HONEYCOMB_API_KEY` (log processor worker in `infra/console.ts`) + +**NDJSON Output:** +- `--format json` flag produces NDJSON output for headless execution observability +- Primary interface for CI/automation consumers + +## CI/CD & Deployment + +**CI Pipeline:** +- GitHub Actions (`.github/workflows/ci.yml`) + - Trigger: push/PR to main + - Steps: install (bun), build (turbo), typecheck, test + - Runner: ubuntu-latest with Bun + +**Automated Code Review:** +- GitHub Actions (`.github/workflows/code-review.yml`) + - Trigger: PR to main/master, or manual workflow_dispatch + - Uses `aictrl run` CLI with `zai-coding-plan/glm-5` model + - Auth: `ZHIPUAI_API_KEY` + - Smart SHA tracking to avoid duplicate reviews + +**Publishing:** +- GitHub Actions (`.github/workflows/publish.yml`) + - Trigger: GitHub Release published + - npm OIDC trusted publishing (no tokens) + - Publishes: `@aictrl/util`, `@aictrl/plugin`, `@aictrl/sdk`, `@aictrl/cli-linux-x64`, `@aictrl/cli` + - CLI dependencies stripped from published package.json (all bundled in binary) + +**Cloud Deployment:** +- SST Ion (`sst.config.ts`) - Cloudflare-native deployment + - Home: Cloudflare + - Providers: Stripe, PlanetScale + - Stages: production (`aictrl.ai`), dev (`dev.aictrl.ai`), per-developer branches + - Components deployed: + - `Api` - Cloudflare Worker at `api.{domain}` (`infra/app.ts`) + - `Web` - Astro docs site at `docs.{domain}` (`infra/app.ts`) + - `WebApp` - Static SolidStart app at `app.{domain}` (`infra/app.ts`) + - `Console` - SolidStart app at 
`{domain}` (`infra/console.ts`) + - `AuthApi` - Auth worker at `auth.{domain}` (`infra/console.ts`) + - `Teams` - Enterprise SolidStart app at `{shortDomain}` (`infra/enterprise.ts`) + +## Protocols + +**MCP (Model Context Protocol):** +- Client implementation: `packages/cli/src/mcp/index.ts` +- Transports: Streamable HTTP, SSE, Stdio +- OAuth support for MCP servers: `packages/cli/src/mcp/oauth-provider.ts`, `packages/cli/src/mcp/oauth-callback.ts` +- Configured via `aictrl.json` config file + +**ACP (Agent Client Protocol):** +- Server implementation: `packages/cli/src/acp/agent.ts`, `packages/cli/src/acp/session.ts` +- NDJSON streaming over stdio +- Command: `aictrl acp` + +**LSP (Language Server Protocol):** +- Client: `packages/cli/src/lsp/client.ts` +- Server management: `packages/cli/src/lsp/server.ts` +- JSON-RPC over stdio (`vscode-jsonrpc`) +- Used for code intelligence (symbols, diagnostics) + +## GitHub Integration + +**GitHub API:** +- `@octokit/rest` 22.0.0 - REST API client +- `@octokit/graphql` 9.0.2 - GraphQL API client +- Used in: `packages/cli/src/cli/cmd/github.ts` +- Handles: PR review, issue comments, code review workflows + +**GitHub Actions:** +- `@actions/core` 1.11.1 - Core actions utilities +- `@actions/github` 6.0.1 - GitHub context access +- Used in: `packages/cli/src/cli/cmd/github.ts` +- Webhook event types from `@octokit/webhooks-types` + +**GitHub App (Console):** +- App ID: `GITHUB_APP_ID` secret +- Private key: `GITHUB_APP_PRIVATE_KEY` secret +- Used by API worker (`infra/app.ts`) + +## Third-Party Services (Console) + +**Stripe (Billing):** +- SST Stripe provider in `sst.config.ts` +- Webhook endpoint at `https://{domain}/stripe/webhook` +- Products: "Aictrl Go" ($10/mo), "Aictrl Black" ($20/$100/$200/mo tiers) +- Events: checkout, charge, invoice, customer, subscription lifecycle +- Secrets: `STRIPE_SECRET_KEY`, `STRIPE_PUBLISHABLE_KEY`, `STRIPE_WEBHOOK_SECRET` + +**Discord:** +- Support bot integration +- Secrets: 
`DISCORD_SUPPORT_BOT_TOKEN`, `DISCORD_SUPPORT_CHANNEL_ID` + +**Feishu (Lark):** +- App integration +- Secrets: `FEISHU_APP_ID`, `FEISHU_APP_SECRET` + +**EmailOctopus:** +- Email marketing/newsletter +- Secret: `EMAILOCTOPUS_API_KEY` + +**AWS SES:** +- Transactional email +- Secrets: `AWS_SES_ACCESS_KEY_ID`, `AWS_SES_SECRET_ACCESS_KEY` + +## Environment Configuration + +**Required env vars (CLI runtime - at least one provider key):** +- At minimum one AI provider API key (e.g., `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) +- Or OAuth auth stored in `~/.local/share/aictrl/auth.json` + +**Optional env vars (CLI):** +- `AICTRL_CONFIG` - Custom config file path +- `AICTRL_MODELS_URL` - Custom models.dev URL +- `EXA_API_KEY` - Exa web search +- `GITHUB_TOKEN` - GitHub API access +- Full feature flags in `packages/cli/src/flag/flag.ts` + +**Required env vars (Console/Infrastructure):** +- `STRIPE_SECRET_KEY` - Stripe API +- `CLOUDFLARE_DEFAULT_ACCOUNT_ID` - Cloudflare account +- `CLOUDFLARE_API_TOKEN` - Cloudflare API +- Various secrets managed via SST Secret resources + +**Secrets location:** +- CLI: `~/.local/share/aictrl/auth.json` (local, 0600 permissions) +- Infrastructure: SST Secrets (encrypted, per-stage) +- CI: GitHub Actions secrets + +## Webhooks & Callbacks + +**Incoming:** +- Stripe webhook at `https://{domain}/stripe/webhook` - billing events +- GitHub App webhooks at API worker - PR/issue events + +**Outgoing:** +- Session sharing to `https://opncd.ai` (or enterprise URL) - `packages/cli/src/share/share-next.ts` + - Syncs session, message, and part data + - Can be disabled via `AICTRL_DISABLE_SHARE=true` +- MCP OAuth callback server (local) - `packages/cli/src/mcp/oauth-callback.ts` + +--- + +*Integration audit: 2026-03-08* diff --git a/.planning/codebase/STACK.md b/.planning/codebase/STACK.md new file mode 100644 index 0000000..d72e0b2 --- /dev/null +++ b/.planning/codebase/STACK.md @@ -0,0 +1,169 @@ +# Technology Stack + +**Analysis Date:** 2026-03-08 + +## 
Languages + +**Primary:** +- TypeScript 5.8.2 - All application code across every package +- SQL (SQLite dialect) - Database migrations in `packages/cli/migration/` + +**Secondary:** +- Shell/Bash - CI workflows in `.github/workflows/`, container scripts in `packages/containers/` +- WASM - tree-sitter grammars loaded at runtime for bash parsing (`packages/cli/src/tool/bash.ts`) + +## Runtime + +**Environment:** +- Bun 1.3.10 - Primary runtime for CLI execution, testing, and builds +- Node.js 22 - Used in CI for npm publishing (OIDC trusted publishing requires npm >= 11.5.1) + +**Package Manager:** +- Bun (workspace protocol) - Monorepo dependency management +- Lockfile: `bun.lockb` (binary lockfile, present) + +**Configuration:** +- `bunfig.toml` - Sets `exact = true` for installs, blocks root-level test runs +- `package.json` field `"packageManager": "bun@1.3.10"` + +## Frameworks + +**Core:** +- Vercel AI SDK (`ai` 5.0.124) - Unified LLM provider interface, streaming, tool calling +- Yargs 18.0.0 - CLI command framework (`packages/cli/src/index.ts`) +- SolidJS 1.9.10 - TUI rendering via `@opentui/solid` and `@opentui/core` (terminal UI framework) +- Drizzle ORM 1.0.0-beta.12 - Local SQLite database for sessions, messages, projects + +**Testing:** +- Bun's built-in test runner (`bun test --timeout 30000`) + +**Build/Dev:** +- Turborepo 2.5.6 - Monorepo task orchestration (`turbo.json`) +- SST 3.18.10 (Ion) - Infrastructure-as-code for Cloudflare deployments +- Custom Bun build script (`packages/cli/script/build.ts`) - Compiles platform-specific binaries + +## Key Dependencies + +**Critical (AI/LLM):** +- `ai` 5.0.124 - Core AI SDK for `streamText`, `generateObject`, tool definitions +- `@ai-sdk/anthropic` 2.0.65 - Anthropic Claude provider +- `@ai-sdk/openai` 2.0.89 - OpenAI provider +- `@ai-sdk/google` 2.0.54 - Google Gemini provider +- `@ai-sdk/google-vertex` 3.0.106 - Google Vertex AI provider +- `@ai-sdk/amazon-bedrock` 3.0.82 - AWS Bedrock provider +- 
`@ai-sdk/azure` 2.0.91 - Azure OpenAI provider +- `@ai-sdk/xai` 2.0.51 - xAI/Grok provider +- `@ai-sdk/groq` 2.0.34 - Groq provider +- `@ai-sdk/mistral` 2.0.27 - Mistral provider +- `@openrouter/ai-sdk-provider` 1.5.4 - OpenRouter aggregator (patched) +- `@ai-sdk/gateway` 2.0.30 - Vercel AI Gateway +- `@gitlab/gitlab-ai-provider` 3.6.0 - GitLab AI provider +- `ai-gateway-provider` 2.3.1 - Cloudflare AI Gateway +- `@ai-sdk/cerebras` 1.0.36 - Cerebras provider +- `@ai-sdk/cohere` 2.0.22 - Cohere provider +- `@ai-sdk/deepinfra` 1.0.36 - DeepInfra provider +- `@ai-sdk/togetherai` 1.0.34 - Together AI provider +- `@ai-sdk/perplexity` 2.0.23 - Perplexity provider +- `@ai-sdk/vercel` 1.0.33 - Vercel provider + +All AI provider SDKs are **optional peerDependencies**, lazy-loaded via dynamic imports in `packages/cli/src/provider/provider.ts` (see `BUNDLED_PROVIDERS` map). They are bundled into compiled binaries at build time. + +**Critical (Protocols):** +- `@modelcontextprotocol/sdk` 1.25.2 - MCP client for tool server connectivity +- `@agentclientprotocol/sdk` 0.14.1 - ACP server for agent-to-agent communication +- `vscode-jsonrpc` 8.2.1 - JSON-RPC protocol for LSP communication + +**Infrastructure:** +- `drizzle-orm` 1.0.0-beta.12 + `bun:sqlite` - Local SQLite database +- `zod` 4.1.8 - Runtime schema validation everywhere +- `solid-js` 1.9.10 - Terminal UI reactivity +- `@opentui/core` + `@opentui/solid` 0.1.81 - Terminal rendering framework +- `remeda` 2.26.0 - Functional utility library (used instead of lodash) +- `web-tree-sitter` 0.25.10 + `tree-sitter-bash` 0.25.0 - Bash command parsing for security analysis +- `xdg-basedir` 5.1.0 - XDG directory resolution for data/config/cache paths +- `ulid` 3.0.1 - ID generation +- `@octokit/rest` 22.0.0 + `@octokit/graphql` 9.0.2 - GitHub API integration +- `@actions/core` 1.11.1 + `@actions/github` 6.0.1 - GitHub Actions integration +- `turndown` 7.2.0 - HTML-to-markdown conversion (web fetch tool) +- `fuzzysort` 3.1.0 - Fuzzy 
search for model/command matching +- `jsonc-parser` 3.3.1 - JSONC config file parsing +- `gray-matter` 4.0.3 - Markdown frontmatter parsing (skills/agents) + +**Workspace Internal:** +- `@aictrl/cli` - Main CLI package (compiled binary) +- `@aictrl/sdk` 0.1.2 - Client/server SDK with OpenAPI-generated types +- `@aictrl/plugin` 1.2.16 - Plugin interface definitions +- `@aictrl/util` 1.2.16 - Shared utilities (error types, etc.) +- `@aictrl/script` 0.1.1 - Build scripting utilities + +## Monorepo Structure + +**Workspace catalog** (`package.json` `"catalog"` field) pins shared dependency versions across all packages. Dependencies reference `"catalog:"` instead of version numbers. + +**Patched dependencies:** +- `@standard-community/standard-openapi@0.2.9` - Patched via `patches/` directory +- `@openrouter/ai-sdk-provider@1.5.4` - Patched via `patches/` directory + +**Trusted dependencies** (allowed to run install scripts): +- `esbuild`, `protobufjs`, `tree-sitter`, `tree-sitter-bash`, `web-tree-sitter` + +## Configuration + +**Environment Variables (CLI runtime):** +- Provider API keys: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`, `AWS_REGION`, etc. 
(per provider) +- `AICTRL_CONFIG` - Custom config file path +- `AICTRL_CONFIG_DIR` - Custom config directory +- `AICTRL_CONFIG_CONTENT` - Inline JSON config +- `AICTRL_PERMISSION` - Permission mode override +- `AICTRL_MODELS_URL` - Override models.dev URL (default: `https://models.dev`) +- `AICTRL_MODELS_PATH` - Local models JSON path override +- `AICTRL_DISABLE_SHARE` - Disable session sharing +- `AICTRL_DISABLE_AUTOUPDATE` - Disable auto-update checks +- `AICTRL_DISABLE_DEFAULT_PLUGINS` - Skip default plugins +- `AICTRL_DISABLE_PROJECT_CONFIG` - Skip project-level config +- `AICTRL_DISABLE_MODELS_FETCH` - Skip fetching models from models.dev +- `AICTRL_HEADLESS` - Headless execution mode +- `AICTRL_ENABLE_EXA` - Enable Exa web search tool +- `AICTRL_EXPERIMENTAL` - Enable all experimental features +- Full list in `packages/cli/src/flag/flag.ts` + +**Build:** +- `turbo.json` - Turborepo build orchestration with `build`, `typecheck`, `test` tasks +- `sst.config.ts` - SST Ion config (Cloudflare home, Stripe + PlanetScale providers) +- `AICTRL_VERSION` - Injected at build time from release tag, declared as global (`AICTRL_VERSION`) +- `AICTRL_BUILD_SINGLE` - Build single platform binary only (for CI) + +**TypeScript:** +- Base: `@tsconfig/bun` extending to Bun-optimized settings +- CLI-specific: `packages/cli/tsconfig.json` adds JSX support (`preserve` mode, `@opentui/solid`), path aliases (`@/*` -> `./src/*`, `@tui/*` -> `./src/cli/cmd/tui/*`) + +**XDG Data Paths:** +- Data: `~/.local/share/aictrl/` (database, auth, logs, bin) +- Cache: `~/.cache/aictrl/` (models cache) +- Config: `~/.config/aictrl/` (global config, AGENTS.md) +- State: `~/.local/state/aictrl/` + +## Platform Requirements + +**Development:** +- Bun >= 1.3.10 +- Git (for VCS operations and project identification) +- Optional: LSP servers for code intelligence + +**Production (CLI binary):** +- Compiled Bun binary (platform-specific: linux-x64, linux-arm64, darwin-x64, darwin-arm64, win32-x64) +- No 
runtime dependencies needed - all deps bundled at build time +- SQLite database created automatically at `~/.local/share/aictrl/aictrl.db` +- Binary published as `@aictrl/cli` with platform-specific optional deps (`@aictrl/cli-linux-x64`, etc.) + +**Infrastructure (Console/Enterprise):** +- Cloudflare Workers (API, Auth, Console) +- Cloudflare R2 (file storage) +- Cloudflare KV (auth storage, gateway) +- Cloudflare Durable Objects (SyncServer) +- PlanetScale (MySQL - console database) +- Stripe (billing) + +--- + +*Stack analysis: 2026-03-08* diff --git a/.planning/codebase/STRUCTURE.md b/.planning/codebase/STRUCTURE.md new file mode 100644 index 0000000..6450e92 --- /dev/null +++ b/.planning/codebase/STRUCTURE.md @@ -0,0 +1,371 @@ +# Codebase Structure + +**Analysis Date:** 2026-03-08 + +## Directory Layout + +``` +opencode/ +├── packages/ +│ ├── cli/ # Core CLI application (main package) +│ │ ├── bin/ # Binary wrapper scripts +│ │ ├── dist/ # Build output (compiled binaries) +│ │ ├── migration/ # Drizzle SQLite migrations +│ │ ├── script/ # Build scripts (build.ts) +│ │ ├── src/ # Source code +│ │ └── test/ # Test files +│ ├── sdk/ # Public SDK for external consumers +│ │ └── src/ # SDK source (client, server, v2 types) +│ ├── plugin/ # Plugin type definitions and interfaces +│ │ └── src/ # Plugin types, tool definition types +│ ├── util/ # Shared utility library +│ │ └── src/ # Error types, identifiers, slugs, etc. 
+│ ├── script/ # Infra/deployment scripts +│ ├── containers/ # Docker container definitions +│ └── identity/ # Brand assets (logos, marks) +├── .opencode/ # Project-level aictrl configuration +│ ├── agent/ # Custom agent definitions (.md) +│ ├── command/ # Custom command templates (.md) +│ ├── glossary/ # Translation glossaries +│ ├── skills/ # Skill definitions (SKILL.md) +│ ├── themes/ # UI themes (.json) +│ ├── tool/ # Custom tool implementations (.ts) +│ └── opencode.jsonc # Project config file +├── .github/ # GitHub workflows and CI config +├── infra/ # SST infrastructure definitions +├── sdks/ # External SDK extensions (vscode) +├── specs/ # Project specifications +├── nix/ # Nix flake scripts +├── script/ # Root-level utility scripts +├── patches/ # Dependency patches +├── .planning/ # GSD planning documents +├── package.json # Root workspace manifest +├── turbo.json # Turborepo configuration +├── sst.config.ts # SST infrastructure config +├── tsconfig.json # Root TypeScript config +├── bunfig.toml # Bun configuration +└── AGENTS.md # AI agent instructions for this repo +``` + +## Directory Purposes + +**`packages/cli/src/` (Core Application):** +``` +src/ +├── index.ts # CLI entry point (yargs setup) +├── headless.ts # Headless execution mode +├── sql.d.ts # SQL type declarations +├── acp/ # Agent Client Protocol implementation +│ ├── agent.ts # ACP agent handler +│ ├── session.ts # ACP session management +│ └── types.ts # ACP type definitions +├── agent/ # AI agent definitions +│ ├── agent.ts # Built-in agents, config merge, generation +│ └── prompt/ # Agent-specific prompts (.txt files) +├── auth/ # Authentication (OAuth, API key, well-known) +│ └── index.ts # Auth CRUD, file-based storage +├── bun/ # Bun-specific utilities (package install, proc) +├── bus/ # Event bus system +│ ├── index.ts # Per-instance Bus (publish/subscribe) +│ ├── global.ts # GlobalBus (cross-instance EventEmitter) +│ └── bus-event.ts # Typed event definition factory +├── cli/ # CLI 
layer +│ ├── cmd/ # Yargs command handlers +│ │ ├── run.ts # `aictrl run` - main command +│ │ ├── generate.ts # `aictrl generate` - structured output +│ │ ├── auth.ts # `aictrl auth` - provider auth +│ │ ├── agent.ts # `aictrl agent` - agent management +│ │ ├── mcp.ts # `aictrl mcp` - MCP server management +│ │ ├── acp.ts # `aictrl acp` - ACP server +│ │ ├── pr.ts # `aictrl pr` - PR operations +│ │ ├── github.ts # `aictrl github` - GitHub integration +│ │ ├── models.ts # `aictrl models` - list models +│ │ ├── export.ts # `aictrl export` - export sessions +│ │ ├── import.ts # `aictrl import` - import sessions +│ │ ├── session.ts # `aictrl session` - session management +│ │ ├── db.ts # `aictrl db` - database commands +│ │ ├── stats.ts # `aictrl stats` - usage statistics +│ │ ├── upgrade.ts # `aictrl upgrade` - self-update +│ │ ├── uninstall.ts # `aictrl uninstall` - self-remove +│ │ ├── debug/ # Debug subcommands +│ │ └── cmd.ts # Command helper wrapper +│ ├── bootstrap.ts # Instance bootstrap wrapper +│ ├── ui.ts # Terminal UI helpers +│ └── error.ts # Error formatting +├── command/ # Slash commands (init, review, custom) +│ ├── index.ts # Command registry and loading +│ └── template/ # Built-in command templates (.txt) +├── config/ # Configuration loading and merging +│ ├── config.ts # Multi-source config merger +│ ├── markdown.ts # Frontmatter markdown parser +│ └── paths.ts # Config file path resolution +├── control/ # Control plane (enterprise auth) +│ └── index.ts # OAuth token management +├── env/ # Per-instance environment variables +│ └── index.ts # Env.get/set/all (instance-isolated) +├── file/ # File system operations +│ ├── index.ts # File info, content, diffing, fuzzy search +│ ├── watcher.ts # File system watcher +│ ├── ripgrep.ts # Ripgrep integration +│ └── time.ts # File modification tracking +├── flag/ # Feature flags (env vars) +│ └── flag.ts # All AICTRL_* flags +├── format/ # Code formatting +│ ├── index.ts # Format initialization +│ └── 
formatter.ts # Formatter implementation +├── global/ # Global paths (XDG directories) +│ └── index.ts # Global.Path.data/cache/config/state +├── id/ # ID generation (ULID-based) +│ └── id.ts # Identifier.ascending/descending +├── ide/ # IDE integration +│ └── index.ts # IDE detection and commands +├── installation/ # Installation metadata +│ └── index.ts # VERSION, isLocal(), channel +├── lsp/ # LSP client integration +│ ├── index.ts # LSP namespace, diagnostics, symbols +│ ├── client.ts # LSP client connections +│ └── server.ts # LSP server management +├── mcp/ # Model Context Protocol +│ ├── index.ts # MCP client connections, tool bridging +│ ├── oauth-provider.ts # MCP OAuth flow +│ ├── oauth-callback.ts # MCP OAuth callback +│ └── auth.ts # MCP auth management +├── patch/ # Patch/diff utilities +├── permission/ # Permission system +│ └── next.ts # Rule-based permission (allow/deny/ask) +├── plugin/ # Plugin loading and hooks +│ ├── index.ts # Plugin.init(), trigger(), list() +│ ├── codex.ts # Built-in Codex auth plugin +│ └── copilot.ts # Built-in Copilot auth plugin +├── project/ # Project context and detection +│ ├── instance.ts # Instance context (AsyncLocalStorage) +│ ├── bootstrap.ts # InstanceBootstrap initialization +│ ├── project.ts # Project detection (git worktree) +│ ├── project.sql.ts # Project SQL schema +│ ├── state.ts # Per-instance state management +│ └── vcs.ts # VCS (git) integration +├── provider/ # AI model providers +│ ├── provider.ts # Provider registry (20+ providers) +│ ├── transform.ts # Provider-specific options +│ ├── models.ts # models.dev catalog fetching +│ └── sdk/ # Custom provider SDKs +│ └── copilot/ # GitHub Copilot implementation +├── pty/ # Pseudo-terminal support +├── question/ # Interactive question tool +├── scheduler/ # Periodic task scheduler +│ └── index.ts # Scheduler.register() with intervals +├── session/ # Session management (core domain) +│ ├── index.ts # Session CRUD, message/part ops +│ ├── prompt.ts # Prompt 
orchestration (LLM loop) +│ ├── llm.ts # LLM.stream() wrapper +│ ├── processor.ts # Stream processor (tool calls, text, etc.) +│ ├── system.ts # System prompts per provider +│ ├── instruction.ts # Instruction loading (AGENTS.md) +│ ├── compaction.ts # Context window compaction +│ ├── retry.ts # LLM retry logic +│ ├── revert.ts # Session revert (undo changes) +│ ├── status.ts # Session status (idle/busy/retry) +│ ├── summary.ts # Session summary generation +│ ├── message-v2.ts # Message/Part type definitions +│ ├── session.sql.ts # Drizzle SQL schema +│ ├── todo.ts # Todo list management +│ └── prompt/ # Prompt templates (.txt files) +├── share/ # Session sharing +│ └── share-next.ts # Share to opncd.ai +├── shell/ # Shell execution +│ └── shell.ts # Shell command runner +├── skill/ # Skill system +│ ├── skill.ts # Skill discovery and loading +│ └── discovery.ts # Skill file discovery +├── snapshot/ # Git-based file snapshots +│ └── index.ts # Snapshot tracking, cleanup +├── storage/ # Database and file storage +│ ├── db.ts # SQLite client (bun:sqlite + Drizzle) +│ ├── schema.ts # All table re-exports +│ ├── schema.sql.ts # Shared schema helpers +│ ├── json-migration.ts # JSON -> SQLite migration +│ └── storage.ts # File-based storage +├── tool/ # AI tools +│ ├── tool.ts # Tool.define() factory +│ ├── registry.ts # ToolRegistry (built-in + custom + plugin) +│ ├── bash.ts # Shell execution tool +│ ├── read.ts # File reading tool +│ ├── write.ts # File writing tool +│ ├── edit.ts # File editing tool (diff-based) +│ ├── glob.ts # File glob tool +│ ├── grep.ts # Content search tool +│ ├── task.ts # Subagent spawning tool +│ ├── skill.ts # Skill invocation tool +│ ├── webfetch.ts # Web fetching tool +│ ├── websearch.ts # Web search (Exa) tool +│ ├── codesearch.ts # Code search (Exa) tool +│ ├── todo.ts # Todo read/write tools +│ ├── plan.ts # Plan mode enter/exit tools +│ ├── question.ts # Interactive question tool +│ ├── batch.ts # Batch execution tool +│ ├── lsp.ts # 
LSP tool (experimental) +│ ├── apply_patch.ts # Apply patch tool (Codex compat) +│ ├── multiedit.ts # Multi-file edit tool +│ ├── ls.ts # Directory listing tool +│ ├── invalid.ts # Invalid tool handler +│ ├── external-directory.ts # External directory permission +│ ├── truncation.ts # Output truncation +│ └── *.txt # Tool description templates +├── util/ # Shared utilities +│ ├── context.ts # AsyncLocalStorage wrapper +│ ├── filesystem.ts # File system helpers +│ ├── fn.ts # Zod-validated function wrapper +│ ├── git.ts # Git command helpers +│ ├── glob.ts # Glob pattern matching +│ ├── iife.ts # Immediately invoked function expression +│ ├── lazy.ts # Lazy initialization +│ ├── locale.ts # Locale/i18n helpers +│ ├── lock.ts # File locking +│ ├── log.ts # Logging system +│ ├── process.ts # Process utilities +│ ├── queue.ts # Work queue +│ ├── rpc.ts # RPC utilities +│ ├── signal.ts # Signal handling +│ ├── timeout.ts # Timeout utilities +│ ├── token.ts # Token counting +│ ├── wildcard.ts # Wildcard pattern matching +│ └── ... 
# Other utility modules +└── worktree/ # Git worktree management + └── index.ts # Worktree CRUD operations +``` + +## Key File Locations + +**Entry Points:** +- `packages/cli/src/index.ts`: CLI main entry (yargs command registration) +- `packages/cli/bin/aictrl`: Binary wrapper (tries compiled binary, falls back to `bun run src/index.ts`) +- `packages/sdk/src/index.ts`: SDK entry (createOpencode, createAictrlClient) + +**Configuration:** +- `packages/cli/src/config/config.ts`: Multi-source config loading +- `packages/cli/src/flag/flag.ts`: All feature flags +- `packages/cli/src/global/index.ts`: XDG path resolution +- `.opencode/opencode.jsonc`: Project-level config + +**Core Logic:** +- `packages/cli/src/session/prompt.ts`: Main LLM interaction loop +- `packages/cli/src/session/llm.ts`: LLM stream wrapper +- `packages/cli/src/session/processor.ts`: Stream event processor +- `packages/cli/src/agent/agent.ts`: Agent definitions and loading +- `packages/cli/src/provider/provider.ts`: Provider SDK registry +- `packages/cli/src/tool/registry.ts`: Tool registration and loading + +**Database:** +- `packages/cli/src/storage/db.ts`: SQLite client setup +- `packages/cli/src/session/session.sql.ts`: Session/Message/Part/Todo/Permission tables +- `packages/cli/src/project/project.sql.ts`: Project table +- `packages/cli/migration/`: Drizzle migration files + +**Testing:** +- `packages/cli/test/`: Test files + +## Naming Conventions + +**Files:** +- Lowercase with hyphens: `bus-event.ts`, `message-v2.ts`, `share-next.ts` +- SQL schemas: `*.sql.ts` suffix (e.g., `session.sql.ts`, `project.sql.ts`) +- Prompt templates: `.txt` files alongside their consumer (e.g., `bash.txt` next to `bash.ts`) +- Index re-exports: `index.ts` in each directory + +**Directories:** +- Lowercase, single-word or hyphenated: `bus/`, `session/`, `cli/`, `file/` +- Feature-based grouping, not technical-layer grouping + +**Code Patterns:** +- TypeScript namespaces: `export namespace Session {}`, `export 
namespace Config {}` +- Zod schemas: PascalCase matching the type (e.g., `Session.Info`, `Agent.Info`) +- Bus events: `Namespace.Event.EventName` (e.g., `Session.Event.Created`) +- Functions: camelCase (`createNext`, `fromRow`, `getUsage`) +- Constants: UPPER_SNAKE_CASE for env vars (`AICTRL_EXPERIMENTAL`), camelCase for code constants + +## Where to Add New Code + +**New CLI Command:** +- Create handler: `packages/cli/src/cli/cmd/.ts` +- Pattern: Use `cmd()` helper from `./cmd.ts`, follow RunCommand as template +- Register: Add `.command(Command)` in `packages/cli/src/index.ts` + +**New Tool:** +- Create tool: `packages/cli/src/tool/.ts` +- Create description: `packages/cli/src/tool/.txt` +- Pattern: Use `Tool.define(id, async (initCtx) => ({ description, parameters, execute }))` +- Register: Add to `ToolRegistry.all()` array in `packages/cli/src/tool/registry.ts` + +**New Agent:** +- Built-in: Add to `result` record in `packages/cli/src/agent/agent.ts` +- Custom (project): Create `.opencode/agent/.md` with frontmatter +- Custom (config): Add to `agent` section in `aictrl.jsonc` + +**New Provider:** +- Add SDK to `BUNDLED_PROVIDERS` in `packages/cli/src/provider/provider.ts` +- Add optional peer dependency in `packages/cli/package.json` +- If custom loading needed, add to `CUSTOM_LOADERS` + +**New Bus Event:** +- Define: `BusEvent.define("namespace.event", z.object({ ... }))` in the relevant namespace +- Publish: `Bus.publish(Namespace.Event.EventName, { ... 
})` +- Subscribe: `Bus.subscribe(Namespace.Event.EventName, handler)` + +**New Database Table:** +- Create schema: `packages/cli/src//.sql.ts` +- Re-export: Add to `packages/cli/src/storage/schema.ts` +- Create migration: `bun drizzle-kit generate` from `packages/cli/` + +**New Plugin Hook:** +- Add type to `Hooks` interface in `packages/plugin/src/index.ts` +- Add trigger call in relevant code: `Plugin.trigger("hook.name", input, output)` + +**New Skill:** +- Create: `.opencode/skills//SKILL.md` with frontmatter (name, description) +- Pattern: Markdown with YAML frontmatter + +**New Custom Tool (project-level):** +- Create: `.opencode/tool/.ts` +- Export a `ToolDefinition` (from `@aictrl/plugin/tool`) + +**New Custom Command (project-level):** +- Create: `.opencode/command/.md` with frontmatter +- Supports `$ARGUMENTS` and `$1`, `$2` parameter placeholders + +**Utilities:** +- CLI-specific helpers: `packages/cli/src/util/.ts` +- Cross-package helpers: `packages/util/src/.ts` + +## Special Directories + +**`packages/cli/migration/`:** +- Purpose: Drizzle ORM SQLite migration files +- Generated: Yes, via `bun drizzle-kit generate` +- Committed: Yes +- Pattern: Timestamped directories containing `migration.sql` + +**`packages/cli/dist/`:** +- Purpose: Compiled binary output +- Generated: Yes, via `bun run script/build.ts` +- Committed: No (gitignored) + +**`packages/sdk/src/gen/`:** +- Purpose: Generated OpenAPI client code +- Generated: Yes, via `@hey-api/openapi-ts` +- Committed: Yes +- Contains: `types.gen.ts`, `client.gen.ts`, `sdk.gen.ts` + +**`.opencode/`:** +- Purpose: Project-level aictrl configuration (agents, commands, tools, skills) +- Generated: No (user-created) +- Committed: Yes +- Scanned at runtime by Config, Agent, Command, Tool, and Skill loaders + +**`infra/`:** +- Purpose: SST v3 infrastructure definitions (Cloudflare, Stripe, PlanetScale) +- Generated: No +- Committed: Yes + +--- + +*Structure analysis: 2026-03-08* diff --git 
a/.planning/codebase/TESTING.md b/.planning/codebase/TESTING.md
new file mode 100644
index 0000000..3cf6996
--- /dev/null
+++ b/.planning/codebase/TESTING.md
@@ -0,0 +1,425 @@
+# Testing Patterns
+
+**Analysis Date:** 2026-03-08
+
+## Test Framework
+
+**Runner:**
+- Bun's built-in test runner (`bun:test`)
+- Config: `packages/cli/bunfig.toml` (preload) + `packages/cli/package.json` (timeout)
+
+**Assertion Library:**
+- `bun:test` built-in `expect()` (Jest-compatible API)
+
+**Run Commands:**
+```bash
+cd packages/cli && bun test --timeout 30000          # Run all CLI tests
+cd packages/cli && bun test test/tool/bash.test.ts   # Run specific test file
+# No dedicated watch-mode or coverage scripts are configured
+```
+
+**Important:** Tests must NOT be run from the monorepo root. The root `bunfig.toml` intentionally sets `root = "./do-not-run-tests-from-root"` and the root `package.json` script exits with an error. Always run tests from `packages/cli/`.
+
+## Test File Organization
+
+**Location:**
+- Separate `test/` directory at `packages/cli/test/`
+- Test files mirror the `src/` directory structure:
+  - `src/tool/bash.ts` -> `test/tool/bash.test.ts`
+  - `src/session/index.ts` -> `test/session/session.test.ts`
+  - `src/util/lazy.ts` -> `test/util/lazy.test.ts`
+  - `src/config/config.ts` -> `test/config/config.test.ts`
+
+**Naming:**
+- `{module-name}.test.ts` pattern
+- Snapshot files in `__snapshots__/` subdirectory: `test/tool/__snapshots__/tool.test.ts.snap`
+
+**Structure:**
+```
+packages/cli/test/
+├── preload.ts              # Test environment setup (runs before all tests)
+├── fixture/
+│   └── fixture.ts          # Shared test helpers (tmpdir, etc.)
+├── tool/
+│   ├── bash.test.ts
+│   ├── edit.test.ts
+│   ├── read.test.ts
+│   ├── grep.test.ts
+│   └── __snapshots__/
+├── session/
+│   ├── session.test.ts
+│   ├── compaction.test.ts
+│   ├── retry.test.ts
+│   └── ...
+├── config/
+│   ├── config.test.ts
+│   └── ...
+├── provider/
+│   ├── provider.test.ts
+│   ├── copilot/
+│   │   └── copilot-chat-model.test.ts
+│   └── ...
+├── util/
+│   ├── lazy.test.ts
+│   ├── format.test.ts
+│   ├── glob.test.ts
+│   └── ...
+└── ...
+```
+
+## Test Preload / Environment Setup
+
+**Preload file:** `packages/cli/test/preload.ts`
+
+This file is critical and runs before all tests. It:
+
+1. Sets XDG env vars to isolate test data in a temp directory per process
+2. Clears all provider API key env vars (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.)
+3. Sets `AICTRL_MODELS_PATH` to a test fixture JSON
+4. Sets `AICTRL_TEST_HOME` to prevent reading real user configs
+5. Sets `AICTRL_TEST_MANAGED_CONFIG_DIR` to isolate managed settings
+6. Writes a cache version file to prevent migrations
+7. Initializes `Log` with `print: false` and `dev: true`
+8. Registers an `afterAll` hook to close the database and clean up the temp directory
+
+**Key env vars set by preload:**
+- `XDG_DATA_HOME`, `XDG_CACHE_HOME`, `XDG_CONFIG_HOME`, `XDG_STATE_HOME`
+- `AICTRL_MODELS_PATH`, `AICTRL_TEST_HOME`, `AICTRL_TEST_MANAGED_CONFIG_DIR`
+
+## Test Structure
+
+**Suite Organization:**
+```typescript
+import { describe, expect, test } from "bun:test"
+import { BashTool } from "../../src/tool/bash"
+import { Instance } from "../../src/project/instance"
+import { tmpdir } from "../fixture/fixture"
+
+// `projectRoot` and `ctx` are assumed to be defined elsewhere in the test file
+describe("tool.bash", () => {
+  test("basic", async () => {
+    await Instance.provide({
+      directory: projectRoot,
+      fn: async () => {
+        const bash = await BashTool.init()
+        const result = await bash.execute({ command: "echo 'test'", description: "Echo test" }, ctx)
+        expect(result.metadata.exit).toBe(0)
+        expect(result.metadata.output).toContain("test")
+      },
+    })
+  })
+})
+```
+
+**Core Pattern: Instance.provide() wrapper:**
+
+Almost every test wraps its logic in `Instance.provide()`, which sets up the project context (directory, VCS, config). This is the most critical pattern to follow:
+
+```typescript
+await Instance.provide({
+  directory: tmp.path, // Project directory
+  init: async () => { ...
}, // Optional: run once on first provide
+  fn: async () => {
+    // Test code runs here with project context
+  },
+})
+```
+
+**Test Context Object:**
+
+Tool tests create a shared `ctx` object matching `Tool.Context`:
+```typescript
+const ctx = {
+  sessionID: "test",
+  messageID: "",
+  callID: "",
+  agent: "build",
+  abort: AbortSignal.any([]),
+  messages: [],
+  metadata: () => {},
+  ask: async () => {},
+}
+```
+
+## Temporary Directories
+
+**Helper:** `tmpdir()` from `packages/cli/test/fixture/fixture.ts`
+
+```typescript
+// Basic temp directory
+await using tmp = await tmpdir()
+
+// With git initialization
+await using tmp = await tmpdir({ git: true })
+
+// With initial files
+await using tmp = await tmpdir({
+  init: async (dir) => {
+    await Bun.write(path.join(dir, "aictrl.json"), JSON.stringify({ ... }))
+  },
+})
+
+// With config
+await using tmp = await tmpdir({
+  config: { model: "test/model" },
+})
+```
+
+**Key details:**
+- Uses `await using` (TC39 Explicit Resource Management) for automatic cleanup
+- Returns `{ path: string, extra: T }` where `path` is the real path of the temp dir
+- `git: true` initializes a git repo with an initial commit
+- `config` option writes an `aictrl.json` config file
+- `init` callback runs after directory creation for custom setup
+- Paths are sanitized to strip null bytes (CI environment defense)
+
+## Mocking
+
+**Framework:** `bun:test` built-in `mock` and `mock.module()`
+
+**Module Mocking Pattern:**
+```typescript
+import { mock, beforeEach } from "bun:test"
+
+// Mock entire modules BEFORE importing the code under test
+mock.module("@modelcontextprotocol/sdk/client/streamableHttp.js", () => ({
+  StreamableHTTPClientTransport: class MockStreamableHTTP {
+    constructor(url: URL, options?: any) {
+      transportCalls.push({ type: "streamable", url: url.toString(), options: options ??
{})
+    }
+    async start() {
+      throw new Error("Mock transport cannot connect")
+    }
+  },
+}))
+
+// Import AFTER mocking
+const { MCP } = await import("../../src/mcp/index")
+```
+
+**Function Mocking Pattern:**
+```typescript
+const createMockFetch = mock(async () => {
+  const body = new ReadableStream({
+    start(controller) {
+      for (const chunk of chunks) {
+        controller.enqueue(new TextEncoder().encode(chunk + "\n\n"))
+      }
+      controller.close()
+    },
+  })
+  return new Response(body, { status: 200, headers: { "content-type": "text/event-stream" } })
+})
+```
+
+**Inline Test Servers:**
+```typescript
+import { beforeAll, afterAll } from "bun:test"
+
+let server: ReturnType<typeof Bun.serve>
+
+beforeAll(async () => {
+  server = Bun.serve({
+    port: 0, // Random port
+    async fetch(req) {
+      const url = new URL(req.url)
+      // Handle routes...
+      return new Response(Bun.file(fullPath))
+    },
+  })
+})
+
+afterAll(async () => {
+  server?.stop()
+})
+```
+
+**What to Mock:**
+- External HTTP transports (MCP SDK transports)
+- Network fetch calls (provider API responses)
+- Module-level dependencies when testing in isolation
+
+**What NOT to Mock:**
+- `Instance.provide()` -- always use real project context
+- Database/SQLite -- tests use real in-memory database (via preload temp dir)
+- Filesystem operations -- tests use real temp directories
+- The Bus event system -- tests subscribe to real events
+
+## Fixtures and Factories
+
+**Test Data:**
+```typescript
+// Factory function for model objects
+function createModel(opts: {
+  context: number
+  output: number
+  input?: number
+  cost?: Provider.Model["cost"]
+}): Provider.Model {
+  return {
+    id: "test-model",
+    providerID: "test",
+    name: "Test",
+    limit: { context: opts.context, input: opts.input, output: opts.output },
+    cost: opts.cost ?? { input: 0, output: 0, cache: { read: 0, write: 0 } },
+    capabilities: { toolcall: true, attachment: false, ...
},
+    api: { npm: "@ai-sdk/anthropic" },
+    options: {},
+  } as Provider.Model
+}
+```
+
+**Fixture data inlined in test files:**
+- SSE response chunks defined as string arrays (see `packages/cli/test/provider/copilot/copilot-chat-model.test.ts`)
+- Config JSON constructed inline in `tmpdir()` init callbacks
+- No shared fixture factory module; each test file defines its own helpers
+
+**Fixture files:**
+- `packages/cli/test/fixture/fixture.ts` -- shared `tmpdir()` helper
+- `packages/cli/test/fixture/skills/` -- skill discovery test fixtures
+- `packages/cli/test/tool/fixtures/models-api.json` -- mock models API response
+
+## Coverage
+
+**Requirements:** None enforced. No coverage thresholds configured.
+
+**View Coverage:**
+```bash
+cd packages/cli && bun test --coverage
+```
+
+## Test Types
+
+**Unit Tests:**
+- Pure function tests: `packages/cli/test/util/lazy.test.ts`, `packages/cli/test/util/format.test.ts`
+- No `Instance.provide()` needed for pure logic tests
+- Direct import and test of module functions
+
+**Integration Tests (majority of tests):**
+- Test modules with real database, filesystem, and project context
+- Use `Instance.provide()` + `tmpdir()` pattern
+- Example: `packages/cli/test/tool/bash.test.ts` runs real shell commands
+- Example: `packages/cli/test/config/config.test.ts` reads real config files from temp directories
+
+**Snapshot Tests:**
+- Bun snapshot testing via `expect().toMatchSnapshot()`
+- Snapshot file: `packages/cli/test/tool/__snapshots__/tool.test.ts.snap`
+- Minimal usage (only 1 snapshot file found)
+
+**E2E Tests:**
+- Not present in the CLI package test suite
+- Playwright is listed as a catalog dependency but used elsewhere (likely `sdks/vscode/`)
+
+## Common Patterns
+
+**Async Testing:**
+```typescript
+test("creates new file when oldString is empty", async () => {
+  await using tmp = await tmpdir()
+  const filepath = path.join(tmp.path, "newfile.txt")
+
+  await Instance.provide({
+    directory: tmp.path,
+    fn:
async () => {
+      const edit = await EditTool.init()
+      const result = await edit.execute(
+        { filePath: filepath, oldString: "", newString: "new content" },
+        ctx,
+      )
+      expect(result.metadata.diff).toContain("new content")
+      const content = await fs.readFile(filepath, "utf-8")
+      expect(content).toBe("new content")
+    },
+  })
+})
+```
+
+**Error Testing:**
+```typescript
+test("throws error when file does not exist", async () => {
+  await using tmp = await tmpdir()
+  // Hypothetical file name; the file is never created, so the edit must fail
+  const filepath = path.join(tmp.path, "nonexistent.txt")
+
+  await Instance.provide({
+    directory: tmp.path,
+    fn: async () => {
+      FileTime.read(ctx.sessionID, filepath)
+      const edit = await EditTool.init()
+      await expect(
+        edit.execute({ filePath: filepath, oldString: "old", newString: "new" }, ctx),
+      ).rejects.toThrow("not found")
+    },
+  })
+})
+```
+
+**Event Testing (Bus events):**
+```typescript
+test("emits add event for new files", async () => {
+  await using tmp = await tmpdir()
+  await Instance.provide({
+    directory: tmp.path,
+    fn: async () => {
+      const { Bus } = await import("../../src/bus")
+      const { File } = await import("../../src/file")
+      // Assumed export location, based on src/file/watcher.ts in the structure map
+      const { FileWatcher } = await import("../../src/file/watcher")
+
+      const events: string[] = []
+      const unsubEdited = Bus.subscribe(File.Event.Edited, () => events.push("edited"))
+      const unsubUpdated = Bus.subscribe(FileWatcher.Event.Updated, () => events.push("updated"))
+
+      // ... trigger action ...
+ + expect(events).toContain("edited") + unsubEdited() + unsubUpdated() + }, + }) +}) +``` + +**Permission Testing:** +```typescript +test("asks for bash permission with correct pattern", async () => { + await using tmp = await tmpdir({ git: true }) + await Instance.provide({ + directory: tmp.path, + fn: async () => { + const bash = await BashTool.init() + const requests: Array> = [] + const testCtx = { + ...ctx, + ask: async (req: Omit) => { + requests.push(req) + }, + } + await bash.execute({ command: "echo hello", description: "Echo hello" }, testCtx) + expect(requests.length).toBe(1) + expect(requests[0].permission).toBe("bash") + }, + }) +}) +``` + +## Test Naming Conventions + +**Describe blocks:** Use dotted module path: `"tool.bash"`, `"tool.bash permissions"`, `"tool.bash truncation"`, `"session.started event"`, `"util.lazy"`, `"util.format"` + +**Test names:** Descriptive sentence starting with verb or subject: +- `"basic"` (for simple happy path) +- `"creates new file when oldString is empty"` +- `"throws error when file does not exist"` +- `"asks for bash permission with correct pattern"` +- `"returns true when token count exceeds usable context"` +- `"does not ask for external_directory permission when rm inside project"` + +## Test-specific Imports + +Tests import directly from source using relative paths (not `@/` alias): +```typescript +import { BashTool } from "../../src/tool/bash" +import { Instance } from "../../src/project/instance" +import { tmpdir } from "../fixture/fixture" +``` + +The `@/` path alias works in test files that import from `@/provider/sdk/copilot/...` (see copilot test), but the majority of tests use `../../src/` relative paths. 
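
The mirror convention from Test File Organization and the relative-import habit described above can be sketched as two small path helpers. This is only an illustration: `testPathFor` and `importSpecifierFor` are hypothetical names that do not exist in the repo, and the rule that `index.ts` maps to a test named after its directory is inferred from the single `src/session/index.ts` -> `test/session/session.test.ts` example.

```typescript
import * as path from "node:path"

// Hypothetical helper (not part of the codebase): maps a source file, given
// relative to packages/cli/, to the test file that mirrors it.
// Assumption: index.ts maps to a test named after its parent directory.
function testPathFor(srcFile: string): string {
  const rel = path.posix.relative("src", srcFile) // e.g. "tool/bash.ts"
  const dir = path.posix.dirname(rel) // e.g. "tool"
  const base = path.posix.basename(rel, ".ts") // e.g. "bash"
  const name = base === "index" && dir !== "." ? path.posix.basename(dir) : base
  return path.posix.join("test", dir, `${name}.test.ts`)
}

// Hypothetical helper: the relative specifier a test file would use to import
// its source module (the "../../src/..." style, without the .ts extension).
function importSpecifierFor(testFile: string, srcFile: string): string {
  const from = path.posix.dirname(testFile)
  const target = srcFile.replace(/\.ts$/, "")
  return path.posix.relative(from, target)
}
```

For example, `importSpecifierFor(testPathFor("src/tool/bash.ts"), "src/tool/bash.ts")` yields the same `../../src/tool/bash` specifier used in the import snippet above.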
+
+---
+*Testing analysis: 2026-03-08*

From 3bb4c45ed5d8640c718cfbb6eb01a9af581db65b Mon Sep 17 00:00:00 2001
From: Bulat Skill7
Date: Sun, 8 Mar 2026 16:51:53 +0000
Subject: [PATCH 02/11] docs: initialize hardening project

Reliability and security hardening of aictrl CLI based on 31 findings from 5-engineer code review. 16 CRITICAL + 15 IMPORTANT findings across error handling, security boundaries, resource management, and session reliability.

Co-Authored-By: Claude Opus 4.6
---
 .planning/PROJECT.md | 107 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)
 create mode 100644 .planning/PROJECT.md

diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
new file mode 100644
index 0000000..23dba3b
--- /dev/null
+++ b/.planning/PROJECT.md
@@ -0,0 +1,107 @@
+# Aictrl CLI Hardening
+
+## What This Is
+
+Reliability and security hardening of the aictrl CLI — a headless AI agent execution engine forked from OpenCode. This project addresses 31 findings from a 5-engineer code review covering error handling, security boundaries, resource management, and operational reliability. The primary deployment target is headless CI execution via `aictrl run`.
+
+## Core Value
+
+Every `aictrl run` invocation must either complete successfully or fail with a clear, actionable error — never hang, silently swallow errors, or leak resources.
+
+## Requirements
+
+### Validated
+
+- ✓ Headless execution via `aictrl run` — existing
+- ✓ Multi-provider LLM support (Anthropic, OpenAI, Google, etc.) — existing
+- ✓ Tool execution (bash, file edit, read, grep, glob) — existing
+- ✓ MCP server integration — existing
+- ✓ Plugin/skill system with discovery — existing
+- ✓ SQLite persistence with Drizzle ORM — existing
+- ✓ Session compaction for long conversations — existing
+- ✓ NDJSON output format for CI observability — existing
+- ✓ Snapshot/undo system via git — existing
+- ✓ Per-project context isolation via AsyncLocalStorage — existing
+
+### Active
+
+**CRITICAL — Error Handling & Process Lifecycle**
+- [ ] PROC-01: Fire-and-forget in run.ts must propagate errors instead of swallowing with `.catch(() => {})`
+- [ ] PROC-02: uncaughtException handler must exit the process cleanly
+- [ ] PROC-03: Instance.init rejected promise must not be cached permanently
+- [ ] PROC-04: Instance.dispose must not mask original errors with Context.NotFound
+- [ ] PROC-05: Migration failure must be caught and marker state handled correctly
+
+**CRITICAL — Security Boundaries**
+- [ ] SEC-01: Skill discovery must sanitize paths to prevent traversal attacks
+- [ ] SEC-02: Symlink-aware path containment in Filesystem.contains
+- [ ] SEC-03: Bash permission checks must align with actual shell execution (heredoc/substitution bypass)
+- [ ] SEC-04: OAuth callback must escape error_description to prevent XSS
+- [ ] SEC-05: `{env:VAR}` config substitution must have an allowlist
+
+**CRITICAL — Resource Management**
+- [ ] RES-01: Bash tool output must be bounded to prevent OOM
+- [ ] RES-02: MCP startAuth must have a connection timeout
+- [ ] RES-03: Timed-out MCP connections must not leak child processes
+- [ ] RES-04: Custom tool loading must not follow symlinks without integrity checks
+
+**CRITICAL — Session Reliability**
+- [ ] SESS-01: Queued prompt callbacks must be rejected on cancel/error (no dangling promises)
+- [ ] SESS-02: ensureTitle must be awaited or caught
+- [ ] SESS-03: SessionSummary.summarize fire-and-forget must handle DB failures
+- [ ] SESS-04: Bus publish must not let one bad subscriber crash the streaming loop
+- [ ] SESS-05: SessionCompaction.prune must not race with session teardown
+- [ ] SESS-06: Retry loop must have a max attempt cap
+
+**IMPORTANT — Data Integrity & Observability**
+- [ ] DATA-01: Multi-edit tool must respect per-edit filePath
+- [ ] DATA-02: Bus event must be published after in-memory state update (not before)
+- [ ] DATA-03: Session.updatePart DB writes must be serialized or awaited
+- [ ] DATA-04: Latency tracking must not overwrite stream start time
+- [ ] DATA-05: parseModel must handle bare provider strings gracefully
+
+**IMPORTANT — Operational Quality**
+- [ ] OPS-01: OAuth token store must handle concurrent access (file locking)
+- [ ] OPS-02: URL instruction fetches must be cached (not O(steps × urls))
+- [ ] OPS-03: PRAGMA synchronous=OFF must be restored after migration
+- [ ] OPS-04: Log.Default must not write before Log.init()
+- [ ] OPS-05: GlobalBus listener race in run.ts generator must be fixed
+- [ ] OPS-06: Cost calculation for non-Anthropic providers must be corrected
+
+### Out of Scope
+
+- Feature development — this project is purely hardening existing functionality
+- Copilot SDK fork reduction — tracked separately, large scope
+- Deprecated config field migration — not a reliability issue
+- `as any` type safety audit — quality improvement, not hardening
+- Dead code cleanup (scrap.ts etc.) — trivial, not worth project overhead
+- UI/TUI improvements — headless-first focus
+- Performance optimization (sync fs, Levenshtein) — not reliability-critical
+
+## Context
+
+**Origin:** 5-engineer parallel code review of the full CLI codebase (2026-03-08) produced 31 findings ranked by confidence score (80-95). Findings span entry/bootstrap, session/prompt, tools/permissions, provider/streaming, and MCP/plugins/skills.
+
+**Deployment:** Primary use case is headless CI via GitHub Actions (`aictrl run --format json`). The fire-and-forget bug (#1) directly caused silent CI failures. Process lifecycle issues (#14) cause CI jobs to appear hung instead of failing cleanly.
+
+**Codebase:** ~50k lines TypeScript across 5 packages (cli, sdk, plugin, util, script). Bun runtime with compiled binary distribution. See `.planning/codebase/` for full mapping.
+
+**Review findings:** 16 CRITICAL (95-85 confidence), 15 IMPORTANT (88-80 confidence). All findings include exact file paths, line numbers, and concrete fix recommendations.
+
+## Constraints
+
+- **Runtime**: Bun — must work with Bun-specific APIs and known bugs (bun#19936, bun#16682)
+- **Backwards compatibility**: Fix must not break existing config files, sessions, or plugin APIs
+- **Test coverage**: Every fix must include a regression test
+- **No feature changes**: Fixes must be surgical — no refactoring beyond what the fix requires
+
+## Key Decisions
+
+| Decision | Rationale | Outcome |
+|----------|-----------|---------|
+| Prioritize headless CI path | Primary deployment target; fire-and-forget bug caused real CI failures | — Pending |
+| Fix all 16 CRITICAL before IMPORTANT | CRITICAL findings have higher confidence and impact | — Pending |
+| Require regression tests per fix | Prevent regressions; current test gaps contributed to these issues | — Pending |
+
+---
+*Last updated: 2026-03-08 after initialization*

From 1e37d2b4238c81dcbcfdc87cd48296cd611c26d5 Mon Sep 17 00:00:00 2001
From: Bulat Skill7
Date: Sun, 8 Mar 2026 16:59:47 +0000
Subject: [PATCH 03/11] chore: add project config

Mode: yolo
Depth: comprehensive
Parallelization: enabled
Workflow agents: research=on, plan_check=on, verifier=on
Model profile: quality (Opus for research/roadmap)

Co-Authored-By: Claude Opus 4.6
---
 .planning/config.json | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 .planning/config.json

diff --git a/.planning/config.json b/.planning/config.json
new file mode 100644
index 0000000..38858d7
--- /dev/null
+++ b/.planning/config.json
@@ -0,0 +1,12 @@
+{
+  "mode": "yolo",
+  "depth": "comprehensive",
+  "parallelization": true,
+  "commit_docs": true,
+  "model_profile": "quality",
+  "workflow": {
+    "research": true,
+    "plan_check": true,
+    "verifier": true
+  }
+}

From 3b87a9c3012ddf6d508f8aab04d0e1935ef885e8 Mon Sep 17 00:00:00 2001
From: Bulat Skill7
Date: Sun, 8 Mar 2026 17:00:28 +0000
Subject: [PATCH 04/11] docs: define v1 requirements

31 requirements across 6 categories:
- Error Handling & Process Lifecycle (5)
- Security Boundaries (5)
- Resource Management (4)
- Session Reliability (6)
- Data Integrity & Observability (5)
- Operational Quality (6)

All derived from 5-engineer code review findings.

Co-Authored-By: Claude Opus 4.6
---
 .planning/REQUIREMENTS.md | 101 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 .planning/REQUIREMENTS.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
new file mode 100644
index 0000000..bf3735c
--- /dev/null
+++ b/.planning/REQUIREMENTS.md
@@ -0,0 +1,101 @@
+# Requirements
+
+## v1 Requirements
+
+### Error Handling & Process Lifecycle
+
+- [ ] **PROC-01**: `aictrl run` must propagate all errors from `SessionPrompt.prompt()` instead of swallowing with `.catch(() => {})` — `run.ts:713-720`
+- [ ] **PROC-02**: `uncaughtException` handler must call `process.exit(1)` and preserve stack traces — `index.ts:32-42`
+- [ ] **PROC-03**: `Instance.init()` rejected promise must not be cached permanently, poisoning all future calls — `instance.ts:23-43`
+- [ ] **PROC-04**: `Instance.dispose()` must not throw `Context.NotFound` in finally blocks, masking the original error — `instance.ts:69-81`
+- [ ] **PROC-05**: Database migration failure must be caught; partial migration state must be handled on next launch — `index.ts:92-117`
+
+### Security Boundaries
+
+- [ ] **SEC-01**: Remote skill downloads must sanitize `skill.name`/`file` to prevent path traversal (arbitrary filesystem write) — `discovery.ts:81-86`
+- [ ] **SEC-02**: `Filesystem.contains()` must use `fs.realpath()` to resolve symlinks before path comparison — `filesystem.ts:134`
+- [ ] **SEC-03**: Bash tool permission checks must cover heredocs and command substitutions that bypass AST-based validation — `bash.ts:93-172`
+- [ ] **SEC-04**: OAuth callback HTML must escape `error_description` to prevent XSS — `oauth-callback.ts:94-102`
+- [ ] **SEC-05**: `{env:VAR}` config substitution must restrict to an allowlist of safe variables — `paths.ts:86-88`
+
+### Resource Management
+
+- [ ] **RES-01**: Bash tool must cap accumulated output buffer to prevent OOM on long-running commands — `bash.ts:183-205`
+- [ ] **RES-02**: MCP `startAuth` connect must have a timeout to prevent permanent hangs — `mcp/index.ts:793`
+- [ ] **RES-03**: Timed-out MCP connection promises must clean up child processes (no orphans) — `timeout.ts:1-14`
+- [ ] **RES-04**: Custom tool loading from config `tool/` dirs must not follow symlinks without integrity checks — `registry.ts:40-52`
+
+### Session Reliability
+
+- [ ] **SESS-01**: Queued prompt callback promises must be rejected on cancel/error — no dangling promises or memory leaks — `prompt.ts:64-84,279`
+- [ ] **SESS-02**: `ensureTitle()` must be awaited or have a catch handler — unhandled rejection currently crashes process — `prompt.ts:328-334`
+- [ ] **SESS-03**: `SessionSummary.summarize()` fire-and-forget must handle DB failures explicitly — `processor.ts:278`
+- [ ] **SESS-04**: `Bus.publish()` must isolate subscriber failures — one bad subscriber must not crash the streaming loop — `bus/index.ts:52-63`
+- [ ] **SESS-05**: `SessionCompaction.prune()` must not race with session teardown — `prompt.ts:738`
+- [ ] **SESS-06**: Retry loop must enforce a max attempt cap — `retry-after` header can currently cause infinite retries — `processor.ts:359-370`
+
+### Data Integrity & Observability
+
+- [ ] **DATA-01**: Multi-edit tool must apply each edit to its own `filePath`, not redirect all to the outer param — `multiedit.ts:30`
+- [ ] **DATA-02**: Bus events must be published after in-memory state is updated, not before — `status.ts:61-75`
+- [ ] **DATA-03**: `Session.updatePart()` DB writes in stream handlers must be serialized or awaited — `prompt.ts:1670`
+- [ ] **DATA-04**: Latency tracking `text-end` event must not overwrite the stream start time — `processor.ts:329-332`
+- [ ] **DATA-05**: `parseModel()` must handle bare provider strings gracefully instead of producing empty modelID — `provider.ts:1305-1311`
+
+### Operational Quality
+
+- [ ] **OPS-01**: OAuth JSON token store must handle concurrent access with file locking — `mcp/auth.ts:56-78`
+- [ ] **OPS-02**: URL instruction fetches must be cached per-session — not O(steps × urls) — `instruction.ts:126-151`
+- [ ] **OPS-03**: `PRAGMA synchronous=OFF` set during migration must be restored afterward — `json-migration.ts:49-52`
+- [ ] **OPS-04**: `Log.Default` must not write to stderr before `Log.init()` has been called — `log.ts:55-58`
+- [ ] **OPS-05**: GlobalBus listener in `run.ts` generator must not race with fast-path idle events — `run.ts:646-671`
+- [ ] **OPS-06**: Cost calculation for non-Anthropic providers must use correct token pricing — `session/index.ts:814-820`
+
+## v2 Requirements
+
+(None — all findings scoped to v1)
+
+## Out of Scope
+
+- Copilot SDK fork reduction — separate project, 2,500+ lines of forked code
+- `as any` type safety audit — quality improvement, not a reliability fix
+- Deprecated config field migration — not causing failures
+- Dead code cleanup — trivial, doesn't warrant project overhead
+- Performance optimization (sync fs, Levenshtein) — not reliability-critical
+- UI/TUI improvements — headless-first focus
+
+## Traceability
+
+| REQ-ID | Phase | Status |
+|--------|-------|--------|
+| PROC-01 | — | Pending |
+| PROC-02 | — | Pending |
+| PROC-03 | — | Pending |
+| PROC-04 | — | Pending |
+| PROC-05 | — | Pending |
+| SEC-01 | — | Pending |
+| SEC-02 | — | Pending |
+| SEC-03 | — | Pending |
+| SEC-04 | — | Pending |
+| SEC-05 | — | Pending |
+| RES-01 | — | Pending |
+| RES-02 | — | Pending |
+| RES-03 | — | Pending |
+| RES-04 | — | Pending |
+| SESS-01 | — | Pending |
+| SESS-02 | — | Pending |
+| SESS-03 | — | Pending |
+| SESS-04 | — | Pending |
+| SESS-05 | — | Pending |
+| SESS-06 | — | Pending |
+| DATA-01 | — | Pending |
+| DATA-02 | — | Pending |
+| DATA-03 | — | Pending |
+| DATA-04 | — | Pending |
+| DATA-05 | — | Pending |
+| OPS-01 | — | Pending |
+| OPS-02 | — | Pending |
+| OPS-03 | — | Pending |
+| OPS-04 | — | Pending |
+| OPS-05 | — | Pending |
+| OPS-06 | — | Pending |

From cf6d0c2eccf61dc83e957fa4ac4485a9dc273b57 Mon Sep 17 00:00:00 2001
From: Bulat Skill7
Date: Sun, 8 Mar 2026 17:07:31 +0000
Subject: [PATCH 05/11] docs: create roadmap (10 phases)

Phases:
1. Process Exit Path: PROC-01, PROC-02
2. Instance Lifecycle: PROC-03, PROC-04, PROC-05
3. Input Sanitization & Security: SEC-01, SEC-02, SEC-04, SEC-05
4. Shell Execution Safety: SEC-03, RES-01
5. MCP & External Process Safety: RES-02, RES-03, RES-04
6. Prompt Loop Reliability: SESS-01, SESS-02, SESS-05, SESS-06
7. Event Bus Resilience: SESS-03, SESS-04, OPS-05
8. Data Mutation Correctness: DATA-01, DATA-02, DATA-03, DATA-04
9. Provider & Model Robustness: DATA-05, OPS-06
10. Operational Hardening: OPS-01, OPS-02, OPS-03, OPS-04

All 31 v1 requirements mapped to phases.

Co-Authored-By: Claude Opus 4.6
---
 .planning/REQUIREMENTS.md | 62 ++++++------
 .planning/ROADMAP.md | 196 ++++++++++++++++++++++++++++++++++++++
 .planning/STATE.md | 63 ++++++++++++
 3 files changed, 290 insertions(+), 31 deletions(-)
 create mode 100644 .planning/ROADMAP.md
 create mode 100644 .planning/STATE.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index bf3735c..dc0b79f 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -68,34 +68,34 @@
 
 | REQ-ID | Phase | Status |
 |--------|-------|--------|
-| PROC-01 | — | Pending |
-| PROC-02 | — | Pending |
-| PROC-03 | — | Pending |
-| PROC-04 | — | Pending |
-| PROC-05 | — | Pending |
-| SEC-01 | — | Pending |
-| SEC-02 | — | Pending |
-| SEC-03 | — | Pending |
-| SEC-04 | — | Pending |
-| SEC-05 | — | Pending |
-| RES-01 | — | Pending |
-| RES-02 | — | Pending |
-| RES-03 | — | Pending |
-| RES-04 | — | Pending |
-| SESS-01 | — | Pending |
-| SESS-02 | — | Pending |
-| SESS-03 | — | Pending |
-| SESS-04 | — | Pending |
-| SESS-05 | — | Pending |
-| SESS-06 | — | Pending |
-| DATA-01 | — | Pending |
-| DATA-02 | — | Pending |
-| DATA-03 | — | Pending |
-| DATA-04 | — | Pending |
-| DATA-05 | — | Pending |
-| OPS-01 | — | Pending |
-| OPS-02 | — | Pending |
-| OPS-03 | — | Pending |
-| OPS-04 | — | Pending |
-| OPS-05 | — | Pending |
-| OPS-06 | — | Pending |
+| PROC-01 | Phase 01 | Pending |
+| PROC-02 | Phase 01 | Pending |
+| PROC-03 | Phase 02 | Pending |
+| PROC-04 | Phase 02 | Pending |
+| PROC-05 | Phase 02 | Pending |
+| SEC-01 | Phase 03 | Pending |
+| SEC-02 | Phase 03 | Pending |
+| SEC-03 | Phase 04 | Pending |
+| SEC-04 | Phase 03 | Pending |
+| SEC-05 | Phase 03 | Pending |
+| RES-01 | Phase 04 | Pending |
+| RES-02 | Phase 05 | Pending |
+| RES-03 | Phase 05 | Pending |
+| RES-04 | Phase 05 | Pending |
+| SESS-01 | Phase 06 | Pending |
+| SESS-02 | Phase 06 | Pending |
+| SESS-03 | Phase 07 | Pending |
+| SESS-04 | Phase 07 | Pending |
+| SESS-05 | Phase 06 | Pending |
+| SESS-06 | Phase 06 | Pending |
+| DATA-01 | Phase 08 | Pending |
+| DATA-02 | Phase 08 | Pending |
+| DATA-03 | Phase 08 | Pending |
+| DATA-04 | Phase 08 | Pending |
+| DATA-05 | Phase 09 | Pending |
+| OPS-01 | Phase 10 | Pending |
+| OPS-02 | Phase 10 | Pending |
+| OPS-03 | Phase 10 | Pending |
+| OPS-04 | Phase 10 | Pending |
+| OPS-05 | Phase 07 | Pending |
+| OPS-06 | Phase 09 | Pending |
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
new file mode 100644
index 0000000..521f53a
--- /dev/null
+++ b/.planning/ROADMAP.md
@@ -0,0 +1,196 @@
+# Roadmap: Aictrl CLI Hardening
+
+## Overview
+
+10 phases covering 31 requirements derived from a 5-engineer code review. Phases follow dependency order: process exit paths first (everything depends on clean startup/shutdown), then security boundaries (highest-risk findings), session reliability (must be stable before testing data fixes), data correctness, and operational quality last. Every phase delivers a verifiable capability improvement to headless CI execution via `aictrl run`.
+
+## Phases
+
+- [ ] **Phase 01: Process Exit Path** - `aictrl run` propagates errors and exits cleanly instead of hanging
+- [ ] **Phase 02: Instance Lifecycle** - Bootstrap, init, and dispose cannot poison state or mask errors
+- [ ] **Phase 03: Input Sanitization & Security** - Path traversal, XSS, and env injection attacks are blocked
+- [ ] **Phase 04: Shell Execution Safety** - Bash tool permission checks cannot be bypassed and output is bounded
+- [ ] **Phase 05: MCP & External Process Safety** - MCP connections time out and clean up child processes
+- [ ] **Phase 06: Prompt Loop Reliability** - Session prompt cycle handles cancellation, errors, and compaction races
+- [ ] **Phase 07: Event Bus Resilience** - Bus publish isolates failures and event timing races are eliminated
+- [ ] **Phase 08: Data Mutation Correctness** - State updates, DB writes, and tool edits apply in correct order
+- [ ] **Phase 09: Provider & Model Robustness** - Model parsing and cost calculation handle all provider formats
+- [ ] **Phase 10: Operational Hardening** - File locking, caching, PRAGMA safety, and logging initialization
+
+## Phase Details
+
+### Phase 01: Process Exit Path
+**Goal:** `aictrl run` always exits with a clear error code and message when something fails, instead of hanging or swallowing errors silently
+**Depends on:** Nothing (first phase -- addresses the bug that directly caused silent CI failures)
+**Requirements:** PROC-01, PROC-02
+**Success Criteria** (what must be TRUE):
+  1. When `SessionPrompt.prompt()` throws during `aictrl run`, the error propagates to the CLI exit handler and the process exits with code 1
+  2. When an uncaught exception occurs, the process logs the full stack trace and exits with code 1 (not just logs and continues)
+  3. No `.catch(() => {})` patterns remain in the `aictrl run` critical path
+  4. A regression test confirms that a simulated prompt failure causes `aictrl run` to exit non-zero
+**Plans:** TBD
+
+Plans:
+- [ ] 01-01: Fix fire-and-forget error swallowing in run.ts
+- [ ] 01-02: Fix uncaughtException handler to exit cleanly
+
+### Phase 02: Instance Lifecycle
+**Goal:** Instance bootstrap, initialization, and disposal handle failures gracefully without poisoning cached state or masking original errors
+**Depends on:** Phase 01 (process must exit cleanly for lifecycle errors to surface)
+**Requirements:** PROC-03, PROC-04, PROC-05
+**Success Criteria** (what must be TRUE):
+  1. When `Instance.init()` rejects, subsequent calls retry instead of returning the cached rejected promise
+  2. When `Instance.dispose()` runs in a finally block after an error, the original error propagates (not Context.NotFound)
+  3. When database migration fails mid-flight, the next launch detects partial state and either completes or reports the issue
+  4. A regression test confirms that a failed init does not permanently poison the instance
+**Plans:** TBD
+
+Plans:
+- [ ] 02-01: Fix Instance.init() rejected promise caching
+- [ ] 02-02: Fix Instance.dispose() error masking in finally blocks
+- [ ] 02-03: Add migration failure handling and partial state recovery
+
+### Phase 03: Input Sanitization & Security
+**Goal:** All user-controlled and externally-sourced inputs are sanitized before use in filesystem operations, HTML rendering, and environment variable expansion
+**Depends on:** Phase 02 (instance must initialize correctly for security checks to run)
+**Requirements:** SEC-01, SEC-02, SEC-04, SEC-05
+**Success Criteria** (what must be TRUE):
+  1. A skill with `name: "../../../etc/passwd"` or `file: "../../secret"` is rejected during discovery with a clear error
+  2. `Filesystem.contains()` resolves symlinks via `fs.realpath()` before comparison, so a symlink pointing outside the project is correctly detected
+  3. OAuth callback HTML escapes `error_description` so injected `