feat(skill-trigger): multi-provider detection + workspace-scoped skills #702

Merged: christso merged 4 commits into main from feat/skill-trigger-multi-provider on Mar 22, 2026
@christso (Collaborator) commented Mar 22, 2026

Summary

  • Fix skill-trigger evaluator to scan the full transcript instead of just the first tool call, so skills invoked later in a conversation are correctly detected
  • Add codex provider support via `CODEX_MATCHER` — detects skill usage via `command_execution` tool calls whose `command` field contains the skill file path
  • Fix codex provider input format — `command_execution` items now store `input: { command: string }` instead of `input: string`, enabling `readPathFromInput` to work correctly
  • Add pi-coding-agent support via `PI_CODING_AGENT_MATCHER`. Pi CLI emits `type: 'toolCall'` with `arguments` (not `tool_use`/`input`), so its tool calls were previously dropped silently; the new matcher uses lowercase `read` and the `path` field to match Pi's actual tool names
  • Add workspace-scoped skills to the multi-provider example eval — skills live in `workspace/` template (`.agents/skills/`, `.claude/skills/`, `.codex/skills/`), no global installation required
  • Add `AGENTS.md` to the workspace template so codex discovers available skills at the start of each session
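
The first Summary item (scan the whole transcript, not only the first tool call) can be sketched roughly as follows. This is an illustration under stated assumptions, not the evaluator's actual code: the `ToolCall` and `SkillMatcher` shapes, the `detectSkill` function, the matcher contents, the skill name `csv-analyzer`, and the `SKILL.md` file name are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real evaluator's types may differ.
interface ToolCall {
  tool: string;
  input: Record<string, unknown>;
}

interface SkillMatcher {
  toolNames: string[]; // tool names that can load a skill file
  inputField: string;  // input field holding the path or command
}

// Assumed matcher contents, loosely based on the PR description.
const CODEX_LIKE: SkillMatcher = { toolNames: ["command_execution"], inputField: "command" };
const PI_LIKE: SkillMatcher = { toolNames: ["read"], inputField: "path" };

// Returns true if ANY tool call in the transcript references the skill,
// rather than inspecting only the first call.
function detectSkill(transcript: ToolCall[], matcher: SkillMatcher, skill: string): boolean {
  return transcript.some((call) => {
    if (!matcher.toolNames.includes(call.tool)) return false;
    const value = call.input[matcher.inputField];
    return typeof value === "string" && value.includes(skill);
  });
}

// The skill file is read on the second call, after an unrelated first call.
const codexTranscript: ToolCall[] = [
  { tool: "command_execution", input: { command: "ls" } },
  { tool: "command_execution", input: { command: "sed -n '1,200p' .codex/skills/csv-analyzer/SKILL.md" } },
];
const piTranscript: ToolCall[] = [
  { tool: "read", input: { path: ".agents/skills/csv-analyzer/SKILL.md" } },
];

console.log(detectSkill(codexTranscript, CODEX_LIKE, "csv-analyzer"));             // true
console.log(detectSkill(codexTranscript.slice(0, 1), CODEX_LIKE, "csv-analyzer")); // false
console.log(detectSkill(piTranscript, PI_LIKE, "csv-analyzer"));                   // true
```

A first-call-only check would correspond to the second `console.log`, which misses the skill because the read happens later in the conversation.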

Test plan

  • `bun run test` passes (all skill-trigger + codex-sdk tests green)
  • `bun apps/cli/src/cli.ts eval examples/features/agent-skills-evals/multi-provider-skill-trigger.EVAL.yaml` passes 4/4 for the `copilot` target
  • Codex eval: `AGENTV_CODEX_STREAM_LOGS=false bun apps/cli/src/cli.ts eval examples/features/agent-skills-evals/multi-provider-skill-trigger.EVAL.yaml --target codex` passes 4/4

🤖 Generated with Claude Code

…skills

- Scan full transcript (not just first tool call) so preamble meta-skills
  like using-superpowers don't cause false negatives for copilot
- Add CODEX_MATCHER: detect command_execution reading skill files via bash
  sed commands; skill name appears in the command string
- Fix codex provider: store command_execution input as { command } object
  instead of raw string so readPathFromInput can extract the path
- Add workspace template with .agents/skills/ and .codex/skills/ dirs so
  evals are self-contained and don't rely on global skill installation
- Add 11 new unit tests covering full transcript scanning, codex bash
  detection, and pi-coding-agent provider resolution
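
The codex input-format fix described above can be illustrated with a small sketch. `readPathFromInput` is named in this PR, but its signature is not shown here, so `readFieldFromInput` below is a hypothetical stand-in. It only demonstrates why a raw string input defeats a field lookup while a `{ command }` object does not. The skill name and file name in the example command are made up.

```typescript
// Hypothetical stand-in for a field lookup such as readPathFromInput;
// the real helper's signature is not shown in this PR.
function readFieldFromInput(input: unknown, field: string): string | undefined {
  // A raw string has no fields to look up, so it falls through here.
  if (typeof input !== "object" || input === null) return undefined;
  const value = (input as Record<string, unknown>)[field];
  return typeof value === "string" ? value : undefined;
}

// Example command; the skill name and file name are illustrative only.
const command = "sed -n '1,120p' .codex/skills/csv-analyzer/SKILL.md";

const before = readFieldFromInput(command, "command");     // old shape: input is a string
const after = readFieldFromInput({ command }, "command");  // new shape: input is { command }

console.log(before);                          // undefined
console.log(after?.includes("csv-analyzer")); // true
```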

E2E results: copilot 4/4, codex 4/4, pi 2/4 (pi has no skill system)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cloudflare-workers-and-pages bot commented Mar 22, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 0a3a313
Status: ✅  Deploy successful!
Preview URL: https://e1943cfd.agentv.pages.dev
Branch Preview URL: https://feat-skill-trigger-multi-pro.agentv.pages.dev

christso and others added 2 commits March 22, 2026 07:21
…ATCHER

Pi CLI emits `type: 'toolCall'` with `arguments` (not `tool_use` / `input`),
so tool calls were silently dropped. Also add PI_CODING_AGENT_MATCHER using
lowercase `read` + `path` field to match Pi's actual tool names.
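
One way to picture the dropped-tool-call problem is a normalizer that accepts both event shapes. This is a sketch under assumptions, not the provider's real code: only `type: 'toolCall'`, `arguments`, `tool_use`, and `input` come from the commit message, while the `name` field, the `ContentItem` union, and `normalizeToolCalls` are hypothetical.

```typescript
// Hypothetical internal shape that downstream matchers would consume.
interface NormalizedToolCall {
  tool: string;
  input: Record<string, unknown>;
}

// Assumed event shapes; the `name` field is an assumption.
type ContentItem =
  | { type: "tool_use"; name: string; input: Record<string, unknown> }
  | { type: "toolCall"; name: string; arguments: Record<string, unknown> }
  | { type: "text"; text: string };

function normalizeToolCalls(items: ContentItem[]): NormalizedToolCall[] {
  const calls: NormalizedToolCall[] = [];
  for (const item of items) {
    if (item.type === "tool_use") {
      calls.push({ tool: item.name, input: item.input });
    } else if (item.type === "toolCall") {
      // Pi-style event: the payload lives under `arguments`, not `input`.
      calls.push({ tool: item.name, input: item.arguments });
    }
    // Other item types (e.g. text) carry no tool call and are skipped.
  }
  return calls;
}

const piItems: ContentItem[] = [
  { type: "text", text: "Reading the skill file." },
  { type: "toolCall", name: "read", arguments: { path: ".agents/skills/csv-analyzer/SKILL.md" } },
];

const normalized = normalizeToolCalls(piItems);
console.log(normalized.length);  // 1
console.log(normalized[0].tool); // read
```

A normalizer that handled only the `tool_use` branch would return an empty list for `piItems`, which matches the "silently dropped" behavior the commit describes.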

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ices

pi-agent-sdk has no tools (tools: []) so it cannot read files or invoke
skills. pi-coding-agent covers all the same use cases. Example targets.yaml
files now use openrouter directly for LLM grading.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The self-evaluation default target should use the coding agent,
not a direct LLM call. The offline-grader-benchmark targets
remain as openrouter since graders only need plain LLM calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@christso merged commit fcc301c into main on Mar 22, 2026
1 check passed
@christso deleted the feat/skill-trigger-multi-provider branch on March 22, 2026 08:14