diff --git a/plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md b/plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md
index 17f32366f..d139c21dc 100644
--- a/plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md
+++ b/plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md
@@ -1,6 +1,12 @@
 ---
 name: agentv-eval-builder
-description: Create and maintain AgentV evaluation files for testing AI agent performance. Use this skill when creating new eval files, adding tests, configuring evaluators, or converting Agent Skills evals.json files to AgentV format.
+description: >-
+  Create and maintain AgentV EVAL.yaml / .eval.yaml evaluation files for testing AI agent output quality.
+  Use when asked to create new AgentV eval files, add test cases to an existing .eval.yaml,
+  configure AgentV evaluators (llm-judge, code-judge, rubrics), or convert evals.json to AgentV EVAL YAML format
+  using `agentv convert`.
+  Do NOT use for creating SKILL.md files, writing skill definitions, or building skill test suites —
+  those tasks belong to the skill-creator skill.
 ---
 
 # AgentV Eval Builder
diff --git a/plugins/agentv-dev/skills/agentv-eval-orchestrator/SKILL.md b/plugins/agentv-dev/skills/agentv-eval-orchestrator/SKILL.md
index 72afcafaa..77d8ceec4 100644
--- a/plugins/agentv-dev/skills/agentv-eval-orchestrator/SKILL.md
+++ b/plugins/agentv-dev/skills/agentv-eval-orchestrator/SKILL.md
@@ -1,6 +1,11 @@
 ---
 name: agentv-eval-orchestrator
-description: Run AgentV evaluations by orchestrating eval subcommands. Use this skill when asked to run evals, evaluate an agent, test prompt quality using agentv, or run Agent Skills evals.json files.
+description: >-
+  Run AgentV evaluations against EVAL.yaml / .eval.yaml / evals.json files using the `agentv prompt eval` and `agentv eval` CLI commands.
+  Use when asked to run AgentV evals, evaluate agent output quality with AgentV, execute an AgentV evaluation suite,
+  or orchestrate AgentV eval subcommands.
+  Do NOT use for creating or modifying SKILL.md files, packaging skills, optimizing skill trigger descriptions,
+  or measuring skill-creator performance — those tasks belong to the skill-creator skill.
 ---
 
 # AgentV Eval Orchestrator
diff --git a/plugins/agentv-dev/skills/agentv-optimizer/SKILL.md b/plugins/agentv-dev/skills/agentv-optimizer/SKILL.md
index 1436608cd..24bb46f12 100644
--- a/plugins/agentv-dev/skills/agentv-optimizer/SKILL.md
+++ b/plugins/agentv-dev/skills/agentv-optimizer/SKILL.md
@@ -1,6 +1,13 @@
 ---
 name: agentv-optimizer
-description: Optimize agent prompts through evaluation-driven refinement. Five-phase workflow (Discovery → Planning → Optimization → Polish → Handoff) that ensures evaluation integrity and keeps the user in control.
+description: >-
+  Optimize agent task prompts through AgentV evaluation-driven refinement using `agentv prompt eval` and EVAL.yaml files.
+  Five-phase workflow (Discovery → Planning → Optimization → Polish → Handoff) that iteratively improves prompts
+  based on AgentV eval scores.
+  Use when asked to optimize agent performance against AgentV evals, improve prompt quality using AgentV evaluation results,
+  or run the AgentV optimization loop.
+  Do NOT use for optimizing SKILL.md trigger descriptions, improving skill discoverability, or editing skill metadata —
+  those tasks belong to the skill-creator skill.
 ---
 
 # AgentV Optimizer
diff --git a/plugins/agentv-dev/skills/agentv-trace-analyst/SKILL.md b/plugins/agentv-dev/skills/agentv-trace-analyst/SKILL.md
index cdf31faa9..8af9ec56c 100644
--- a/plugins/agentv-dev/skills/agentv-trace-analyst/SKILL.md
+++ b/plugins/agentv-dev/skills/agentv-trace-analyst/SKILL.md
@@ -1,6 +1,12 @@
 ---
 name: agentv-trace-analyst
-description: Analyze AgentV evaluation traces using CLI primitives. Use when asked to inspect eval results, find regressions, identify failure patterns, analyze tool trajectories, compute cost/latency statistics, or reason about agent performance from trace data.
+description: >-
+  Analyze AgentV evaluation traces and result JSONL files using `agentv trace` and `agentv compare` CLI commands.
+  Use when asked to inspect AgentV eval results, find regressions between AgentV evaluation runs,
+  identify failure patterns in AgentV trace data, analyze tool trajectories, or compute cost/latency/score statistics
+  from AgentV result files.
+  Do NOT use for benchmarking skill trigger accuracy, analyzing skill-creator eval performance,
+  or measuring skill description quality — those tasks belong to the skill-creator skill.
 ---
 
 # AgentV Trace Analyst