feat(cli): EVAL.yaml → evals.json transpiler (agentv transpile) #606

Merged
christso merged 1 commit into main from
feat/598-eval-yaml-transpiler
Mar 15, 2026

@christso
Collaborator

Summary

  • Adds agentv transpile <input.yaml> CLI command that converts AgentV EVAL.yaml format to Agent Skills evals.json
  • Implements the full field mapping spec: prompt/files extraction from content blocks, trigger-judge → should_trigger, all assert types → natural language, root-level assertion distribution, multi-skill output
  • 43 unit tests covering all assertion types, input shorthands, multi-skill routing, and edge cases
  • Example EVAL.yaml + expected evals.json in examples/features/transpile/

Field mappings implemented

| EVAL.yaml | evals.json |
| --- | --- |
| tests[].input[role=user].content[type=text] | evals[].prompt |
| tests[].input[role=user].content[type=file][].value | evals[].files[] |
| tests[].input_files[] | evals[].files[] (shorthand) |
| tests[].expected_output | evals[].expected_output (flattened) |
| tests[].assert[type=trigger-judge].should_trigger | evals[].should_trigger (omitted if absent) |
| tests[].criteria + other assert types | evals[].assertions[] (NL strings) |
| Suite-level assert/assertions | appended to every test's assertions[] |
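
To make the mapping concrete, here is a minimal sketch of how a single test could be turned into one evals.json entry. It is not the code from this PR: the type names, the `toEvalEntry` helper, the `contains` assertion type, the wording of the generated sentence, and the plain-string form of `content` are all assumptions made for illustration. Only the field names on both sides come from the table above.

```typescript
// Hypothetical, simplified shapes. The real EVAL.yaml schema has more fields.
type ContentBlock =
  | { type: "text"; value: string }
  | { type: "file"; value: string };

interface Message {
  role: "user" | "assistant" | "system";
  content: string | ContentBlock[]; // plain string is an assumed shorthand
}

type Assertion =
  | { type: "trigger-judge"; should_trigger: boolean }
  | { type: "contains"; value: string }; // stand-in for "other assert types"

interface EvalTest {
  input: Message[];
  input_files?: string[];
  expected_output?: string;
  criteria?: string;
  assertions?: Assertion[];
  assert?: Assertion[]; // deprecated alias
}

interface EvalEntry {
  prompt: string;
  files: string[];
  expected_output?: string;
  should_trigger?: boolean;
  assertions: string[];
}

function toEvalEntry(test: EvalTest, suiteAssertions: Assertion[] = []): EvalEntry {
  const promptParts: string[] = [];
  const files: string[] = [...(test.input_files ?? [])];

  // Only user messages contribute to the prompt and the file list.
  for (const message of test.input) {
    if (message.role !== "user") continue;
    if (typeof message.content === "string") {
      promptParts.push(message.content);
      continue;
    }
    for (const block of message.content) {
      if (block.type === "text") promptParts.push(block.value);
      else files.push(block.value);
    }
  }

  const entry: EvalEntry = { prompt: promptParts.join("\n"), files, assertions: [] };
  if (test.expected_output !== undefined) entry.expected_output = test.expected_output;
  if (test.criteria) entry.assertions.push(test.criteria);

  // Prefer `assertions`, fall back to the deprecated `assert`, then append suite-level ones.
  const own = test.assertions ?? test.assert ?? [];
  for (const a of [...own, ...suiteAssertions]) {
    if (a.type === "trigger-judge") {
      entry.should_trigger = a.should_trigger; // never emitted as an assertion string
    } else {
      entry.assertions.push(`Output contains "${a.value}"`); // assumed NL wording
    }
  }
  return entry;
}
```

In this sketch `should_trigger` is left off the entry when no trigger-judge is present, matching the "omitted if absent" note in the table.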

Risk

Low — new additive feature, no existing behavior changed.

Closes #598

Adds a transpiler that converts AgentV EVAL.yaml format to Agent Skills
evals.json format for consumption by the skill-creator pipeline.

- New core module: eval-yaml-transpiler (transpileEvalYaml, transpileEvalYamlFile, getOutputFilenames)
- New CLI command: agentv transpile <input.yaml> [--out-dir dir] [--stdout]
- Handles both assertions: (current) and assert: (deprecated alias)
- Extracts prompt and files from structured content blocks (type: text/file)
- Converts all assertion types to natural language strings
- trigger-judge maps to should_trigger bool, not assertions
- Root-level assertions distributed to every test
- Multi-skill input produces one evals.json per skill
- Tests without trigger-judge assigned to dominant skill or _no-skill
- 43 unit tests covering all field mappings and edge cases
- Example EVAL.yaml and expected evals.json in examples/features/transpile/
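
The multi-skill routing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's code: `RoutedTest`, `dominantSkill`, and `groupBySkill` are hypothetical names, the `skill` field stands in for whatever the trigger-judge assertion uses to name its skill, and "dominant" is assumed to mean the skill named by the most tests. Only the `_no-skill` bucket name comes from the commit message.

```typescript
// Hypothetical shape: `skill` is set when the test has a trigger-judge naming a skill.
interface RoutedTest {
  id: string;
  skill?: string;
}

const NO_SKILL = "_no-skill";

// Assumption: the dominant skill is the one referenced by the most tests.
function dominantSkill(tests: RoutedTest[]): string {
  const counts: Record<string, number> = {};
  for (const t of tests) {
    if (t.skill !== undefined) counts[t.skill] = (counts[t.skill] ?? 0) + 1;
  }
  let best = NO_SKILL; // used when no test names a skill at all
  let bestCount = 0;
  for (const skill of Object.keys(counts)) {
    const count = counts[skill] ?? 0;
    if (count > bestCount) {
      best = skill;
      bestCount = count;
    }
  }
  return best;
}

// Groups tests so that each key would correspond to one evals.json output.
function groupBySkill(tests: RoutedTest[]): Record<string, RoutedTest[]> {
  const fallback = dominantSkill(tests);
  const groups: Record<string, RoutedTest[]> = {};
  for (const t of tests) {
    const key = t.skill ?? fallback;
    const bucket = groups[key] ?? [];
    bucket.push(t);
    groups[key] = bucket;
  }
  return groups;
}
```

Ties between equally frequent skills resolve to whichever was seen first here; how the real transpiler breaks ties is not stated in this PR.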
@christso christso merged commit 4028234 into main Mar 15, 2026
1 check was pending
@christso christso deleted the feat/598-eval-yaml-transpiler branch March 15, 2026 04:24