feat(pairing): add multi-agent review pipeline skill and eval suite #269
Pre-flight self-review: PR #269 (pairing-multi-agent-review)
#269 · draft · author: · Base: main · Files changed: 52 (all added) · Diff size: +1038 / −3

A new Pairing-mode skill that fans the diff through three independent axis …

Correctness: No findings. Eval-spec ↔ expected.json keys match exactly across all 6 step …
Security: No findings. Standard injection-guard callout present in SKILL.md. The …
Conventions: No findings. skill-validate --strict clean for this skill (the one violation …

Summary: Ready, no blocking or advisory findings. Blocking: 0. Advisory: 0.
Hi @justinmclean, same situation as #228 / #229. The … Whenever you get a chance, a fresh rebase from your end would help. No rush.
Force-pushed: 063bb37 to 38f3280
@justinmclean I will rebase and resolve the conflicts so that we can rename things; we can continue working on it later :)
Implements work item 5 from the spec-loop plan: a new pairing-multi-agent-review skill that fans a local diff through three independent, axis-isolated review passes (correctness, security, conventions) and merges their findings into one structured report.

Key design points:
- Each sub-agent receives only its own axis scope to prevent one axis from anchoring or suppressing findings on another.
- Sub-agents run in parallel (single Agent tool call message).
- Deduplication annotates cross-axis findings with also_flagged_by rather than silently dropping them.
- Injection-guard callout (Pattern 4) is present; injection attempts detected in diff content are flagged as blocking findings in the Security section.
- Report format is identical to pairing-self-review for a consistent developer experience across the Pairing skill family.

Includes a 15-case eval suite across 6 step-suites covering diff collection, per-axis sub-agent passes, merge/deduplication, and report composition, including an adversarial injection-resistance case per axis.

Updates docs/modes.md to mark Pairing as experimental with 1 skill.

Generated-by: Claude (Opus 4.7)
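For readers who want a concrete picture of the fan-out and merge described in the commit message above, here is a minimal Python sketch. It is not the skill's implementation: the `Finding` fields, the toy substring rules in `review_axis`, the `(file, line)` deduplication key, and the use of a thread pool in place of parallel sub-agents are all assumptions made for illustration. Only the three axis names and the `also_flagged_by` annotation come from the PR text.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Axis names are taken from the PR description; everything else is hypothetical.
AXES = ("correctness", "security", "conventions")

# Toy per-axis rules standing in for a real sub-agent review pass.
TOY_RULES = {
    "correctness": [("== None", "compare to None with `is`")],
    "security": [("eval(", "eval on possibly untrusted input")],
    "conventions": [("eval(", "eval is discouraged by the style guide")],
}


@dataclass
class Finding:
    axis: str
    file: str
    line: int
    message: str
    also_flagged_by: list[str] = field(default_factory=list)


def review_axis(axis: str, diff: str) -> list[Finding]:
    """Hypothetical single-axis pass: it sees only its own axis rules."""
    findings = []
    for number, text in enumerate(diff.splitlines(), start=1):
        for needle, message in TOY_RULES[axis]:
            if needle in text:
                findings.append(Finding(axis, "example.py", number, message))
    return findings


def merge_findings(per_axis: dict[str, list[Finding]]) -> list[Finding]:
    """Deduplicate on (file, line), annotating rather than dropping overlaps."""
    merged: dict[tuple[str, int], Finding] = {}
    for axis in AXES:
        for finding in per_axis[axis]:
            key = (finding.file, finding.line)
            first = merged.get(key)
            if first is None:
                merged[key] = finding
            elif axis != first.axis and axis not in first.also_flagged_by:
                first.also_flagged_by.append(axis)
    return list(merged.values())


def run_pipeline(diff: str) -> list[Finding]:
    """Run the three axis passes concurrently, then merge their findings."""
    with ThreadPoolExecutor(max_workers=len(AXES)) as pool:
        futures = {axis: pool.submit(review_axis, axis, diff) for axis in AXES}
        per_axis = {axis: future.result() for axis, future in futures.items()}
    return merge_findings(per_axis)


if __name__ == "__main__":
    sample = "x = eval(user_input)\nif x == None:\n    pass\n"
    for item in run_pipeline(sample):
        print(item)
```

In this sketch the first axis to report a location owns the finding, and any later axis that hits the same location is recorded in `also_flagged_by`, so a cross-axis overlap stays visible in the merged report.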
…base onto main

Rebasing apache#269 onto current main surfaced a semantic conflict: main now requires a `capability` frontmatter key on every skill (enforced by the skill-and-tool validator), a rule that landed after this PR was opened.

Adds `capability: capability:review` to the skill (it is a Pairing-mode multi-agent code-review pipeline, matching pairing-self-review) and the matching capability->skill map row in docs/labels-and-capabilities.md.

Generated-by: Claude Code (Opus 4.8)
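To illustrate the kind of rule that caused the conflict, the sketch below checks a skill file for a `capability` frontmatter key. It is not the project's skill-and-tool validator: the function names, the `---` delimiter handling, and the flat `key: value` parsing are assumptions chosen to keep the example dependency-free.

```python
def frontmatter_keys(text: str) -> dict[str, str]:
    """Parse top-level `key: value` pairs from a `---`-delimited header.

    Hypothetical helper: handles only flat keys, not nested YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    keys: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        if ":" in line and not line.startswith((" ", "\t", "#")):
            # Split on the first colon only, so values such as
            # `capability:review` keep their own colon.
            key, _, value = line.partition(":")
            keys[key.strip()] = value.strip()
    return {}  # header was never closed, so treat it as absent


def missing_required(text: str, required: tuple[str, ...] = ("capability",)) -> list[str]:
    """Return the required frontmatter keys that are absent or empty."""
    keys = frontmatter_keys(text)
    return [name for name in required if not keys.get(name)]


if __name__ == "__main__":
    skill = "---\nname: pairing-multi-agent-review\ncapability: capability:review\n---\n# Skill\n"
    print(missing_required(skill))
```

A skill file written before the rule landed would come back with `capability` in the missing list, which is the situation the rebase had to fix.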
Force-pushed: 38f3280 to f0236ff