From 6ea65541805abc7cbbc0effaeba871d15ced812c Mon Sep 17 00:00:00 2001
From: Jarek Potiuk
Date: Wed, 27 May 2026 12:32:50 +0200
Subject: [PATCH] feat(labels): capability taxonomy + validator enforcement + sync check
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Introduce a canonical capability taxonomy for the framework, enforce it
in the validator, and add a sync-check that keeps the docs and the live
source aligned.

== Taxonomy ==

Nine `capability:*` buckets orthogonal to the existing `area:*` labels:
triage, review, fix, intake, reconciliation (new), resolve, reassess,
stats, setup. Skills and tools may carry more than one capability when
they genuinely span lifecycle phases — `security-issue-fix` is
fix+resolve, `setup-isolated-setup-doctor` is setup+reassess, `cve-org`
is resolve+intake, etc.

docs/labels-and-capabilities.md is the canonical reference:
label-dimension definitions, capability bucket definitions, per-skill
map for all 30 skills, per-tool map for all 18 tools.

== Rule ==

AGENTS.md states the rule: every issue, PR, new tool, new skill, and
(where applicable) new doc declares its capabilities.

- Issues / PRs: `area:*` + every applicable `capability:*` label.
- New tools: `**Capability:** capability:NAME` (or `capability:NAME +
  capability:NAME` for multi-value) in the README first paragraph.
- New skills: `capability:` in SKILL.md frontmatter — single string or
  YAML list form.

`setup-override-upstream` now picks both labels before opening a
framework PR. `write-skill` requires the new frontmatter field on every
new-skill scaffold.

== Backfill ==

- All 30 existing skills got `capability:` added to their frontmatter
  via a one-shot script aligned with the per-skill map.
- 10 existing tool READMEs got a `**Capability:**` line.
- 8 tools without a README got a minimal stub README declaring
  capability + pointing at existing internal docs.

== Validator (renamed skill-validator -> skill-and-tool-validator) ==

The validator outgrew its name.
Renamed the directory, the Python module, the CLI entry point
(`skill-validate` -> `skill-and-tool-validate`), and updated every
cross-reference (.asf.yaml, .pre-commit-config.yaml, dependabot.yml,
tests.yml, CONTRIBUTING.md, write-skill, init_skill.py).

New checks:

- `capability` is now in REQUIRED_FRONTMATTER_KEYS. Validates both
  single-string and YAML-list forms; rejects values outside the
  9-bucket taxonomy.
- `validate_tools()` — every `tools//` must have a README declaring
  its capabilities. Both "missing README" and "missing capability
  line" are HARD violations.
- `validate_capability_sync()` — compares the two tables in
  docs/labels-and-capabilities.md against the live frontmatter + tool
  README declarations, bidirectionally. Drift in either direction is a
  HARD violation. Italic-parenthetical future-state notes
  (`*(+ capability:X once #N lands)*`) are stripped before comparison
  so the doc can flag planned capabilities without tripping the check.

Prek hook trigger expanded so the sync check fires on
`tools/*/README.md` and `docs/labels-and-capabilities.md` changes, not
just skill files.

== Tests ==

12 new tests across 3 classes — single + list + missing + invalid +
list-with-invalid for the frontmatter check; valid + missing-readme +
missing-cap + invalid + multi-value + regex regression guard for the
tool check; aligned + skill-doc-no-live + live-skill-no-doc +
skill-mismatch + tool-doc-no-live + live-tool-no-doc + italic-parens
for the sync check. All 218 tests green.

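For readers skimming the message, the two declaration checks described
above can be sketched roughly as follows. This is an illustration under
stated assumptions, not the code in this patch: `invalid_capabilities`,
`readme_capabilities`, `ALLOWED` and `CAPABILITY_LINE` are hypothetical
names, and the validator's actual regex and function layout may differ.

```python
import re

# The nine buckets from docs/labels-and-capabilities.md, in label form.
ALLOWED = {
    f"capability:{name}"
    for name in (
        "triage", "review", "fix", "intake", "reconciliation",
        "resolve", "reassess", "stats", "setup",
    )
}

# Assumed shape of the README declaration line; the validator's real
# regex is not reproduced here and may differ.
CAPABILITY_LINE = re.compile(
    r"^\*\*Capability:\*\*[ \t]+"
    r"(capability:[a-z-]+(?:[ \t]*\+[ \t]*capability:[a-z-]+)*)[ \t]*$",
    re.MULTILINE,
)


def invalid_capabilities(value):
    """Return entries of a frontmatter `capability` value outside the taxonomy.

    Accepts a single string or a list of strings (the two frontmatter forms).
    """
    values = [value] if isinstance(value, str) else list(value or [])
    return [v for v in values if v not in ALLOWED]


def readme_capabilities(readme_text):
    """Return the labels declared in a tool README, or None if no line is found."""
    match = CAPABILITY_LINE.search(readme_text)
    if match is None:
        return None
    return [label.strip() for label in match.group(1).split("+")]
```

In this sketch a missing declaration line yields `None`, and out-of-taxonomy
values come back as a list, so a caller could report the two as separate
violations.
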
Generated-by: Claude Code (Opus 4.7)
---
 .../skills/contributor-nomination/SKILL.md | 1 +
 .claude/skills/issue-fix-workflow/SKILL.md | 1 +
 .claude/skills/issue-reassess-stats/SKILL.md | 1 +
 .claude/skills/issue-reassess/SKILL.md | 1 +
 .claude/skills/issue-reproducer/SKILL.md | 1 +
 .claude/skills/issue-triage/SKILL.md | 1 +
 .claude/skills/list-steward-skills/SKILL.md | 1 +
 .claude/skills/pairing-self-review/SKILL.md | 1 +
 .../skills/pr-management-code-review/SKILL.md | 1 +
 .claude/skills/pr-management-mentor/SKILL.md | 1 +
 .claude/skills/pr-management-stats/SKILL.md | 1 +
 .claude/skills/pr-management-triage/SKILL.md | 1 +
 .claude/skills/security-cve-allocate/SKILL.md | 1 +
 .../security-issue-deduplicate/SKILL.md | 1 +
 .claude/skills/security-issue-fix/SKILL.md | 3 +
 .../security-issue-import-from-md/SKILL.md | 1 +
 .../security-issue-import-from-pr/SKILL.md | 1 +
 .claude/skills/security-issue-import/SKILL.md | 1 +
 .../skills/security-issue-invalidate/SKILL.md | 1 +
 .claude/skills/security-issue-sync/SKILL.md | 1 +
 .claude/skills/security-issue-triage/SKILL.md | 1 +
 .../security-tracker-stats-dashboard/SKILL.md | 1 +
 .../setup-isolated-setup-doctor/SKILL.md | 3 +
 .../setup-isolated-setup-install/SKILL.md | 1 +
 .../setup-isolated-setup-update/SKILL.md | 1 +
 .../setup-isolated-setup-verify/SKILL.md | 1 +
 .../skills/setup-override-upstream/SKILL.md | 23 +-
 .../skills/setup-shared-config-sync/SKILL.md | 3 +
 .claude/skills/setup-steward/SKILL.md | 1 +
 .claude/skills/write-skill/SKILL.md | 35 +-
 .../skills/write-skill/scripts/init_skill.py | 2 +-
 .../skills/write-skill/security-checklist.md | 2 +-
 .github/dependabot.yml | 4 +-
 .github/workflows/tests.yml | 4 +-
 .pre-commit-config.yaml | 47 ++-
 AGENTS.md | 61 +++
 CONTRIBUTING.md | 16 +-
 docs/labels-and-capabilities.md | 279 +++++++++++++
 tools/agent-isolation/README.md | 2 +
 tools/cve-org/README.md | 16 +
 tools/dashboard-generator/README.md | 2 +
 tools/dev/README.md | 16 +
 tools/github/README.md | 16 +
 tools/gmail/README.md | 16 +
 tools/jira/README.md | 2 +
 tools/mail-source/README.md | 16 +
 tools/ponymail/README.md | 16 +
 tools/pr-management-stats/README.md | 2 +
 tools/privacy-llm/README.md | 16 +
 tools/probe-templates/README.md | 2 +
 tools/sandbox-lint/README.md | 2 +
 .../README.md | 2 +
 .../README.md | 10 +-
 .../pyproject.toml | 6 +-
 .../src/skill_and_tool_validator}/__init__.py | 344 +++++++++++++++-
 .../tests/test_validator.py | 381 ++++++++++++++++--
 .../uv.lock | 2 +-
 tools/skill-evals/README.md | 2 +
 tools/spec-loop/README.md | 2 +
 tools/spec-status-index/README.md | 2 +
 tools/vulnogram/README.md | 16 +
 61 files changed, 1312 insertions(+), 86 deletions(-)
 create mode 100644 docs/labels-and-capabilities.md
 create mode 100644 tools/cve-org/README.md
 create mode 100644 tools/dev/README.md
 create mode 100644 tools/github/README.md
 create mode 100644 tools/gmail/README.md
 create mode 100644 tools/mail-source/README.md
 create mode 100644 tools/ponymail/README.md
 create mode 100644 tools/privacy-llm/README.md
 rename tools/{skill-validator => skill-and-tool-validator}/README.md (91%)
 rename tools/{skill-validator => skill-and-tool-validator}/pyproject.toml (93%)
 rename tools/{skill-validator/src/skill_validator => skill-and-tool-validator/src/skill_and_tool_validator}/__init__.py (79%)
 rename tools/{skill-validator => skill-and-tool-validator}/tests/test_validator.py (83%)
 rename tools/{skill-validator => skill-and-tool-validator}/uv.lock (99%)
 create mode 100644 tools/vulnogram/README.md

diff --git a/.claude/skills/contributor-nomination/SKILL.md b/.claude/skills/contributor-nomination/SKILL.md
index bed219fd..11bbc781 100644
--- a/.claude/skills/contributor-nomination/SKILL.md
+++ b/.claude/skills/contributor-nomination/SKILL.md
@@ -17,6 +17,7 @@ when_to_use: |
 provided and the user has not indicated they want to assess a contributor.
 argument-hint: " [window:Nm] [target:committer|pmc]"
+capability: capability:stats
 license: Apache-2.0
 ---
diff --git a/.claude/skills/issue-fix-workflow/SKILL.md b/.claude/skills/issue-fix-workflow/SKILL.md
index 2895510e..1cc42583 100644
--- a/.claude/skills/issue-fix-workflow/SKILL.md
+++ b/.claude/skills/issue-fix-workflow/SKILL.md
@@ -16,6 +16,7 @@ when_to_use: |
 to `issue-triage` for issues classified BUG or FEATURE-REQUEST. Skip when the fix is non-trivial enough to need design discussion — those go through an RFC first.
+capability: capability:fix
 license: Apache-2.0
 ---
diff --git a/.claude/skills/issue-reassess-stats/SKILL.md b/.claude/skills/issue-reassess-stats/SKILL.md
index 2fff08c7..d440cc02 100644
--- a/.claude/skills/issue-reassess-stats/SKILL.md
+++ b/.claude/skills/issue-reassess-stats/SKILL.md
@@ -13,6 +13,7 @@ when_to_use: |
 "which issues still fail across pool runs". Also as a pre-release check on whether the EOL pool has dropped, and as a periodic health-of-the-backlog view.
+capability: capability:stats
 license: Apache-2.0
 ---
diff --git a/.claude/skills/issue-reassess/SKILL.md b/.claude/skills/issue-reassess/SKILL.md
index e2d36a8b..e5d270f7 100644
--- a/.claude/skills/issue-reassess/SKILL.md
+++ b/.claude/skills/issue-reassess/SKILL.md
@@ -17,6 +17,7 @@ when_to_use: |
 audit before releases or after a major version cut. Skip when the goal is per-PR triage — that is `pr-management-triage` — or when the issues are still in active triage flow.
+capability: capability:reassess
 license: Apache-2.0
 ---
diff --git a/.claude/skills/issue-reproducer/SKILL.md b/.claude/skills/issue-reproducer/SKILL.md
index c404d0c5..b97515d6 100644
--- a/.claude/skills/issue-reproducer/SKILL.md
+++ b/.claude/skills/issue-reproducer/SKILL.md
@@ -16,6 +16,7 @@ when_to_use: |
 issue in its candidate set. Skip when the issue does not carry runnable example code — use `issue-triage` to assess instead.
+capability: capability:reassess
 license: Apache-2.0
 ---
diff --git a/.claude/skills/issue-triage/SKILL.md b/.claude/skills/issue-triage/SKILL.md
index 0fb4cf4f..ce9ca751 100644
--- a/.claude/skills/issue-triage/SKILL.md
+++ b/.claude/skills/issue-triage/SKILL.md
@@ -16,6 +16,7 @@ when_to_use: |
 Skip when team consensus has landed — invoke `/issue-fix-workflow` for confirmed bugs or the appropriate closure flow directly.
+capability: capability:triage
 license: Apache-2.0
 ---
diff --git a/.claude/skills/list-steward-skills/SKILL.md b/.claude/skills/list-steward-skills/SKILL.md
index 8398e7b9..032b2e4d 100644
--- a/.claude/skills/list-steward-skills/SKILL.md
+++ b/.claude/skills/list-steward-skills/SKILL.md
@@ -15,6 +15,7 @@ when_to_use: |
 repository — agents route via the live frontmatter `description` field directly and do not need this index to choose a skill.
+capability: capability:stats
 license: Apache-2.0
 ---
diff --git a/.claude/skills/pairing-self-review/SKILL.md b/.claude/skills/pairing-self-review/SKILL.md
index dd91ed55..980c0fdb 100644
--- a/.claude/skills/pairing-self-review/SKILL.md
+++ b/.claude/skills/pairing-self-review/SKILL.md
@@ -15,6 +15,7 @@ when_to_use: |
 whether their branch is ready before requesting a human maintainer review. Skip when a PR is already open — use `pr-management-code-review` for that.
 argument-hint: "[base:] [staged] [path:]"
+capability: capability:review
 license: Apache-2.0
 ---
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [Labels and capabilities](#labels-and-capabilities)
+  - [Label dimensions](#label-dimensions)
+    - [1. `area:*` — subject](#1-area--subject)
+    - [2. `capability:*` — what the tool does](#2-capability--what-the-tool-does)
+    - [3. `kind:*` — change type (pre-existing)](#3-kind--change-type-pre-existing)
+    - [4. `mode:*` — handling mode (pre-existing)](#4-mode--handling-mode-pre-existing)
+    - [Standalone labels](#standalone-labels)
+  - [Capability to skill map](#capability-to-skill-map)
+  - [Capability to tool map](#capability-to-tool-map)
+  - [The rule](#the-rule)
+    - [A GitHub issue](#a-github-issue)
+    - [A pull request](#a-pull-request)
+    - [A new tool under `tools/`](#a-new-tool-under-tools)
+    - [A new skill under `.claude/skills/`](#a-new-skill-under-claudeskills)
+    - [A new doc under `docs/`](#a-new-doc-under-docs)
+  - [Why this exists](#why-this-exists)
+
+
+
+
+
+# Labels and capabilities
+
+This page is the canonical reference for the label taxonomy used on
+issues and pull requests in this framework repository
+(`apache/airflow-steward`). It also defines the **capability** model
+that classifies what each skill or tool in the framework actually
+*does*, independent of which subject area it sits under.
+
+Every issue and pull request opened against this repository should
+carry at least one **`area:*`** label and at least one
+**`capability:*`** label. New tools and new skills must declare their
+capability up front (see [The rule](#the-rule)).
+
+> **Scope caveat.** This taxonomy applies to *this framework
+> repository*. Skills that create issues or PRs on an **adopter's
+> tracker** (e.g. `security-issue-import`, `security-issue-fix`,
+> `issue-fix-workflow`) use the adopter's own label scheme — adopters
+> are free to mirror this taxonomy in their own repo but are not
+> required to.
+
+---
+
+## Label dimensions
+
+The repository's labels fall into four orthogonal dimensions. An issue
+or PR typically carries one label from each dimension that applies.
+
+### 1. `area:*` — subject
+
+What part of the framework does this touch?
+
+| Label | Covers |
+|---|---|
+| `area:pr-management` | `pr-management-*` skills |
+| `area:security` | `security-*` skills, `security-tracker-stats-dashboard` |
+| `area:setup` | `setup-*` skills, framework adoption, agent-sandbox setup |
+| `area:issue` | `issue-*` skills (`issue-triage`, `issue-fix-workflow`, `issue-reassess`, `issue-reassess-stats`, `issue-reproducer`) |
+| `area:tools` | Substrate tools under `tools/*` (CLI bridges, agent-runtime adapters, mail-source backends) |
+| `area:ci` | `.github/` workflows, prek, validators |
+| `area:docs` | `docs/`, `MISSION.md`, READMEs |
+
+### 2. `capability:*` — what the tool does
+
+Nine buckets. A tool or skill carries **one or more** `capability:*`
+labels. Most map cleanly to a single bucket; dual-capability cases
+are real and explicitly enumerated below. Issues and PRs follow the
+same rule — apply every capability the change is implementing.
+
+When a skill or tool spans multiple capabilities, list **all** of
+them in its frontmatter / README. Do not pick a single "primary"
+to be neat; that loses information the label system exists to
+surface.
+
+| Label | Definition |
+|---|---|
+| `capability:triage` | Sweep a queue, classify candidates, propose dispositions for human confirmation. |
+| `capability:review` | Deep per-item code review of a PR or local diff; also contributor mentoring (single-item teaching intervention). |
+| `capability:fix` | Implement a code change against an upstream repo to resolve a triaged issue. |
+| `capability:intake` | Import external signal (mailing list, scan report, public PR) into a tracker entry, or keep an existing entry reconciled with one of those sources. |
+| `capability:reconciliation` | Compare tracker state against an external inventory (e.g. ASF security dashboard, organization-wide issue registry); surface drift; propose corrections. Does **not** write to either source. |
+| `capability:resolve` | Close-out actions: invalidate, dedupe, CVE-allocate, post-announcement housekeeping. |
+| `capability:reassess` | Re-run resolved or end-of-life issues against current code to verify still-fixed / still-broken. |
+| `capability:stats` | Read-only dashboards, metrics, governance evidence, contributor nomination briefs. |
+| `capability:setup` | Framework / agent / substrate infrastructure: install, verify, update, doctor, override-upstream, write-skill, plus new tools under `tools/*`. |
+
+The `capability:*` dimension is **orthogonal** to `area:*`. A single
+query can answer "how is our triage stack doing across PR + issue +
+security?" by filtering on `capability:triage` alone, without
+enumerating per-area queries.
+
+### 3. `kind:*` — change type (pre-existing)
+
+| Label | Covers |
+|---|---|
+| `kind:dx` | Maintainer dev-loop / CLI UX |
+| `kind:policy` | Rule changes (eligibility, thresholds, behaviour switches) |
+| `kind:perf` | Token / latency / API-call budget |
+| `kind:adopter-config` | Per-adopter knob |
+
+### 4. `mode:*` — handling mode (pre-existing)
+
+| Label | Covers |
+|---|---|
+| `mode:A` | Mode A — triage |
+| `mode:B` | Mode B — mentoring |
+| `mode:C` | Mode C — agent-authored fix with human review |
+| `mode:D` | Mode D — narrowly-scoped auto-merge (off until A/B/C run 2 quarters) |
+| `mode:cross-cutting` | Spans multiple modes |
+| `mode:platform` | Substrate / infra — not a mode (sandbox, CI, validators) |
+
+### Standalone labels
+
+`marketing` (branding artefacts), `dependencies` (dependency-update
+PRs), `python:uv` (Python uv-managed code), plus the default GitHub
+labels (`bug`, `enhancement`, `documentation`, `good first issue`,
+etc.).
+
+---
+
+## Capability to skill map
+
+Capabilities for every skill currently in
+[`.claude/skills/`](../.claude/skills/). Skills with two values
+(separated by `+`) carry both labels.
+
+| Skill | Capability / capabilities |
+|---|---|
+| `pr-management-triage` | `capability:triage` |
+| `issue-triage` | `capability:triage` |
+| `security-issue-triage` | `capability:triage` |
+| `pr-management-code-review` | `capability:review` |
+| `pairing-self-review` | `capability:review` |
+| `pr-management-mentor` | `capability:review` |
+| `issue-fix-workflow` | `capability:fix` |
+| `security-issue-fix` | `capability:fix` + `capability:resolve` *(opens the PR that closes the tracker — both phases)* |
+| `security-issue-import` | `capability:intake` |
+| `security-issue-import-from-md` | `capability:intake` |
+| `security-issue-import-from-pr` | `capability:intake` |
+| `security-issue-sync` | `capability:intake` *(+ `capability:reconciliation` once [#337](https://github.com/apache/airflow-steward/issues/337) lands the ASF-dashboard step)* |
+| `setup-shared-config-sync` | `capability:intake` + `capability:setup` *(reconciles user-scope config to a sync repo; the act is intake, the subject is setup)* |
+| `security-cve-allocate` | `capability:resolve` |
+| `security-issue-invalidate` | `capability:resolve` |
+| `security-issue-deduplicate` | `capability:resolve` |
+| `issue-reassess` | `capability:reassess` |
+| `issue-reproducer` | `capability:reassess` |
+| `pr-management-stats` | `capability:stats` |
+| `issue-reassess-stats` | `capability:stats` |
+| `security-tracker-stats-dashboard` | `capability:stats` |
+| `contributor-nomination` | `capability:stats` |
+| `list-steward-skills` | `capability:stats` |
+| `setup-steward` | `capability:setup` |
+| `setup-isolated-setup-install` | `capability:setup` |
+| `setup-isolated-setup-verify` | `capability:setup` |
+| `setup-isolated-setup-update` | `capability:setup` |
+| `setup-isolated-setup-doctor` | `capability:setup` + `capability:reassess` *(re-checks an installed sandbox against current spec — the phase is reassess on subject setup)* |
+| `setup-override-upstream` | `capability:setup` |
+| `write-skill` | `capability:setup` |
+
+## Capability to tool map
+
+Tools under [`tools/`](../tools/). Tools with two values (separated by
+`+`) carry both labels — the dual role is explained in each row.
+
+| Tool | Capability / capabilities | Role |
+|---|---|---|
+| [`tools/agent-isolation`](../tools/agent-isolation/) | `capability:setup` | Secure-agent sandbox helpers |
+| [`tools/cve-org`](../tools/cve-org/) | `capability:resolve` + `capability:intake` | Publishes to CVE.org *(resolve)* and records the resulting CVE state back into the tracker *(intake)* |
+| [`tools/dashboard-generator`](../tools/dashboard-generator/) | `capability:stats` | Self-contained HTML dashboard generator |
+| [`tools/dev`](../tools/dev/) | `capability:setup` | Framework dev-loop helpers |
+| [`tools/github`](../tools/github/) | `capability:setup` | GitHub REST / GraphQL substrate (called by every lifecycle phase — pure substrate, no single phase) |
+| [`tools/gmail`](../tools/gmail/) | `capability:setup` | Gmail API substrate |
+| [`tools/jira`](../tools/jira/) | `capability:setup` | JIRA REST substrate (read-only today; write subcommands tracked in [#301](https://github.com/apache/airflow-steward/issues/301)) |
+| [`tools/mail-source`](../tools/mail-source/) | `capability:setup` + `capability:intake` | Mail-source backend abstraction (mbox / IMAP / Mailman 3); the abstraction is setup, every concrete read is part of the intake pipeline |
+| [`tools/ponymail`](../tools/ponymail/) | `capability:setup` + `capability:intake` | PonyMail archive substrate; same dual role as `mail-source` — substrate plus an intake-pipeline component |
+| [`tools/pr-management-stats`](../tools/pr-management-stats/) | `capability:stats` | PR-backlog analytics engine |
+| [`tools/privacy-llm`](../tools/privacy-llm/) | `capability:setup` | Privacy-LLM PII-scrubbing gate |
+| [`tools/probe-templates`](../tools/probe-templates/) | `capability:setup` | Sandbox-doctor probe templates |
+| [`tools/sandbox-lint`](../tools/sandbox-lint/) | `capability:setup` | Sandbox settings linter |
+| [`tools/security-tracker-stats-dashboard`](../tools/security-tracker-stats-dashboard/) | `capability:stats` | Security-tracker analytics engine |
+| [`tools/spec-loop`](../tools/spec-loop/) | `capability:setup` | Spec-driven build loop runner (Ralph-style) for framework development |
+| [`tools/skill-evals`](../tools/skill-evals/) | `capability:setup` + `capability:stats` | Eval harness for skills; the harness is setup infrastructure, the run output is governance evidence |
+| [`tools/skill-and-tool-validator`](../tools/skill-and-tool-validator/) | `capability:setup` | Skill-frontmatter and convention validator |
+| [`tools/spec-status-index`](../tools/spec-status-index/) | `capability:setup` + `capability:stats` | Index of spec / RFC implementation status — substrate that also doubles as a governance/stats view |
+| [`tools/vulnogram`](../tools/vulnogram/) | `capability:resolve` | ASF Vulnogram CVE-allocation client |
+
+A tool's capabilities are determined by its **use-case lifecycle
+phases**, not by which skills happen to consume it. `tools/github` is
+called by every triage / intake / fix / resolve skill but is tagged
+only `capability:setup` because it doesn't encode any one lifecycle
+phase — it is pure substrate. `tools/cve-org`, by contrast, exists
+specifically to *do* CVE publication and to record that result; both
+the resolve action and the intake of state into the tracker are
+first-class jobs of the tool, so it carries both labels.
+
+When a tool grows to serve a new lifecycle phase as a first-class
+feature (rather than as generic substrate that other skills happen
+to compose), add the new `capability:*` label to its README and to
+the table above.
+
+---
+
+## The rule
+
+When you create any of the following on this repository, declare the
+capability:
+
+### A GitHub issue
+
+Apply at least one `area:*` AND one `capability:*` label. If the issue
+genuinely spans capabilities, apply both — for example,
+[#337](https://github.com/apache/airflow-steward/issues/337) carries
+both `capability:reconciliation` and `capability:setup` because it
+covers a new substrate tool *and* a new sync-flow integration.
+
+### A pull request
+
+Same: `area:*` AND `capability:*`. Match the capability the change is
+*implementing*, not the file paths it happens to touch. A PR that
+adjusts the validator config to support a new triage rule is
+`capability:triage` (the change's purpose), not `capability:setup`
+(the file it edited).
+
+### A new tool under `tools/`
+
+Declare the tool's capability in the **first paragraph of its README**
+using the line:
+
+```markdown
+**Capability:** capability:NAME
+```
+
+If the tool serves more than one capability, list both. Substrate
+bridges (`tools/github`, `tools/gmail`, …) default to
+`capability:setup` unless they encode a specific lifecycle capability.
+
+### A new skill under `.claude/skills/`
+
+Declare the capability in the skill's frontmatter:
+
+```yaml
+---
+name: my-new-skill
+description: |
+  ...
+capability: capability:NAME
+---
+```
+
+The [`write-skill`](../.claude/skills/write-skill/SKILL.md) skill
+prompts for this on every new-skill scaffold.
+
+### A new doc under `docs/`
+
+Capability-specific docs (e.g. a guide for a single skill family)
+should link to this page and name the capability in their first
+paragraph. Cross-cutting docs (`MISSION.md`, top-level READMEs) need
+no capability marker.
+
+---
+
+## Why this exists
+
+The original `area:*` labels split issues by subject — useful for
+"what part of the codebase is this?" but unable to answer "what kind
+of thing is this?". The `capability:*` dimension fills that gap and
+is orthogonal: a triage-rule change in PR management
+(`area:pr-management` + `capability:triage`) and a triage-rule change
+in security (`area:security` + `capability:triage`) become trivially
+findable as a cohort even though they live in different families.
+
+Capability is also a forcing function for skill design: if a new skill
+doesn't fit any of the nine buckets cleanly, that's a signal worth
+inspecting before the skill ships.
diff --git a/tools/agent-isolation/README.md b/tools/agent-isolation/README.md
index b6e93f1b..7e745505 100644
--- a/tools/agent-isolation/README.md
+++ b/tools/agent-isolation/README.md
@@ -14,6 +14,8 @@
 # `tools/agent-isolation/` — secure agent setup helpers
+**Capability:** capability:setup
+
 This directory ships the moving pieces the framework's [`docs/setup/secure-agent-setup.md`](../../docs/setup/secure-agent-setup.md) document references. It is not a Python project (unlike the sibling tools
diff --git a/tools/cve-org/README.md b/tools/cve-org/README.md
new file mode 100644
index 00000000..37a0e4ce
--- /dev/null
+++ b/tools/cve-org/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/cve-org/`](#toolscve-org)
+
+
+
+
+
+# `tools/cve-org/`
+
+**Capability:** capability:resolve + capability:intake
+
+CVE.org publication client. Submits CVE records via the CVE.org REST API; consumed by `security-cve-allocate` once a CVE has been allocated via the ASF Vulnogram path. See [`tool.md`](tool.md) for the protocol detail and `cve.org` field mapping.
diff --git a/tools/dashboard-generator/README.md b/tools/dashboard-generator/README.md
index c740a5f7..20e8a159 100644
--- a/tools/dashboard-generator/README.md
+++ b/tools/dashboard-generator/README.md
@@ -17,6 +17,8 @@
 # Dashboard generator
+**Capability:** capability:stats
+
 Deterministic reference implementations of the dashboard that [`issue-reassess-stats`](../../.claude/skills/issue-reassess-stats/SKILL.md) produces. Adopters who want CI-rendered dashboards (refreshed on
diff --git a/tools/dev/README.md b/tools/dev/README.md
new file mode 100644
index 00000000..f0574f4f
--- /dev/null
+++ b/tools/dev/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/dev/`](#toolsdev)
+
+
+
+
+
+# `tools/dev/`
+
+**Capability:** capability:setup
+
+Framework dev-loop helpers (placeholder check, agent pre-commit hook). Invoked by prek and CI; not consumed by any skill directly. See the individual scripts in this directory for usage.
diff --git a/tools/github/README.md b/tools/github/README.md
new file mode 100644
index 00000000..478e6477
--- /dev/null
+++ b/tools/github/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/github/`](#toolsgithub)
+
+
+
+
+
+# `tools/github/`
+
+**Capability:** capability:setup
+
+GitHub REST + GraphQL substrate. Pure read/write wrapper used by every lifecycle phase (triage / intake / fix / resolve / stats). See [`tool.md`](tool.md) for the operation catalogue and the per-area files ([`issue-template.md`](issue-template.md), [`labels.md`](labels.md), [`operations.md`](operations.md), [`project-board.md`](project-board.md), [`status-rollup.md`](status-rollup.md)) for specifics.
diff --git a/tools/gmail/README.md b/tools/gmail/README.md
new file mode 100644
index 00000000..73af9ba8
--- /dev/null
+++ b/tools/gmail/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/gmail/`](#toolsgmail)
+
+
+
+
+
+# `tools/gmail/`
+
+**Capability:** capability:setup
+
+Gmail API substrate. Read + draft-only — never sends. Used by the security-issue-import / sync / invalidate flows for inbound report intake and outbound courtesy-reply drafting. See [`tool.md`](tool.md) for the operation catalogue and the per-area files for ASF relay routing, draft backends, threading, search queries.
diff --git a/tools/jira/README.md b/tools/jira/README.md
index 89e8b789..98955e86 100644
--- a/tools/jira/README.md
+++ b/tools/jira/README.md
@@ -20,6 +20,8 @@
 # JIRA bridge
+**Capability:** capability:setup
+
 Read-only JIRA REST helpers for the `issue-*` skill family. Adopters with JIRA-based issue trackers wire this in as their tracker bridge; adopters using GitHub Issues or other trackers
diff --git a/tools/mail-source/README.md b/tools/mail-source/README.md
new file mode 100644
index 00000000..8aa915a8
--- /dev/null
+++ b/tools/mail-source/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/mail-source/`](#toolsmail-source)
+
+
+
+
+
+# `tools/mail-source/`
+
+**Capability:** capability:setup + capability:intake
+
+Mail-source backend abstraction. Pluggable backends (mbox, IMAP, future Mailman 3 / Hyperkitty) that feed the security-issue-import intake pipeline a uniform thread/message view. See [`contract.md`](contract.md) for the backend interface.
diff --git a/tools/ponymail/README.md b/tools/ponymail/README.md
new file mode 100644
index 00000000..7617de54
--- /dev/null
+++ b/tools/ponymail/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/ponymail/`](#toolsponymail)
+
+
+
+
+
+# `tools/ponymail/`
+
+**Capability:** capability:setup + capability:intake
+
+PonyMail archive substrate. Read-only ASF mailing-list archive client; complements `gmail` for threads not present in the inbox. Used by security-issue-import + sync to cross-reference public mailing-list discussions. See [`tool.md`](tool.md) for the operation catalogue and [`operations.md`](operations.md) for usage.
diff --git a/tools/pr-management-stats/README.md b/tools/pr-management-stats/README.md
index cab47f5e..017aea96 100644
--- a/tools/pr-management-stats/README.md
+++ b/tools/pr-management-stats/README.md
@@ -16,6 +16,8 @@
 # pr-management-stats reference implementation
+**Capability:** capability:stats
+
 Deterministic reference implementation of the data-fetch + classification contract that backs the [`pr-management-stats`](../../.claude/skills/pr-management-stats/SKILL.md) skill.
diff --git a/tools/privacy-llm/README.md b/tools/privacy-llm/README.md
new file mode 100644
index 00000000..1443451e
--- /dev/null
+++ b/tools/privacy-llm/README.md
@@ -0,0 +1,16 @@
+
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/privacy-llm/`](#toolsprivacy-llm)
+
+
+
+
+
+# `tools/privacy-llm/`
+
+**Capability:** capability:setup
+
+Privacy-LLM PII-scrubbing gate. Standalone redactor / checker pair that screens content for PII before it reaches an external LLM. See [`tool.md`](tool.md) and [`wiring.md`](wiring.md) for integration details, [`models.md`](models.md) for the model catalogue, and [`pii.md`](pii.md) for the PII taxonomy.
diff --git a/tools/probe-templates/README.md b/tools/probe-templates/README.md
index 4a659042..90207614 100644
--- a/tools/probe-templates/README.md
+++ b/tools/probe-templates/README.md
@@ -15,6 +15,8 @@
 # Probe templates
+**Capability:** capability:setup
+
 Runnable cross-family probe scripts that the [`issue-reproducer`](../../.claude/skills/issue-reproducer/SKILL.md) skill copies from when its Step 9 (optional cross-family probe)
diff --git a/tools/sandbox-lint/README.md b/tools/sandbox-lint/README.md
index 456f95c5..70300360 100644
--- a/tools/sandbox-lint/README.md
+++ b/tools/sandbox-lint/README.md
@@ -16,6 +16,8 @@
 # `sandbox-lint`
+**Capability:** capability:setup
+
 Lints `.claude/settings.json` against the shipped baseline at `tools/sandbox-lint/expected.json`, and against the security invariants documented in `docs/security/threat-model.md`
diff --git a/tools/security-tracker-stats-dashboard/README.md b/tools/security-tracker-stats-dashboard/README.md
index 06dccedd..8afd9387 100644
--- a/tools/security-tracker-stats-dashboard/README.md
+++ b/tools/security-tracker-stats-dashboard/README.md
@@ -21,6 +21,8 @@
 # security-tracker-stats-dashboard
+**Capability:** capability:stats
+
 Generate a self-contained HTML dashboard of `` repository statistics — issue-lifecycle bands (untriaged / triaged / PR-merged / fixed-released / closed-other), opened-vs-untriaged backlog, cumulative
diff --git a/tools/skill-validator/README.md b/tools/skill-and-tool-validator/README.md
similarity index 91%
rename from tools/skill-validator/README.md
rename to tools/skill-and-tool-validator/README.md
index 42c3677d..4df62d24 100644
--- a/tools/skill-validator/README.md
+++ b/tools/skill-and-tool-validator/README.md
@@ -2,7 +2,7 @@
 **Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
-- [skill-validator](#skill-validator)
+- [skill-and-tool-validator](#skill-and-tool-validator)
   - [What it checks](#what-it-checks)
     - [Hard rules (failure)](#hard-rules-failure)
     - [SOFT advisories (warning, do not fail)](#soft-advisories-warning-do-not-fail)
@@ -14,7 +14,9 @@
-# skill-validator
+# skill-and-tool-validator
+
+**Capability:** capability:setup
 Validate framework skill definitions — YAML frontmatter, internal link integrity, and placeholder conventions.
@@ -54,13 +56,13 @@ the run. The reviewer has the final say on borderline cases.
 From the repo root:
 ```bash
-uv run --project tools/skill-validator --group dev pytest
+uv run --project tools/skill-and-tool-validator --group dev pytest
 ```
 Or install and run as CLI:
 ```bash
-uv run --project tools/skill-validator --group dev skill-validate
+uv run --project tools/skill-and-tool-validator --group dev skill-and-tool-validate
 ```
 CLI flags:
diff --git a/tools/skill-validator/pyproject.toml b/tools/skill-and-tool-validator/pyproject.toml
similarity index 93%
rename from tools/skill-validator/pyproject.toml
rename to tools/skill-and-tool-validator/pyproject.toml
index e1ad1a98..58072c84 100644
--- a/tools/skill-validator/pyproject.toml
+++ b/tools/skill-and-tool-validator/pyproject.toml
@@ -20,7 +20,7 @@ requires = ["hatchling"]
 build-backend = "hatchling.build"
 [project]
-name = "skill-validator"
+name = "skill-and-tool-validator"
 version = "0.1.0"
 description = "Validate framework skill definitions — YAML frontmatter, internal link integrity, and placeholder conventions."
 readme = "README.md"
@@ -31,7 +31,7 @@ license = { text = "Apache-2.0" }
 dependencies = []
 
 [project.scripts]
-skill-validate = "skill_validator:main"
+skill-and-tool-validate = "skill_and_tool_validator:main"
 
 [dependency-groups]
 dev = [
@@ -41,7 +41,7 @@ dev = [
 ]
 
 [tool.hatch.build.targets.wheel]
-packages = ["src/skill_validator"]
+packages = ["src/skill_and_tool_validator"]
 
 [tool.ruff]
 line-length = 110
diff --git a/tools/skill-validator/src/skill_validator/__init__.py b/tools/skill-and-tool-validator/src/skill_and_tool_validator/__init__.py
similarity index 79%
rename from tools/skill-validator/src/skill_validator/__init__.py
rename to tools/skill-and-tool-validator/src/skill_and_tool_validator/__init__.py
index 8b33723c..802a63ff 100644
--- a/tools/skill-validator/src/skill_validator/__init__.py
+++ b/tools/skill-and-tool-validator/src/skill_and_tool_validator/__init__.py
@@ -44,9 +44,9 @@ failing the run unless ``--strict`` is passed.
 
 Run from repo root:
 
-    uv run --project tools/skill-validator --group dev pytest
+    uv run --project tools/skill-and-tool-validator --group dev pytest
     # or after install:
-    skill-validate
+    skill-and-tool-validate
 """
 
 from __future__ import annotations
@@ -62,13 +62,53 @@
 # ---------------------------------------------------------------------------
 
 SKILLS_DIR = Path(".claude/skills")
+TOOLS_DIR = Path("tools")
 DOCS_DIR = Path("docs")
 PROJECTS_TEMPLATE_DIR = Path("projects/_template")
 
-REQUIRED_FRONTMATTER_KEYS = {"name", "description", "license"}
+# Categories for the tool-validator block. Both HARD by default — every
+# tool must have a README that declares its capability.
+TOOL_README_CATEGORY = "tool-readme"
+TOOL_CAPABILITY_CATEGORY = "tool-capability"
+
+# Matches `**Capability:** capability:NAME` (and multi-value
+# `capability:NAME + capability:NAME + …`) on a single line.
+TOOL_CAPABILITY_RE = re.compile(r"^\*\*Capability:\*\*[ \t]+(.+)$", re.MULTILINE)
+
+# Capability-sync check: keeps docs/labels-and-capabilities.md tables aligned
+# with live skill frontmatter + tool README declarations.
+DOCS_LABELS_AND_CAPABILITIES = Path("docs/labels-and-capabilities.md")
+CAPABILITY_SYNC_CATEGORY = "capability-sync"
+_SKILL_TABLE_HEADER = "## Capability to skill map"
+_TOOL_TABLE_HEADER = "## Capability to tool map"
+# Tokens like `capability:setup`. Optional backticks around the token.
+_CAPABILITY_TOKEN_RE = re.compile(r"`?(capability:[a-z]+)`?")
+# Italic-parenthetical annotation in the docs tables: `*( … )*` — used for
+# future-state notes (e.g. "*(+ capability:reconciliation once #337 lands)*").
+# Stripped before extracting authoritative capability tokens. The terminator
+# is the literal sequence ``)*`` (close-paren immediately followed by an
+# asterisk), which lets the body span markdown links whose URLs contain
+# parens.
+_ITALIC_PARENS_RE = re.compile(r"\*\(.*?\)\*")
+
+REQUIRED_FRONTMATTER_KEYS = {"name", "description", "license", "capability"}
 OPTIONAL_FRONTMATTER_KEYS = {"when_to_use", "mode"}
 ALLOWED_LICENSES = {"Apache-2.0"}
 
+# Canonical capability taxonomy — docs/labels-and-capabilities.md is authoritative.
+# Skills may declare a single capability (string form) or several (YAML list form).
+ALLOWED_CAPABILITIES = {
+    "capability:triage",
+    "capability:review",
+    "capability:fix",
+    "capability:intake",
+    "capability:reconciliation",
+    "capability:resolve",
+    "capability:reassess",
+    "capability:stats",
+    "capability:setup",
+}
+
 
 def _read_mode_table() -> dict[str, str]:
     """Read the canonical MISSION mode table from ``docs/modes.md``."""
@@ -454,6 +494,31 @@ def validate_frontmatter(path: Path, text: str) -> Iterable[Violation]:
                 f"frontmatter mode '{fm['mode']}' not in {sorted(ALLOWED_MODES)} (see docs/modes.md)",
             )
 
+    if fm.get("capability"):
+        # The frontmatter parser stores both forms as a single string:
+        #   single — `capability: capability:triage` → "capability:triage"
+        #   list — `capability:\n - capability:intake\n …` → "- capability:intake\n- capability:setup"
+        # Split on lines, strip `- ` prefix when present.
+        entries: list[str] = []
+        for raw_line in fm["capability"].splitlines():
+            line = raw_line.strip()
+            if not line:
+                continue
+            if line.startswith("- "):
+                entries.append(line[2:].strip())
+            else:
+                entries.append(line)
+        if not entries:
+            yield Violation(path, 1, "frontmatter key 'capability' is empty")
+        for entry in entries:
+            if entry not in ALLOWED_CAPABILITIES:
+                yield Violation(
+                    path,
+                    1,
+                    f"frontmatter capability '{entry}' not in {sorted(ALLOWED_CAPABILITIES)} "
+                    f"(see docs/labels-and-capabilities.md)",
+                )
+
     desc_len = len(fm.get("description", ""))
     wtu_len = len(fm.get("when_to_use", ""))
     total = desc_len + wtu_len
@@ -1148,6 +1213,267 @@ def collect_files_to_check(root: Path | None = None) -> list[Path]:
     return list(base.rglob("*.md"))
 
 
+def collect_tool_dirs(root: Path | None = None) -> list[Path]:
+    """Return every immediate sub-directory under tools/ that should be checked."""
+    base = (root or find_repo_root()) / TOOLS_DIR
+    if not base.exists():
+        return []
+    return sorted(d for d in base.iterdir() if d.is_dir() and not d.name.startswith("."))
+
+
+def validate_tools(root: Path | None = None) -> Iterable[Violation]:
+    """For each ``tools//`` directory, require:
+
+    1. A ``README.md`` to exist at the tool root.
+    2. The README to contain a ``**Capability:** capability:NAME`` line,
+       with NAME drawn from ``ALLOWED_CAPABILITIES``. Multi-value form is
+       ``**Capability:** capability:NAME + capability:NAME``.
+
+    Both are HARD checks — every tool must declare its capabilities so
+    the per-tool map in ``docs/labels-and-capabilities.md`` stays
+    authoritative.
+    """
+    for tool_dir in collect_tool_dirs(root):
+        readme = tool_dir / "README.md"
+        if not readme.exists():
+            yield Violation(
+                readme,
+                None,
+                f"tool '{tool_dir.name}' missing README.md — every tools// must "
+                f"have a README declaring its capability per "
+                f"docs/labels-and-capabilities.md",
+                category=TOOL_README_CATEGORY,
+            )
+            continue
+
+        try:
+            text = readme.read_text(encoding="utf-8")
+        except OSError as exc:
+            yield Violation(readme, None, f"cannot read README.md: {exc}")
+            continue
+
+        match = TOOL_CAPABILITY_RE.search(text)
+        if match is None:
+            yield Violation(
+                readme,
+                1,
+                f"tool '{tool_dir.name}' README missing '**Capability:** capability:NAME' "
+                f"declaration (see docs/labels-and-capabilities.md)",
+                category=TOOL_CAPABILITY_CATEGORY,
+            )
+            continue
+
+        line_no = text[: match.start()].count("\n") + 1
+        # Split multi-value: `capability:NAME + capability:NAME + …`
+        raw = match.group(1).strip()
+        entries = [e.strip() for e in raw.split("+") if e.strip()]
+        if not entries:
+            yield Violation(
+                readme,
+                line_no,
+                f"tool '{tool_dir.name}' has '**Capability:**' line but no values parsed",
+                category=TOOL_CAPABILITY_CATEGORY,
+            )
+            continue
+        for entry in entries:
+            if entry not in ALLOWED_CAPABILITIES:
+                yield Violation(
+                    readme,
+                    line_no,
+                    f"tool '{tool_dir.name}' capability '{entry}' not in "
+                    f"{sorted(ALLOWED_CAPABILITIES)} (see docs/labels-and-capabilities.md)",
+                    category=TOOL_CAPABILITY_CATEGORY,
+                )
+
+
+def _parse_capability_doc_table(text: str, header: str) -> dict[str, set[str]]:
+    """Parse a markdown table rooted at *header* in labels-and-capabilities.md.
+
+    Returns a {entity-name: {capability:foo, capability:bar}} mapping. The
+    entity name is the first cell's bare identifier (drops the path prefix
+    for tools: ``tools/foo`` → ``foo``). Italic-parenthetical annotations
+    in the capability cell (``*(+ capability:X once #N lands)*``) are
+    stripped before parsing — they are future-state notes, not the
+    authoritative declaration.
+    """
+    if header not in text:
+        return {}
+    section = text.split(header, 1)[1]
+    next_h2 = section.find("\n## ")
+    if next_h2 > 0:
+        section = section[:next_h2]
+
+    out: dict[str, set[str]] = {}
+    for line in section.splitlines():
+        if not line.startswith("|"):
+            continue
+        # Skip the header / separator rows.
+        if line.startswith("|---") or line.startswith("| --- "):
+            continue
+        if "Capability" in line and ("Skill" in line or "Tool" in line or "skill" in line or "tool" in line):
+            continue
+        cells = [c.strip() for c in line.strip("|").split("|")]
+        if len(cells) < 2:
+            continue
+        name_cell, cap_cell = cells[0], cells[1]
+        # Entity name: `name` or [`name`](path) — pull the backtick-quoted token.
+        name_match = re.search(r"`([a-zA-Z0-9/_-]+)`", name_cell)
+        if not name_match:
+            continue
+        raw_name = name_match.group(1)
+        # Tools live under `tools/` in the table; strip prefix.
+        name = raw_name.rsplit("/", 1)[-1]
+        # Strip italic-parenthetical future-state notes before token extraction.
+        cap_cell_clean = _ITALIC_PARENS_RE.sub("", cap_cell)
+        caps = set(_CAPABILITY_TOKEN_RE.findall(cap_cell_clean))
+        if caps:
+            out[name] = caps
+    return out
+
+
+def _live_skill_capabilities(repo_root: Path) -> dict[str, set[str]]:
+    """Read the {skill-name: {capability:foo, …}} mapping from live frontmatter."""
+    out: dict[str, set[str]] = {}
+    skills_dir = repo_root / SKILLS_DIR
+    if not skills_dir.exists():
+        return out
+    for skill_md in skills_dir.glob("*/SKILL.md"):
+        try:
+            text = skill_md.read_text(encoding="utf-8")
+        except OSError:
+            continue
+        fm = parse_frontmatter(text)
+        if fm is None or "capability" not in fm or not fm["capability"]:
+            continue
+        entries: set[str] = set()
+        for raw_line in fm["capability"].splitlines():
+            line = raw_line.strip()
+            if not line:
+                continue
+            if line.startswith("- "):
+                entries.add(line[2:].strip())
+            else:
+                entries.add(line)
+        if entries:
+            out[skill_md.parent.name] = entries
+    return out
+
+
+def _live_tool_capabilities(repo_root: Path) -> dict[str, set[str]]:
+    """Read the {tool-name: {capability:foo, …}} mapping from live tool READMEs."""
+    out: dict[str, set[str]] = {}
+    for tool_dir in collect_tool_dirs(repo_root):
+        readme = tool_dir / "README.md"
+        if not readme.exists():
+            continue
+        try:
+            text = readme.read_text(encoding="utf-8")
+        except OSError:
+            continue
+        match = TOOL_CAPABILITY_RE.search(text)
+        if match is None:
+            continue
+        raw = match.group(1).strip()
+        entries = {e.strip() for e in raw.split("+") if e.strip()}
+        if entries:
+            out[tool_dir.name] = entries
+    return out
+
+
+def validate_capability_sync(root: Path | None = None) -> Iterable[Violation]:
+    """Compare the docs/labels-and-capabilities.md tables against live state.
+
+    Both directions are checked:
+
+    - Every row in either table must correspond to a live skill / tool with
+      the same capability set (modulo italic-parenthetical future-state notes).
+    - Every live skill (with a ``capability:`` frontmatter field) and every
+      live tool (with a ``**Capability:**`` README declaration) must have a
+      matching row in the corresponding doc table.
+
+    Drift in either direction is a HARD ``capability-sync`` violation —
+    the docs are the canonical reference and must stay aligned with the
+    source.
+    """
+    repo_root = root or find_repo_root()
+    doc_path = repo_root / DOCS_LABELS_AND_CAPABILITIES
+    if not doc_path.exists():
+        yield Violation(
+            doc_path,
+            None,
+            "docs/labels-and-capabilities.md missing — cannot run capability-sync check",
+            category=CAPABILITY_SYNC_CATEGORY,
+        )
+        return
+
+    try:
+        doc_text = doc_path.read_text(encoding="utf-8")
+    except OSError as exc:
+        yield Violation(doc_path, None, f"cannot read labels-and-capabilities.md: {exc}")
+        return
+
+    doc_skills = _parse_capability_doc_table(doc_text, _SKILL_TABLE_HEADER)
+    doc_tools = _parse_capability_doc_table(doc_text, _TOOL_TABLE_HEADER)
+    live_skills = _live_skill_capabilities(repo_root)
+    live_tools = _live_tool_capabilities(repo_root)
+
+    # Skills — docs vs live, both directions.
+    for name, doc_caps in sorted(doc_skills.items()):
+        if name not in live_skills:
+            yield Violation(
+                doc_path,
+                None,
+                f"skill table row for '{name}' but no live SKILL.md with a 'capability:' field "
+                f"found under .claude/skills/{name}/",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+            continue
+        if doc_caps != live_skills[name]:
+            yield Violation(
+                doc_path,
+                None,
+                f"skill '{name}' capability mismatch — docs={sorted(doc_caps)} live={sorted(live_skills[name])}",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+    for name in sorted(live_skills):
+        if name not in doc_skills:
+            yield Violation(
+                doc_path,
+                None,
+                f"live skill '{name}' has 'capability:' frontmatter but no row in the skill table "
+                f"in docs/labels-and-capabilities.md",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+
+    # Tools — docs vs live, both directions.
+    for name, doc_caps in sorted(doc_tools.items()):
+        if name not in live_tools:
+            yield Violation(
+                doc_path,
+                None,
+                f"tool table row for '{name}' but no live tools/{name}/README.md with a "
+                f"'**Capability:**' declaration found",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+            continue
+        if doc_caps != live_tools[name]:
+            yield Violation(
+                doc_path,
+                None,
+                f"tool '{name}' capability mismatch — docs={sorted(doc_caps)} live={sorted(live_tools[name])}",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+    for name in sorted(live_tools):
+        if name not in doc_tools:
+            yield Violation(
+                doc_path,
+                None,
+                f"live tool '{name}' has '**Capability:**' declaration but no row in the tool table "
+                f"in docs/labels-and-capabilities.md",
+                category=CAPABILITY_SYNC_CATEGORY,
+            )
+
+
 # ---------------------------------------------------------------------------
 # Lowercase -f field check (Pattern 2)
 # ---------------------------------------------------------------------------
@@ -1298,6 +1624,12 @@ def run_validation(root: Path | None = None) -> list[Violation]:
         violations.extend(validate_gh_list_limit(path, text))
         violations.extend(validate_lowercase_f_field(path, text))
 
+    # Tool-level checks: every tools// has a README that declares its capability.
+    violations.extend(validate_tools(repo_root))
+
+    # Capability-sync check: the doc tables and the source must agree.
+    violations.extend(validate_capability_sync(repo_root))
+
     return violations
 
 
@@ -1330,14 +1662,14 @@ def main(argv: list[str] | None = None) -> int:
     soft = [v for v in filtered if v.category in SOFT_CATEGORIES]
 
     if not filtered:
-        print("skill-validator: OK (no violations)")
+        print("skill-and-tool-validator: OK (no violations)")
         return 0
 
     if soft:
         _print_soft_warnings(soft)
 
     if hard:
-        print(f"skill-validator: {len(hard)} violation(s) found\n")
+        print(f"skill-and-tool-validator: {len(hard)} violation(s) found\n")
         for v in hard:
             print(v)
         return 1
@@ -1383,7 +1715,7 @@ def _print_soft_warnings(soft: list[Violation]) -> None:
         by_file[v.path].append(v)
 
     print(
-        f"skill-validator: {len(soft)} SOFT warning(s) across "
+        f"skill-and-tool-validator: {len(soft)} SOFT warning(s) across "
         f"{len(by_file)} skill(s) — advisory, not blocking\n",
         file=sys.stderr,
     )
diff --git a/tools/skill-validator/tests/test_validator.py b/tools/skill-and-tool-validator/tests/test_validator.py
similarity index 83%
rename from tools/skill-validator/tests/test_validator.py
rename to tools/skill-and-tool-validator/tests/test_validator.py
index 03992302..4998f415 100644
--- a/tools/skill-validator/tests/test_validator.py
+++ b/tools/skill-and-tool-validator/tests/test_validator.py
@@ -23,7 +23,7 @@
 
 import pytest
 
-from skill_validator import (
+from skill_and_tool_validator import (
     _MODE_STATUS_BY_NAME,
     _MODE_TAXONOMY,
     _OFF_MODES,
@@ -56,6 +56,7 @@
     resolve_link,
     run_validation,
     slugify,
+    validate_capability_sync,
     validate_frontmatter,
     validate_gh_list_limit,
     validate_injection_guard,
@@ -65,6 +66,7 @@
     validate_principle_compliance,
     validate_privacy_patterns,
     validate_security_patterns,
+    validate_tools,
     validate_trigger_preservation,
 )
 
@@ -75,7 +77,7 @@
 
 class TestParseFrontmatter:
     def test_valid_frontmatter(self) -> None:
-        text = "---\nname: foo\ndescription: bar\nlicense: Apache-2.0\n---\n# heading\n"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n# heading\n"
         fm = parse_frontmatter(text)
         assert fm is not None
         assert fm["name"] == "foo"
@@ -89,7 +91,7 @@ def test_folded_scalar(self) -> None:
             "description: |\n"
             " First line of description.\n"
             " Second line.\n"
-            "license: Apache-2.0\n"
+            "capability: capability:setup\nlicense: Apache-2.0\n"
             "---\n"
         )
         fm = parse_frontmatter(text)
@@ -112,7 +114,7 @@ def test_block_scalar_preserves_internal_blank_line(self) -> None:
             " Paragraph one.\n"
             "\n"
             " Paragraph two, which used to be dropped.\n"
-            "license: Apache-2.0\n"
+            "capability: capability:setup\nlicense: Apache-2.0\n"
             "---\n"
         )
         fm = parse_frontmatter(text)
@@ -136,13 +138,13 @@ def test_no_closing_delimiter(self) -> None:
 class TestValidateFrontmatter:
     def test_valid(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
-        text = "---\nname: foo\ndescription: bar\nlicense: Apache-2.0\n---\n"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert violations == []
 
     def test_missing_name(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
-        text = "---\ndescription: bar\nlicense: Apache-2.0\n---\n"
+        text = "---\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert len(violations) == 1
         assert "name" in violations[0].message
@@ -158,7 +160,7 @@ def test_missing_multiple_keys(self, tmp_path: Path) -> None:
 
     def test_empty_value(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
-        text = "---\nname: \ndescription: bar\nlicense: Apache-2.0\n---\n"
+        text = "---\nname: \ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert any("name' is empty" in v.message for v in violations)
 
@@ -171,20 +173,20 @@ def test_invalid_license(self, tmp_path: Path) -> None:
 
     def test_valid_mode(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
         for mode in ("Triage", "Mentoring", "Drafting", "Pairing"):
-            text = f"---\nname: foo\ndescription: bar\nlicense: Apache-2.0\nmode: {mode}\n---\n"
+            text = f"---\nname: foo\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\nmode: {mode}\n---\n"
             violations = list(validate_frontmatter(path, text))
             assert violations == [], f"mode '{mode}' should be valid"
 
     def test_invalid_mode(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
-        text = "---\nname: foo\ndescription: bar\nlicense: Apache-2.0\nmode: Auto-merge\n---\n"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\nmode: Auto-merge\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert any("mode" in v.message and "'Auto-merge'" in v.message for v in violations)
 
     def test_mode_optional(self, tmp_path: Path) -> None:
         # Skills without a mode (e.g. setup-* infrastructure) must not fail.
         path = tmp_path / "SKILL.md"
-        text = "---\nname: foo\ndescription: bar\nlicense: Apache-2.0\n---\n"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert violations == []
 
@@ -212,7 +214,7 @@ def test_metadata_under_limit(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
         desc = "a" * 800
         wtu = "b" * 700
-        text = f"---\nname: foo\ndescription: {desc}\nwhen_to_use: {wtu}\nlicense: Apache-2.0\n---\n"
+        text = f"---\nname: foo\ndescription: {desc}\nwhen_to_use: {wtu}\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert violations == []
 
@@ -220,7 +222,7 @@ def test_metadata_over_limit(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
         desc = "a" * 1000
         wtu = "b" * (MAX_METADATA_CHARS - 1000 + 1)
-        text = f"---\nname: foo\ndescription: {desc}\nwhen_to_use: {wtu}\nlicense: Apache-2.0\n---\n"
+        text = f"---\nname: foo\ndescription: {desc}\nwhen_to_use: {wtu}\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         violations = list(validate_frontmatter(path, text))
         assert any("truncates" in v.message and str(MAX_METADATA_CHARS) in v.message for v in violations)
 
@@ -233,7 +235,7 @@ def test_argument_hint_accepted(self, tmp_path: Path) -> None:
             "---\n"
             "name: foo\n"
             "description: bar\n"
-            "license: Apache-2.0\n"
+            "capability: capability:setup\nlicense: Apache-2.0\n"
             "argument-hint: [--quick|--standard|--deep] \n"
             "---\n"
         )
@@ -249,7 +251,7 @@ def test_argument_hint_pipe_notation_with_spaces_in_option(self, tmp_path: Path)
             "---\n"
             "name: setup-steward\n"
             "description: bar\n"
-            "license: Apache-2.0\n"
+            "capability: capability:setup\nlicense: Apache-2.0\n"
             "argument-hint: [adopt|upgrade|worktree-init|verify|override skill-name|unadopt]\n"
             "---\n"
         )
@@ -272,7 +274,7 @@ def test_argument_hint_does_not_inflate_metadata_budget(self, tmp_path: Path) ->
             f"name: foo\n"
             f"description: {desc}\n"
             f"when_to_use: {wtu}\n"
-            f"license: Apache-2.0\n"
+            f"capability: capability:setup\nlicense: Apache-2.0\n"
             f"argument-hint: {hint}\n"
             f"---\n"
         )
@@ -280,12 +282,58 @@ def test_argument_hint_does_not_inflate_metadata_budget(self, tmp_path: Path) ->
         assert violations == [], "argument-hint must not count toward description+when_to_use budget"
 
     def test_metadata_block_scalar_indicator_not_counted(self) -> None:
-        text = f"---\nname: foo\ndescription: |\n {'a' * 100}\nlicense: Apache-2.0\n---\n"
+        text = f"---\nname: foo\ndescription: |\n {'a' * 100}\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
         fm = parse_frontmatter(text)
         assert fm is not None
         assert not fm["description"].startswith("|")
         assert len(fm["description"]) == 100
 
+    def test_capability_single_string(self, tmp_path: Path) -> None:
+        path = tmp_path / "SKILL.md"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:triage\nlicense: Apache-2.0\n---\n"
+        violations = list(validate_frontmatter(path, text))
+        assert violations == []
+
+    def test_capability_yaml_list(self, tmp_path: Path) -> None:
+        path = tmp_path / "SKILL.md"
+        text = (
+            "---\nname: foo\ndescription: bar\n"
+            "capability:\n - capability:intake\n - capability:setup\n"
+            "license: Apache-2.0\n---\n"
+        )
+        violations = list(validate_frontmatter(path, text))
+        assert violations == []
+
+    def test_capability_missing(self, tmp_path: Path) -> None:
+        path = tmp_path / "SKILL.md"
+        text = "---\nname: foo\ndescription: bar\nlicense: Apache-2.0\n---\n"
+        violations = list(validate_frontmatter(path, text))
+        assert any("capability" in v.message for v in violations)
+
+    def test_capability_invalid_value(self, tmp_path: Path) -> None:
+        path = tmp_path / "SKILL.md"
+        text = "---\nname: foo\ndescription: bar\ncapability: capability:bogus\nlicense: Apache-2.0\n---\n"
+        violations = list(validate_frontmatter(path, text))
+        assert any("capability:bogus" in v.message for v in violations)
+
+    def test_capability_list_with_one_invalid_value(self, tmp_path: Path) -> None:
+        path = tmp_path / "SKILL.md"
+        text = (
+            "---\nname: foo\ndescription: bar\n"
+            "capability:\n - capability:setup\n - capability:invented\n"
+            "license: Apache-2.0\n---\n"
+        )
+        violations = list(validate_frontmatter(path, text))
+        # The "subject" of each violation is the first single-quoted token after
+        # the `capability ` word — e.g. "frontmatter capability 'capability:invented' not in [...]".
+        # The valid entry should never be the subject; only the invalid one should be.
+        flagged_subjects = [
+            v.message.split("capability '")[1].split("'")[0]
+            for v in violations
+            if "capability '" in v.message and "not in" in v.message
+        ]
+        assert flagged_subjects == ["capability:invented"]
+
 
 # ---------------------------------------------------------------------------
 # Heading / anchor helpers
@@ -500,7 +548,7 @@ def test_locates_root_from_validator_subtree(self, monkeypatch: pytest.MonkeyPat
         # Regression: the silent-pass bug fired only when CWD was inside the validator subtree.
         repo = Path(__file__).resolve().parents[3]
         assert (repo / ".claude" / "skills").is_dir(), "test setup precondition"
-        monkeypatch.chdir(repo / "tools" / "skill-validator")
+        monkeypatch.chdir(repo / "tools" / "skill-and-tool-validator")
         assert find_repo_root() == repo
 
     def test_explicit_start_outside_repo(self, tmp_path: Path) -> None:
@@ -522,11 +570,29 @@ def test_explicit_start_outside_repo(self, tmp_path: Path) -> None:
 
 class TestSubDocFiles:
     def _make_skill_dir(self, root: Path, skill_name: str = "setup-foo") -> Path:
-        """Return a skill directory pre-populated with a valid SKILL.md."""
+        """Return a skill directory pre-populated with a valid SKILL.md.
+
+        Also seeds a matching ``docs/labels-and-capabilities.md`` row so the
+        capability-sync check is satisfied — these tests are about sub-doc
+        handling, not the sync check.
+        """
         skill_dir = root / ".claude" / "skills" / skill_name
         skill_dir.mkdir(parents=True)
         (skill_dir / "SKILL.md").write_text(
-            f"---\nname: {skill_name}\ndescription: bar\nlicense: Apache-2.0\n---\n# body\n",
+            f"---\nname: {skill_name}\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n# body\n",
+            encoding="utf-8",
+        )
+        docs = root / "docs"
+        docs.mkdir(parents=True, exist_ok=True)
+        (docs / "labels-and-capabilities.md").write_text(
+            "# Labels and capabilities\n\n"
+            "## Capability to skill map\n\n"
+            "| Skill | Capability / capabilities |\n"
+            "|---|---|\n"
+            f"| `{skill_name}` | `capability:setup` |\n\n"
+            "## Capability to tool map\n\n"
+            "| Tool | Capability / capabilities | Role |\n"
+            "|---|---|---|\n",
             encoding="utf-8",
         )
         return skill_dir
@@ -621,7 +687,7 @@ def test_real_repo_passes(self) -> None:
         are excluded — they are advisory and surface as warnings, not
         failures. The main runtime gate is `--strict`.
         """
-        from skill_validator import SOFT_CATEGORIES
+        from skill_and_tool_validator import SOFT_CATEGORIES
 
         violations = [v for v in run_validation() if v.category not in SOFT_CATEGORIES]
         if violations:
@@ -834,7 +900,9 @@ def test_quoted_phrase_diff_reports_missing(self, tmp_path: Path) -> None:
 # ---------------------------------------------------------------------------
 
 # Minimal valid SKILL.md frontmatter used across injection-guard tests.
-_GUARD_FM = "---\nname: test-skill\ndescription: bar\nlicense: Apache-2.0\n---\n"
+_GUARD_FM = (
+    "---\nname: test-skill\ndescription: bar\ncapability: capability:setup\nlicense: Apache-2.0\n---\n"
+)
 
 # A gh-pr-view signal that unambiguously looks like a workflow fetch step.
 _GH_PR_VIEW_SIGNAL = "2. **Fetch the PR.** `gh pr view --json title,body`\n"
@@ -1212,7 +1280,7 @@ def test_all_violations_are_soft_category(self, tmp_path: Path) -> None:
 
 def _fenced_skill_lf(cmd: str) -> str:
     """Wrap *cmd* in a minimal SKILL.md with a fenced bash block."""
-    return f"---\nname: test\ndescription: test\nlicense: Apache-2.0\n---\n\n```bash\n{cmd}\n```\n"
+    return f"---\nname: test\ndescription: test\ncapability: capability:setup\nlicense: Apache-2.0\n---\n\n```bash\n{cmd}\n```\n"
 
 
 class TestLowercaseFField:
@@ -1282,7 +1350,7 @@ def test_prose_mention_not_flagged(self, tmp_path: Path) -> None:
         """Inline backtick prose like ``-f title='...'`` must not fire."""
         path = tmp_path / "SKILL.md"
         text = (
-            "---\nname: test\ndescription: test\nlicense: Apache-2.0\n---\n\n"
+            "---\nname: test\ndescription: test\ncapability: capability:setup\nlicense: Apache-2.0\n---\n\n"
             "Avoid using `-f title='value'` — use `-F title=@file` instead.\n"
         )
         violations = list(validate_lowercase_f_field(path, text))
@@ -1292,7 +1360,7 @@ def test_outside_fenced_block_not_flagged(self, tmp_path: Path) -> None:
         """Bare prose outside a fenced block must not fire."""
         path = tmp_path / "SKILL.md"
         text = (
-            "---\nname: test\ndescription: test\nlicense: Apache-2.0\n---\n\n"
+            "---\nname: test\ndescription: test\ncapability: capability:setup\nlicense: Apache-2.0\n---\n\n"
             "Run: gh api milestones -f title='v1'\n"
         )
         violations = list(validate_lowercase_f_field(path, text))
@@ -1309,9 +1377,9 @@ def test_checklist_file_skipped(self, tmp_path: Path) -> None:
     def test_violation_line_number_correct(self, tmp_path: Path) -> None:
         path = tmp_path / "SKILL.md"
         text = _fenced_skill_lf("gh api repos//milestones -f title='v1.0'")
-        # Layout: 1:--- 2:name 3:description 4:license 5:--- 6:blank 7:```bash 8:command
+        # Layout: 1:--- 2:name 3:description 4:capability 5:license 6:--- 7:blank 8:```bash 9:command
         violations = list(validate_lowercase_f_field(path, text))
-        assert violations[0].line == 8
+        assert violations[0].line == 9
 
     def test_lowercase_f_field_in_soft_categories(self) -> None:
         assert LOWERCASE_F_FIELD_CATEGORY in SOFT_CATEGORIES
@@ -1574,9 +1642,25 @@ def test_silent_on_sub_doc(self, tmp_path: Path) -> None:
 
 
 def _skill_root(tmp_path: Path) -> Path:
-    """Create a minimal repo tree with .claude/skills/ and return the root."""
+    """Create a minimal repo tree with .claude/skills/ and return the root.
+
+    Also seeds an empty ``docs/labels-and-capabilities.md`` so the
+    capability-sync check doesn't fire its "missing doc" violation in
+    tests that don't exercise the sync check directly.
+    """
     skills = tmp_path / ".claude" / "skills"
     skills.mkdir(parents=True)
+    docs = tmp_path / "docs"
+    docs.mkdir(parents=True, exist_ok=True)
+    (docs / "labels-and-capabilities.md").write_text(
+        "# Labels and capabilities\n\n"
+        "## Capability to skill map\n\n"
+        "| Skill | Capability / capabilities |\n"
+        "|---|---|\n\n"
+        "## Capability to tool map\n\n"
+        "| Tool | Capability / capabilities | Role |\n"
+        "|---|---|---|\n"
+    )
     return tmp_path
 
 
@@ -1774,12 +1858,23 @@ def test_does_not_return_non_md_files(self, tmp_path: Path) -> None:
 
 
 def _make_valid_skill(root: Path, name: str) -> Path:
-    """Write a minimal valid SKILL.md under .claude/skills//."""
+    """Write a minimal valid SKILL.md under .claude/skills// and add a
+    matching row to docs/labels-and-capabilities.md so the capability-sync
+    check stays satisfied."""
     skill_dir = root / ".claude" / "skills" / name
     skill_dir.mkdir(parents=True, exist_ok=True)
     (skill_dir / "SKILL.md").write_text(
-        f"---\nname: {name}\ndescription: A test skill.\nlicense: Apache-2.0\n---\n# Body\nSome content.\n"
+        f"---\nname: {name}\ndescription: A test skill.\ncapability: capability:setup\nlicense: Apache-2.0\n---\n# Body\nSome content.\n"
     )
+    # Inject a row into the skill table of the seeded doc.
+    doc = root / "docs" / "labels-and-capabilities.md"
+    if doc.exists():
+        text = doc.read_text()
+        row = f"| `{name}` | `capability:setup` |\n"
+        # Insert right after the skill table's separator row.
+        marker = "## Capability to skill map\n\n| Skill | Capability / capabilities |\n|---|---|\n"
+        if marker in text and row not in text:
+            doc.write_text(text.replace(marker, marker + row, 1))
     return skill_dir
 
 
@@ -1830,15 +1925,241 @@ def test_strict_promotes_soft_violations_to_hard(
             "---\n"
             "name: soft-skill\n"
             "description: A test skill.\n"
-            "license: Apache-2.0\n"
+            "capability: capability:setup\nlicense: Apache-2.0\n"
             "---\n"
             "```bash\n"
             'gh pr comment 1 --body "attacker content"\n'
             "```\n"
         )
+        # Add a matching row to the seeded doc so the capability-sync check stays clean.
+        doc = root / "docs" / "labels-and-capabilities.md"
+        text = doc.read_text()
+        marker = "## Capability to skill map\n\n| Skill | Capability / capabilities |\n|---|---|\n"
+        doc.write_text(text.replace(marker, marker + "| `soft-skill` | `capability:setup` |\n", 1))
         monkeypatch.chdir(root)
         rc_normal = main([])
         rc_strict = main(["--strict"])
         assert rc_normal == 0
         assert rc_strict == 1
+
+
+# ---------------------------------------------------------------------------
+# Tool README + capability declaration validation
+# ---------------------------------------------------------------------------
+
+
+def _make_tools_root(tmp_path: Path) -> Path:
+    """Create a minimal repo layout: /tools/ + /.claude/skills/."""
+    root = tmp_path / "repo"
+    (root / "tools").mkdir(parents=True)
+    (root / ".claude" / "skills").mkdir(parents=True)
+    return root
+
+
+class TestValidateTools:
+    def test_tool_with_valid_readme(self, tmp_path: Path) -> None:
+        root = _make_tools_root(tmp_path)
+        tool = root / "tools" / "foo"
+        tool.mkdir()
+        (tool / "README.md").write_text("# tools/foo\n\n**Capability:** capability:setup\n\nFoo tool.\n")
+        violations = list(validate_tools(root))
+        assert violations == []
+
+    def test_tool_missing_readme(self, tmp_path: Path) -> None:
+        root = _make_tools_root(tmp_path)
+        (root / "tools" / "no-readme").mkdir()
+        violations = list(validate_tools(root))
+        assert len(violations) == 1
+        assert "missing README.md" in violations[0].message
+        assert violations[0].category == "tool-readme"
+
+    def test_tool_readme_without_capability(self, tmp_path: Path) -> None:
+        root = _make_tools_root(tmp_path)
+        tool = root / "tools" / "bare"
+        tool.mkdir()
+        (tool / "README.md").write_text("# bare\n\nDescription only, no capability line.\n")
+        violations = list(validate_tools(root))
+        assert len(violations) == 1
+        assert "missing '**Capability:**" in violations[0].message
+        assert violations[0].category == "tool-capability"
+
+    def test_tool_capability_invalid_value(self, tmp_path: Path) -> None:
+        root = _make_tools_root(tmp_path)
+        tool = root / "tools" / "bad"
+        tool.mkdir()
+        (tool / "README.md").write_text("# bad\n\n**Capability:** capability:bogus\n")
+        violations = list(validate_tools(root))
+        assert any("capability:bogus" in v.message for v in violations)
+
+    def test_tool_capability_multi_value(self, tmp_path: Path) -> None:
+        root = _make_tools_root(tmp_path)
+        tool = root / "tools" / "dual"
+        tool.mkdir()
+        (tool / "README.md").write_text("# dual\n\n**Capability:** capability:setup + capability:intake\n")
+        violations = list(validate_tools(root))
+        assert violations == []
+
+    def test_tool_capability_regex_does_not_slurp_past_line(self, tmp_path: Path) -> None:
+        # Regression guard: an earlier version of the regex matched `[A-Za-z0-9:+\s]+`
+        # which included newlines, so the parser captured prose from the next
+        # paragraph and reported false "invalid capability" errors.
+        root = _make_tools_root(tmp_path)
+        tool = root / "tools" / "with-prose"
+        tool.mkdir()
+        (tool / "README.md").write_text(
+            "# tools/with-prose\n\n"
+            "**Capability:** capability:setup\n\n"
+            "Some prose that follows the capability line and should NOT be parsed as part of it.\n"
+        )
+        violations = list(validate_tools(root))
+        assert violations == []
+
+
+# ---------------------------------------------------------------------------
+# Capability sync check: docs/labels-and-capabilities.md ↔ live source
+# ---------------------------------------------------------------------------
+
+
+def _seed_capability_repo(
+    tmp_path: Path,
+    *,
+    doc_skills: dict[str, str],
+    doc_tools: dict[str, str],
+    live_skills: dict[str, str],
+    live_tools: dict[str, str],
+) -> Path:
+    """Build a tiny repo with a labels-and-capabilities.md doc, skills, and tool READMEs.
+
+    `*_skills` maps skill-name → capability cell text (e.g. ``capability:triage``).
+    `*_tools` maps tool-name → capability cell text (e.g. ``capability:setup + capability:intake``).
+    """
+    root = tmp_path / "repo"
+    (root / "docs").mkdir(parents=True)
+    (root / ".claude" / "skills").mkdir(parents=True)
+    (root / "tools").mkdir(parents=True)
+
+    skill_rows = "\n".join(f"| `{n}` | `{c}` |" for n, c in doc_skills.items())
+    tool_rows = "\n".join(f"| [`tools/{n}`](../tools/{n}/) | `{c}` | role |" for n, c in doc_tools.items())
+    doc_body = (
+        "# Labels and capabilities\n\n"
+        "## Capability to skill map\n\n"
+        "| Skill | Capability / capabilities |\n"
+        "|---|---|\n"
+        f"{skill_rows}\n\n"
+        "## Capability to tool map\n\n"
+        "| Tool | Capability / capabilities | Role |\n"
+        "|---|---|---|\n"
+        f"{tool_rows}\n"
+    )
+    (root / "docs" / "labels-and-capabilities.md").write_text(doc_body)
+
+    for skill, cap in live_skills.items():
+        d = root / ".claude" / "skills" / skill
+        d.mkdir()
+        (d / "SKILL.md").write_text(
+            f"---\nname: {skill}\ndescription: test\ncapability: {cap}\nlicense: Apache-2.0\n---\n"
+        )
+
+    for tool, cap in live_tools.items():
+        d = root / "tools" / tool
+        d.mkdir()
+        (d / "README.md").write_text(f"# {tool}\n\n**Capability:** {cap}\n")
+
+    return root
+
+
+class TestValidateCapabilitySync:
+    def test_aligned_passes(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={"alpha": "capability:triage"},
+            doc_tools={"omega": "capability:setup"},
+            live_skills={"alpha": "capability:triage"},
+            live_tools={"omega": "capability:setup"},
+        )
+        violations = list(validate_capability_sync(root))
+        assert violations == []
+
+    def test_skill_in_doc_but_not_live(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={"alpha": "capability:triage", "ghost": "capability:fix"},
+            doc_tools={},
+            live_skills={"alpha": "capability:triage"},
+            live_tools={},
+        )
+        violations = list(validate_capability_sync(root))
+        assert any("'ghost'" in v.message and "no live SKILL.md" in v.message for v in violations)
+
+    def test_live_skill_missing_from_doc(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={"alpha": "capability:triage"},
+            doc_tools={},
+            live_skills={"alpha": "capability:triage", "extra": "capability:fix"},
+            live_tools={},
+        )
+        violations = list(validate_capability_sync(root))
+        assert any("'extra'" in v.message and "no row in the skill table" in v.message for v in violations)
+
+    def test_skill_capability_mismatch(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={"alpha": "capability:triage"},
+            doc_tools={},
+            live_skills={"alpha": "capability:fix"},
+            live_tools={},
+        )
+        violations = list(validate_capability_sync(root))
+        assert any("'alpha' capability mismatch" in v.message for v in violations)
+
+    def test_tool_in_doc_but_not_live(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={},
+            doc_tools={"omega": "capability:setup", "ghost-tool": "capability:setup"},
+            live_skills={},
+            live_tools={"omega": "capability:setup"},
+        )
+        violations = list(validate_capability_sync(root))
+        assert any("'ghost-tool'" in v.message and "no live tools/" in v.message for v in violations)
+
+    def test_live_tool_missing_from_doc(self, tmp_path: Path) -> None:
+        root = _seed_capability_repo(
+            tmp_path,
+            doc_skills={},
+            doc_tools={"omega": "capability:setup"},
+            live_skills={},
+            live_tools={"omega": "capability:setup", "extra-tool": "capability:stats"},
+        )
+        violations = list(validate_capability_sync(root))
+        assert any(
+            "'extra-tool'" in v.message and "no row in the tool table" in v.message for v in violations
+        )
+
+    def test_italic_parens_annotation_is_stripped(self, tmp_path: Path) -> None:
+        # Doc row carries an italic-parenthetical future-state note.
+        # The token inside *( ... )* must NOT count as a declared capability.
+        root = tmp_path / "repo"
+        (root / "docs").mkdir(parents=True)
+        (root / ".claude" / "skills" / "alpha").mkdir(parents=True)
+        doc = (
+            "# Labels and capabilities\n\n"
+            "## Capability to skill map\n\n"
+            "| Skill | Capability / capabilities |\n"
+            "|---|---|\n"
+            "| `alpha` | `capability:intake` *(+ `capability:reconciliation` once [#1](https://x.y/issues/1) lands)* |\n\n"
+            "## Capability to tool map\n\n"
+            "| Tool | Capability / capabilities | Role |\n"
+            "|---|---|---|\n"
+        )
+        (root / "docs" / "labels-and-capabilities.md").write_text(doc)
+        (root / ".claude" / "skills" / "alpha" / "SKILL.md").write_text(
+            "---\nname: alpha\ndescription: test\ncapability: capability:intake\nlicense: Apache-2.0\n---\n"
+        )
+        (root / "tools").mkdir()
+        violations = list(validate_capability_sync(root))
+        # The parenthetical capability:reconciliation must NOT be flagged as a doc-side declared capability;
+        # the row's authoritative capability is just intake, which matches the live skill.
+        assert violations == [], [v.message for v in violations]
diff --git a/tools/skill-validator/uv.lock b/tools/skill-and-tool-validator/uv.lock
similarity index 99%
rename from tools/skill-validator/uv.lock
rename to tools/skill-and-tool-validator/uv.lock
index ff72e422..2be7473b 100644
--- a/tools/skill-validator/uv.lock
+++ b/tools/skill-and-tool-validator/uv.lock
@@ -277,7 +277,7 @@ wheels = [
 ]
 [[package]]
-name = "skill-validator"
+name = "skill-and-tool-validator"
 version = "0.1.0"
 source = { editable = "." }
diff --git a/tools/skill-evals/README.md b/tools/skill-evals/README.md
index d3cfd66b..86e0932b 100644
--- a/tools/skill-evals/README.md
+++ b/tools/skill-evals/README.md
@@ -1,5 +1,7 @@
 # skill-evals
+**Capability:** capability:setup + capability:stats
+
 Behavioral eval harness for Apache Steward skills. Each eval suite tests a skill pipeline step by step, verifying that the model produces the correct structured JSON output for a fixed set of fixture cases.
 Nineteen suites are currently implemented:
diff --git a/tools/spec-loop/README.md b/tools/spec-loop/README.md
index 8255be05..9d8fc673 100644
--- a/tools/spec-loop/README.md
+++ b/tools/spec-loop/README.md
@@ -3,6 +3,8 @@
 # spec-loop
+**Capability:** capability:setup
+
 A spec-driven build loop for this framework, in the general [Ralph](https://ghuntley.com/ralph/) style (run a fresh agent context against a fixed prompt, repeat), adapted to the framework's
diff --git a/tools/spec-status-index/README.md b/tools/spec-status-index/README.md
index 844a709e..3331d06b 100644
--- a/tools/spec-status-index/README.md
+++ b/tools/spec-status-index/README.md
@@ -14,6 +14,8 @@
 # spec-status-index
+**Capability:** capability:setup + capability:stats
+
 A deterministic `uv` tool that reads spec-loop specs from `tools/spec-loop/specs/` and prints them grouped by status, so build iterations can choose the next work item mechanically.
diff --git a/tools/vulnogram/README.md b/tools/vulnogram/README.md
new file mode 100644
index 00000000..a6079a50
--- /dev/null
+++ b/tools/vulnogram/README.md
@@ -0,0 +1,16 @@
+
+**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [`tools/vulnogram/`](#toolsvulnogram)
+
+
+
+
+
+# `tools/vulnogram/`
+
+**Capability:** capability:resolve
+
+ASF Vulnogram CVE-allocation client. OAuth-authenticated API that allocates a CVE ID through the ASF's Vulnogram instance and publishes the CVE record. Consumed by `security-cve-allocate`. See [`allocation.md`](allocation.md), [`record.md`](record.md), and [`bot-credits-policy.md`](bot-credits-policy.md) for the protocol.