feat: add supermodel skill command by greynewell · Pull Request #126 · supermodeltools/cli

greynewell · 2026-04-13T21:38:08Z

Summary

- Adds a `supermodel skill` command that emits an optimized skill prompt for Claude Code
- The skill prompt instructs Claude Code to use `.graph` shard files for smarter navigation and context
- Includes regression tests for the skill prompt content

CI fixes (rebased from main)

- Windows: added `TMP`/`TEMP` alongside `TMPDIR` in find/zip_test.go
- Windows: filepath separator fix already present via rebase on main

Test plan

- All three CI platforms pass (ubuntu, macos, windows)
- `supermodel skill` outputs a valid skill prompt
- Regression tests cover key phrases in the prompt
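
A key-phrase regression check of the kind described in the test plan could look roughly like the sketch below. The prompt text and the required phrases here are placeholders invented for illustration; the real `skillPrompt` and the six elements the tests lock in are not reproduced on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// examplePrompt is a hypothetical stand-in for the real skill prompt.
const examplePrompt = "Read the .graph file first. Sections: [deps], [calls], [impact]."

// missingPhrases returns every required phrase that does not appear in prompt.
func missingPhrases(prompt string, required []string) []string {
	var missing []string
	for _, phrase := range required {
		if !strings.Contains(prompt, phrase) {
			missing = append(missing, phrase)
		}
	}
	return missing
}

func main() {
	// Hypothetical phrase list; a real test would name the actual key elements.
	required := []string{".graph", "[deps]", "[calls]", "[impact]", "first"}
	fmt.Println("missing:", missingPhrases(examplePrompt, required))
}
```

In a real `_test.go` file the same helper would typically be called from a test function that fails with `t.Errorf` for each missing phrase, so a prompt edit that drops a key element breaks the build.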

Originally authored by @jonathanpopham — rebased and CI-fixed for merge.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a new skill command to display guidance for using code relationship data files.
Documentation
- Updated benchmark results with expanded performance comparisons (60% cheaper, 4× faster).
- Added documentation for code relationship graph file conventions and usage patterns.
Tests
- Enhanced test coverage for error handling scenarios.

coderabbitai · 2026-04-13T21:38:20Z

Caution

Review failed

The head commit changed during the review from bd084de to f3f4ebd.

Walkthrough

This PR introduces a new skill CLI command that outputs instructions to AI agents on how to use .graph.* sidecar files for understanding code relationships, accompanied by test coverage, documentation, and updated benchmark results reflecting a new 4-way comparison setup.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Skill Command & Tests `cmd/skill.go`, `cmd/skill_test.go` | New `skill` Cobra subcommand that prints an instruction prompt about `.graph.*` file conventions, sections, and usage patterns. Tests verify the prompt contains required keywords and has substantial content. |
| Benchmark Documentation `benchmark/CLAUDE.skill.md`, `benchmark/results/blog-post-draft.md`, `benchmark/results/summary.md` | New guide for graph file usage; updated benchmark narrative and results table expanding from 2-way to 4-way configuration comparison with revised metrics (60% cheaper, 4× faster, 55% fewer turns). |
| Error Handling Test `internal/find/zip_test.go` | Enhanced temp directory error test to set additional environment variables (`TMP`, `TEMP`) alongside `TMPDIR` for broader failure condition coverage. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

revert: remove skill command (needs redesign) #122: Directly reverts the skill command and skillPrompt added in this PR.
feat: add supermodel skill command for agent awareness prompt #119: Also adds the identical skill subcommand and skillPrompt to guide agents on .graph.* file usage.
fix(focus,find): fix relationship type case, empty CalledBy/Imports, stale API #68: Introduces the .graph.* sidecar files that this PR's documentation teaches agents how to consume.

Suggested reviewers

jonathanpopham

Poem

🧠 A skill to share with minds of code,
.graph.* files light the road,
Dependencies, calls, and impact shown,
Now AI knows what must be known. ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|------------|--------|-------------|------------|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description covers the main changes but doesn't follow the repository's template structure with explicit 'What', 'Why', and 'Test plan' sections. | Restructure the description using the template format: create explicit 'What' and 'Why' sections, and expand the 'Test plan' section to clarify implementation details. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Title check | ✅ Passed | The title 'feat: add supermodel skill command' clearly and concisely describes the main change: a new CLI command being added. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

internal/find/zip_test.go (1)
174-175: Update the test comment to match the new setup.

Line 175 says failure is due to invalid TMPDIR, but the test now also relies on TMP and TEMP. Small wording tweak will keep the intent clear.
Suggested comment tweak
-// os.CreateTemp fails due to an invalid TMPDIR.
+// os.CreateTemp fails due to invalid temp environment directories (TMPDIR/TMP/TEMP).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/find/zip_test.go` around lines 174 - 175, Update the test comment
for TestCreateZip_CreateTempError to reflect the new environment setup: mention
that the failure is caused by invalid temporary directory environment variables
(TMPDIR, TMP, and TEMP) rather than only TMPDIR; locate the comment above the
TestCreateZip_CreateTempError test in internal/find/zip_test.go and change the
wording to something like "createZip returns an error when os.CreateTemp fails
due to invalid temporary directory environment variables (TMPDIR, TMP, TEMP)" so
the comment matches the test's current dependencies.
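
For context on why all three variables matter: Go's `os.TempDir` reads `TMPDIR` on Unix-like systems, while on Windows it consults `TMP` and `TEMP`, so a test that only overrides `TMPDIR` would not force a failure on Windows. The sketch below illustrates the idea outside the test framework; the helper names `breakTempDir` and `createTempFails` are hypothetical and are not taken from `internal/find/zip_test.go`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tempVars lists the variables os.TempDir may consult: TMPDIR on Unix-like
// systems, TMP and TEMP on Windows.
var tempVars = []string{"TMPDIR", "TMP", "TEMP"}

// breakTempDir points every temp variable at bad and returns a function that
// restores the previous environment.
func breakTempDir(bad string) (restore func()) {
	type saved struct {
		value string
		set   bool
	}
	old := make(map[string]saved, len(tempVars))
	for _, key := range tempVars {
		value, set := os.LookupEnv(key)
		old[key] = saved{value, set}
		os.Setenv(key, bad)
	}
	return func() {
		for _, key := range tempVars {
			if old[key].set {
				os.Setenv(key, old[key].value)
			} else {
				os.Unsetenv(key)
			}
		}
	}
}

// createTempFails reports whether os.CreateTemp fails once the temp
// variables point at a directory that does not exist.
func createTempFails() bool {
	bad := filepath.Join(os.TempDir(), "sketch-missing-dir", "nested")
	defer breakTempDir(bad)()
	f, err := os.CreateTemp("", "sketch-*.zip")
	if err != nil {
		return true
	}
	f.Close()
	os.Remove(f.Name())
	return false
}

func main() {
	fmt.Println("CreateTemp failed as expected:", createTempFails())
}
```

Inside an actual test, `t.Setenv` is usually the simpler choice, since it restores each variable automatically when the test ends.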
cmd/skill.go (2)
9-19: Consider a single source of truth for the skill prompt text.

Right now this content is duplicated with benchmark/CLAUDE.skill.md, so it can drift over time. Worth centralizing or adding an equality check test between the two artifacts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/skill.go` around lines 9 - 19, The skillPrompt constant content is
duplicated in benchmark/CLAUDE.skill.md which can drift; centralize the
canonical text or add a test to assert equality. Either move the string out of
cmd/skill.go into a single shared resource (e.g., a new package-level file or
const in a shared package) and import/use it from cmd/skill.go and benchmark
code, or add a unit/integration test that reads benchmark/CLAUDE.skill.md and
compares it to the skillPrompt string (or vice versa) to fail the build on
drift; update references to the symbol skillPrompt and the
benchmark/CLAUDE.skill.md path accordingly.
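
The equality-check option could be sketched as below. This is only an illustration under assumptions: it compares two in-memory strings, whereas a real test would read `benchmark/CLAUDE.skill.md` with `os.ReadFile`, and it assumes the markdown file contains the prompt text and nothing else. The helper names `normalize` and `sameText` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize removes differences that should not count as drift:
// CRLF line endings and trailing whitespace at the end of the text.
func normalize(s string) string {
	s = strings.ReplaceAll(s, "\r\n", "\n")
	return strings.TrimRight(s, " \t\n")
}

// sameText reports whether two copies of the prompt match after normalization.
func sameText(a, b string) bool {
	return normalize(a) == normalize(b)
}

func main() {
	embedded := "Line one\nLine two\n" // stands in for the Go constant
	onDisk := "Line one\r\nLine two"   // stands in for the markdown file contents
	fmt.Println("in sync:", sameText(embedded, onDisk))
}
```

Normalizing line endings first keeps the check from failing on Windows checkouts where Git converts the markdown file to CRLF.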
31-33: Use Cobra's output redirection instead of fmt.Println.

Right now, fmt.Println writes directly to stdout, completely bypassing Cobra's output plumbing. This makes it impossible to redirect output in tests (via cmd.SetOut()) or in integrations where you might want to capture the command's output elsewhere.

The fix: Use either fmt.Fprintln(cmd.OutOrStdout(), skillPrompt) or—even simpler—cmd.Println(skillPrompt). Both respect the output redirection that tests and integrations expect.
♻️ Suggested patch
 		Args: cobra.NoArgs,
 		Run: func(cmd *cobra.Command, args []string) {
-			fmt.Println(skillPrompt)
+			cmd.Println(skillPrompt)
 		},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/skill.go` around lines 31 - 33, Replace the direct stdout write in the
Run func (the call to fmt.Println(skillPrompt)) with Cobra-aware output so
output redirection via cmd.SetOut()/tests works; update the Run closure in the
command where skillPrompt is printed to use either cmd.Println(skillPrompt) or
fmt.Fprintln(cmd.OutOrStdout(), skillPrompt) instead of fmt.Println.
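
The underlying idea is writer injection: the command writes to whatever `io.Writer` it is handed, and a test hands it a buffer. The sketch below shows that pattern with the standard library only, so it does not depend on Cobra; `printSkill` and `capture` are hypothetical names, and the prompt constant is a placeholder. With Cobra, `cmd.OutOrStdout()` would supply the writer.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// skillPrompt is a placeholder; the real prompt text is not shown here.
const skillPrompt = "placeholder prompt text"

// printSkill writes the prompt to out instead of writing to stdout directly.
func printSkill(out io.Writer) error {
	_, err := fmt.Fprintln(out, skillPrompt)
	return err
}

// capture runs printSkill against an in-memory buffer, as a test would.
func capture() string {
	var buf bytes.Buffer
	if err := printSkill(&buf); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	_ = printSkill(os.Stdout) // normal CLI behavior
	fmt.Printf("captured %d bytes\n", len(capture()))
}
```

One detail worth confirming against the Cobra version in use: `cmd.Println` is documented as falling back to stderr when no output writer has been set, so `fmt.Fprintln(cmd.OutOrStdout(), skillPrompt)` may be the option that keeps the prompt on stdout by default.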

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmark/results/blog-post-draft.md`:
- Around line 32-43: The downstream prose still contains old benchmark numbers
that conflict with the updated table; update any occurrences of the outdated
stats (for example the strings "13 turns, $0.22", "7 turns, $0.13", and "Net
result: 40% cheaper") so they match the table values (Naked Claude $0.30/20
turns/122s; + Supermodel (crafted) $0.12/9 turns/29s; + Supermodel (auto)
$0.15/11 turns/42s; Three-file shards $0.25/16 turns/73s) and replace the
summary line with the new aggregate claim "60% cheaper. 4× faster. 55% fewer
turns."; search the file for any other numeric mentions of cost/turns/duration
and reconcile them to these table values and recalculated percentages so all
prose is consistent with the table.

---

Nitpick comments:
In `@cmd/skill.go`:
- Around line 9-19: The skillPrompt constant content is duplicated in
benchmark/CLAUDE.skill.md which can drift; centralize the canonical text or add
a test to assert equality. Either move the string out of cmd/skill.go into a
single shared resource (e.g., a new package-level file or const in a shared
package) and import/use it from cmd/skill.go and benchmark code, or add a
unit/integration test that reads benchmark/CLAUDE.skill.md and compares it to
the skillPrompt string (or vice versa) to fail the build on drift; update
references to the symbol skillPrompt and the benchmark/CLAUDE.skill.md path
accordingly.
- Around line 31-33: Replace the direct stdout write in the Run func (the call
to fmt.Println(skillPrompt)) with Cobra-aware output so output redirection via
cmd.SetOut()/tests works; update the Run closure in the command where
skillPrompt is printed to use either cmd.Println(skillPrompt) or
fmt.Fprintln(cmd.OutOrStdout(), skillPrompt) instead of fmt.Println.

In `@internal/find/zip_test.go`:
- Around line 174-175: Update the test comment for TestCreateZip_CreateTempError
to reflect the new environment setup: mention that the failure is caused by
invalid temporary directory environment variables (TMPDIR, TMP, and TEMP) rather
than only TMPDIR; locate the comment above the TestCreateZip_CreateTempError
test in internal/find/zip_test.go and change the wording to something like
"createZip returns an error when os.CreateTemp fails due to invalid temporary
directory environment variables (TMPDIR, TMP, TEMP)" so the comment matches the
test's current dependencies.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 483b0074-426f-405a-81f9-b0bc1f449d35

📥 Commits

Reviewing files that changed from the base of the PR and between 999ac58 and de04ed2.

⛔ Files ignored due to path filters (1)

benchmark/results/benchmark_results.zip is excluded by !**/*.zip

📒 Files selected for processing (7)

benchmark/CLAUDE.skill.md
benchmark/results/blog-post-draft.md
benchmark/results/skill-v2.txt
benchmark/results/summary.md
cmd/skill.go
cmd/skill_test.go
internal/find/zip_test.go

coderabbitai · 2026-04-13T21:42:14Z

+|                     | Naked Claude | + Supermodel (crafted) | + Supermodel (auto) | Three-file shards |
+|---------------------|-------------|------------------------|---------------------|-------------------|
+| **Cost**            | $0.30       | $0.12                  | $0.15               | $0.25             |
+| **Turns**           | 20          | 9                      | 11                  | 16                |
+| **Duration**        | 122s        | 29s                    | 42s                 | 73s               |
+| **Tests passed**    | ✓ YES       | ✓ YES                  | ✓ YES               | ✓ YES             |

-**40% cheaper. 6 fewer turns. 72 seconds faster.**
+**60% cheaper. 4× faster. 55% fewer turns.**

-Both got the right answer. The only difference was how much digging each one had to do first.
+All four got the right answer. The only difference was how much digging each one had to do first.
+
+"Crafted" is a hand-written CLAUDE.md with Django-specific hints. "Auto" is what `supermodel skill` generates — a generic prompt that works on any repo. The auto prompt captured 83% of the crafted prompt's savings with zero manual effort.


⚠️ Potential issue | 🟠 Major

Benchmark numbers are now internally inconsistent with later narrative sections.

After updating this table, the body still reports older values (for example, Line 49 shows "13 turns, $0.22", Line 70 shows "7 turns, $0.13", and Line 103 says "Net result: 40% cheaper"). This makes the post read as contradictory and weakens credibility.

Please align the downstream prose with this updated table before publishing.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@benchmark/results/blog-post-draft.md` around lines 32 - 43, The downstream prose still contains old benchmark numbers that conflict with the updated table; update any occurrences of the outdated stats (for example the strings "13 turns, $0.22", "7 turns, $0.13", and "Net result: 40% cheaper") so they match the table values (Naked Claude $0.30/20 turns/122s; + Supermodel (crafted) $0.12/9 turns/29s; + Supermodel (auto) $0.15/11 turns/42s; Three-file shards $0.25/16 turns/73s) and replace the summary line with the new aggregate claim "60% cheaper. 4× faster. 55% fewer turns."; search the file for any other numeric mentions of cost/turns/duration and reconcile them to these table values and recalculated percentages so all prose is consistent with the table.
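
When reconciling the prose, the headline percentages can be recomputed directly from the table. The sketch below does that arithmetic using the table's numbers; the function names are illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

// pctReduction returns how much smaller after is than before,
// as a whole-number percentage.
func pctReduction(before, after float64) int {
	return int(math.Round((before - after) / before * 100))
}

// speedup returns how many times faster after is than before.
func speedup(before, after float64) float64 {
	return before / after
}

// savingsShare returns the share of the crafted prompt's cost savings
// that the auto prompt also achieved, as a whole-number percentage.
func savingsShare(baseline, crafted, auto float64) int {
	return int(math.Round((baseline - auto) / (baseline - crafted) * 100))
}

func main() {
	// Values from the results table: Naked Claude vs. Supermodel (crafted/auto).
	fmt.Printf("crafted cost: %d%% cheaper\n", pctReduction(0.30, 0.12))
	fmt.Printf("crafted turns: %d%% fewer\n", pctReduction(20, 9))
	fmt.Printf("crafted duration: %.1fx faster\n", speedup(122, 29))
	fmt.Printf("auto cost: %d%% cheaper\n", pctReduction(0.30, 0.15))
	fmt.Printf("auto share of crafted savings: %d%%\n", savingsShare(0.30, 0.12, 0.15))
}
```

These reproduce the figures quoted in the diff: 60% cheaper, 55% fewer turns, roughly 4.2× faster for the crafted prompt, and 83% of the crafted prompt's savings for the auto prompt.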

Revised the generic skill prompt based on benchmark trace analysis. Three changes: teach the .graph naming convention so agents construct paths directly, bold the read-order directive, and tell agents to check graph files before grepping for structure.

Skill v2: $0.11, 31s, 7 turns (was $0.15, 42s, 11 turns)
Matches Grey's hand-crafted Django prompt: $0.12, 29s, 9 turns

Locks in the six key elements that drove benchmark results: graph extension, three section names, naming convention example, and read-order directive.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Section headers and cost percentages updated to match the results table:
- Naked: 20 turns, $0.30
- Auto prompt: 11 turns, $0.15 (50% cheaper)
- Crafted: 9 turns, $0.12 (60% cheaper)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

greynewell requested a review from jonathanpopham as a code owner April 13, 2026 21:38

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

jonathanpopham and others added 3 commits April 13, 2026 17:44

test: add skill prompt regression tests

eecc084

Locks in the six key elements that drove benchmark results: graph extension, three section names, naming convention example, and read-order directive.

fix(ci): add TMP/TEMP alongside TMPDIR in find zip_test for Windows

bd084de

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

greynewell force-pushed the feat/skill-command branch from de04ed2 to bd084de Compare April 13, 2026 21:44

greynewell merged commit fd65e1c into main Apr 13, 2026
6 checks passed

coderabbitai Bot mentioned this pull request Apr 30, 2026

CLI cleanup: production-ready file-mode workflow #167

Merged

greynewell merged 4 commits into main from feat/skill-command
