Automated QA system for Claude Code skills with screenshots and scoring.
╔═══════════════════════════════════════════════════════════════════════════════╗
║ ║
║ ███████╗██╗ ██╗██╗██╗ ██╗ ████████╗███████╗███████╗████████╗ ║
║ ██╔════╝██║ ██╔╝██║██║ ██║ ╚══██╔══╝██╔════╝██╔════╝╚══██╔══╝ ║
║ ███████╗█████╔╝ ██║██║ ██║ ██║ █████╗ ███████╗ ██║ ║
║ ╚════██║██╔═██╗ ██║██║ ██║ ██║ ██╔══╝ ╚════██║ ██║ ║
║ ███████║██║ ██╗██║███████╗███████╗ ██║ ███████╗███████║ ██║ ║
║ ╚══════╝╚═╝ ╚═╝╚═╝╚══════╝╚══════╝ ╚═╝ ╚══════╝╚══════╝ ╚═╝ ║
║ ║
║ 🧪 Automated Skill Testing Swarm 🧪 ║
║ ║
╚═══════════════════════════════════════════════════════════════════════════════╝
An 8-agent swarm that automatically:
- Discovers skills from GitHub and skill marketplaces
- Installs each skill in an isolated environment
- Generates realistic test prompts based on triggers
- Executes tests and captures all output
- Takes screenshots of terminal output and file results
- Evaluates quality with objective scoring (A-F grades)
- Generates Markdown, JSON, and HTML reports with pass/fail results
- Publishes an interactive dashboard to GitHub Pages
┌─────────────────────────────────────────────────────────────────────────────┐
│ SKILL TESTER SWARM PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Agent 0: Skill Discoverer │
│ │ Crawl GitHub, SkillsMP for skills │
│ ▼ │
│ Agent 1: Skill Installer │
│ │ Install to temp directory │
│ ▼ │
│ Agent 2: Test Generator │
│ │ Create test prompts from triggers │
│ ▼ │
│ Agent 3: Test Executor │
│ │ Run tests, capture output │
│ ▼ │
│ Agent 4: Screenshot Capturer │
│ │ Terminal + file screenshots │
│ ▼ │
│ Agent 5: Quality Evaluator │
│ │ Score and grade each skill │
│ ▼ │
│ Agent 6: Report Generator │
│ │ Markdown, JSON, HTML reports │
│ ▼ │
│ Agent 7: Dashboard Publisher │
│ Deploy to GitHub Pages │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
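The pipeline above is strictly sequential, so each agent only runs once the previous one has succeeded. A minimal sketch of that control flow follows. The `STAGES` array mirrors the agent names in the diagram, while `run_stage` and `run_pipeline` are hypothetical names used for illustration; the actual `scripts/orchestrator.sh` may be organized differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run the eight stages in order, stop at the first failure.

STAGES=(
  "skill-discoverer"
  "skill-installer"
  "test-generator"
  "test-executor"
  "screenshot-capturer"
  "quality-evaluator"
  "report-generator"
  "dashboard-publisher"
)

# Placeholder stage runner (assumption): a real stage would hand the matching
# agents/agent-N-NAME.md definition to whatever executes the agent.
run_stage() {
  local index="$1" name="$2"
  echo "agent-${index}-${name}"
}

# Walk the stages by index; a non-zero exit from any stage aborts the run.
run_pipeline() {
  local i
  for i in "${!STAGES[@]}"; do
    run_stage "$i" "${STAGES[$i]}" || {
      echo "stage ${i} (${STAGES[$i]}) failed" >&2
      return 1
    }
  done
}

# Run the pipeline only when executed directly, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  run_pipeline
fi
```

Stopping at the first failure keeps later agents from working on missing input, for example evaluating a skill that never installed.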
# Clone the repository
git clone https://github.com/username/skill-tester-swarm
cd skill-tester-swarm
# Make scripts executable
chmod +x scripts/*.sh
# Run the swarm
./scripts/orchestrator.sh run
# View results (open is the macOS command; on Linux use xdg-open)
open reports/index.html
For each test, the swarm captures:
- Terminal output - The skill's response
- File previews - Any files created (PDF, images, etc.)
- Summary cards - Visual test summary
| Grade | Score | Meaning |
|---|---|---|
| A | 90-100 | Production ready |
| B | 80-89 | Good, minor issues |
| C | 70-79 | Functional, needs work |
| D | 60-69 | Unstable |
| F | 0-59 | Not recommended |
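The grade bands in the table can be expressed as a small lookup. This is a sketch only: `grade_for_score` is a hypothetical helper name, and it assumes that fractional scores such as 85.3 are graded by their integer part, so that 89.9 still falls in the 80-89 band.

```bash
# Hypothetical helper: map a 0-100 score to the letter grades in the table.
grade_for_score() {
  local score="$1"
  # Keep only the integer part, since bash arithmetic has no floating point.
  local whole="${score%%.*}"
  whole="${whole:-0}"
  # The 10# prefix forces base 10 so an input like "08" is not read as octal.
  if   (( 10#$whole >= 90 )); then echo "A"
  elif (( 10#$whole >= 80 )); then echo "B"
  elif (( 10#$whole >= 70 )); then echo "C"
  elif (( 10#$whole >= 60 )); then echo "D"
  else echo "F"
  fi
}
```

For example, `grade_for_score 85.3` prints `B` and `grade_for_score 94.2` prints `A`.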
Interactive web dashboard with:
- Search and filter skills
- Click to view screenshots
- Download JSON reports
- Grade distribution charts
skill-tester-swarm/
├── agents/ # Agent definitions
│ ├── agent-0-skill-discoverer.md
│ ├── agent-1-skill-installer.md
│ ├── agent-2-test-generator.md
│ ├── agent-3-test-executor.md
│ ├── agent-4-screenshot-capturer.md
│ ├── agent-5-quality-evaluator.md
│ ├── agent-6-report-generator.md
│ └── agent-7-dashboard-publisher.md
├── config/
│ └── swarm-config.json # Configuration
├── scripts/
│ └── orchestrator.sh # Main runner
├── workspace/ # Runtime data
├── screenshots/ # Captured screenshots
├── reports/ # Generated reports
└── dashboard/ # Published dashboard
Edit config/swarm-config.json:
{
"discovery": {
"sources": {
"github_search": true,
"skillsmp": true
}
},
"test_generation": {
"tests_per_skill": 7,
"categories": ["basic", "standard", "complex", "edge", "error"]
},
"screenshots": {
"enabled": true,
"types": ["terminal", "file_preview", "summary_card"]
},
"evaluation": {
"grades": {
"A": { "min": 90 },
"B": { "min": 80 },
"C": { "min": 70 },
"D": { "min": 60 },
"F": { "min": 0 }
}
}
}
# Skill: pdf (Grade: B - 85.3)
## Tests
| Test | Status | Score | Screenshot |
|------|--------|-------|------------|
| basic | ✅ PASS | 92 | [View] |
| standard | ✅ PASS | 88 | [View] |
| complex | ✅ PASS | 85 | [View] |
| edge | ✅ PASS | 75 | [View] |
| error | ⚠️ WARN | 68 | [View] |
## Screenshots
[Terminal Output] [PDF Preview] [Summary Card]
┌──────────────────────────────────────────────────────────────────────────┐
│ 🧪 SKILL TESTER DASHBOARD │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ 68 Skills 88.2% Pass 78.5 Avg Score 25 Grade A │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │ xlsx │ │ pdf │ │ docx │ │github│ │vercel│ │
│ │ A │ │ B │ │ B │ │ B │ │ B │ │
│ │ 94.2 │ │ 85.3 │ │ 82.1 │ │ 81.5 │ │ 80.2 │ │
│ └──────┘ └──────┘ └──────┘ └──────┘ └──────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────┘
- Skill Authors - Test your skill before publishing
- Marketplace Curators - Validate skill quality
- Teams - Ensure internal skills meet standards
- CI/CD - Automated regression testing
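For the CI/CD use case, one possible gate is to fail the build when a skill's score drops below a chosen minimum. The sketch below assumes the score has already been extracted as a plain number. `meets_min_score` is a hypothetical helper, and how the score is read from the swarm's JSON report is not specified here.

```bash
# Hypothetical CI gate: exit status 0 when the score meets the minimum,
# non-zero otherwise. Both arguments are plain numbers, e.g. 85.3 and 80.
meets_min_score() {
  local score="$1" minimum="$2"
  # awk does the decimal comparison that bash integer arithmetic cannot.
  awk -v s="$score" -v m="$minimum" 'BEGIN { exit !(s + 0 >= m + 0) }'
}
```

A CI step could then run `meets_min_score "$score" 80 || exit 1`, where `$score` comes from whatever report parsing the pipeline already does.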
Licensed under the MIT License.