niveshdandyan/skill-tester-swarm

Skill Tester Swarm

Automated QA system for Claude Code skills with screenshots and scoring.

╔═══════════════════════════════════════════════════════════════════════════════╗
║                                                                                ║
║   ███████╗██╗  ██╗██╗██╗     ██╗         ████████╗███████╗███████╗████████╗   ║
║   ██╔════╝██║ ██╔╝██║██║     ██║         ╚══██╔══╝██╔════╝██╔════╝╚══██╔══╝   ║
║   ███████╗█████╔╝ ██║██║     ██║            ██║   █████╗  ███████╗   ██║      ║
║   ╚════██║██╔═██╗ ██║██║     ██║            ██║   ██╔══╝  ╚════██║   ██║      ║
║   ███████║██║  ██╗██║███████╗███████╗       ██║   ███████╗███████║   ██║      ║
║   ╚══════╝╚═╝  ╚═╝╚═╝╚══════╝╚══════╝       ╚═╝   ╚══════╝╚══════╝   ╚═╝      ║
║                                                                                ║
║                    🧪 Automated Skill Testing Swarm 🧪                         ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝

What It Does

An 8-agent swarm that automatically:

  1. Discovers skills from GitHub and skill marketplaces
  2. Installs each skill in an isolated environment
  3. Generates realistic test prompts based on triggers
  4. Executes tests and captures all output
  5. Takes screenshots of terminal output and file results
  6. Evaluates quality with objective scoring (A-F grades)
  7. Generates comprehensive reports with pass/fail
  8. Publishes an interactive dashboard to GitHub Pages

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                        SKILL TESTER SWARM PIPELINE                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  Agent 0: Skill Discoverer                                                  │
│      │    Crawl GitHub, SkillsMP for skills                                 │
│      ▼                                                                       │
│  Agent 1: Skill Installer                                                   │
│      │    Install to temp directory                                         │
│      ▼                                                                       │
│  Agent 2: Test Generator                                                    │
│      │    Create test prompts from triggers                                 │
│      ▼                                                                       │
│  Agent 3: Test Executor                                                     │
│      │    Run tests, capture output                                         │
│      ▼                                                                       │
│  Agent 4: Screenshot Capturer                                               │
│      │    Terminal + file screenshots                                       │
│      ▼                                                                       │
│  Agent 5: Quality Evaluator                                                 │
│      │    Score and grade each skill                                        │
│      ▼                                                                       │
│  Agent 6: Report Generator                                                  │
│      │    Markdown, JSON, HTML reports                                      │
│      ▼                                                                       │
│  Agent 7: Dashboard Publisher                                               │
│           Deploy to GitHub Pages                                            │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
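The diagram describes a strictly sequential hand-off: each agent consumes what the previous one produced. A minimal sketch of that control flow follows. The stage names come from the diagram, but `make_stage`, `run_pipeline`, and the shared `context` dictionary are hypothetical stand-ins, not the repository's actual orchestrator.

```python
from typing import Callable

# A stage takes the shared context and returns an updated copy (assumed shape).
Stage = Callable[[dict], dict]

# Stage names follow the agents in the diagram, in order.
STAGE_NAMES = [
    "skill-discoverer",
    "skill-installer",
    "test-generator",
    "test-executor",
    "screenshot-capturer",
    "quality-evaluator",
    "report-generator",
    "dashboard-publisher",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(context: dict) -> dict:
        updated = dict(context)
        updated["completed"] = context.get("completed", []) + [name]
        return updated
    return stage


def run_pipeline(stages: list[Stage], context: dict) -> dict:
    """Run the stages in order, feeding each one the previous stage's output."""
    for stage in stages:
        context = stage(context)
    return context


result = run_pipeline([make_stage(name) for name in STAGE_NAMES], {})
print(result["completed"])
```

In a real run each placeholder would be replaced by the work described in the matching file under agents/.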

Quick Start

# Clone the repository
git clone https://github.com/niveshdandyan/skill-tester-swarm
cd skill-tester-swarm

# Make scripts executable
chmod +x scripts/*.sh

# Run the swarm
./scripts/orchestrator.sh run

# View results (macOS; on Linux use xdg-open instead of open)
open reports/index.html

Output

Screenshots

For each test, the swarm captures:

  • Terminal output - The skill's response
  • File previews - Any files created (PDF, images, etc.)
  • Summary cards - Visual test summary

Scoring

| Grade | Score | Meaning |
|-------|-------|---------|
| A | 90-100 | Production ready |
| B | 80-89 | Good, minor issues |
| C | 70-79 | Functional, needs work |
| D | 60-69 | Unstable |
| F | 0-59 | Not recommended |
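As an illustration of how the bands above map a numeric score to a letter, here is a small sketch. The function name `grade_for` is hypothetical, and treating each band's lower bound as a threshold (so that a fractional score such as 89.5 falls in B) is an assumption, since the table only lists whole numbers.

```python
# Lower bound of each band from the scoring table, highest first.
GRADE_FLOORS = (("A", 90), ("B", 80), ("C", 70), ("D", 60))


def grade_for(score: float) -> str:
    """Map a 0-100 score to a letter grade using the table's lower bounds."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for letter, floor in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"  # anything below 60


print(grade_for(85.3))
```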

Dashboard

Interactive web dashboard with:

  • Search and filter skills
  • Click to view screenshots
  • Download JSON reports
  • Grade distribution charts

Directory Structure

skill-tester-swarm/
├── agents/                    # Agent definitions
│   ├── agent-0-skill-discoverer.md
│   ├── agent-1-skill-installer.md
│   ├── agent-2-test-generator.md
│   ├── agent-3-test-executor.md
│   ├── agent-4-screenshot-capturer.md
│   ├── agent-5-quality-evaluator.md
│   ├── agent-6-report-generator.md
│   └── agent-7-dashboard-publisher.md
├── config/
│   └── swarm-config.json      # Configuration
├── scripts/
│   └── orchestrator.sh        # Main runner
├── workspace/                 # Runtime data
├── screenshots/               # Captured screenshots
├── reports/                   # Generated reports
└── dashboard/                 # Published dashboard

Configuration

Edit config/swarm-config.json:

{
  "discovery": {
    "sources": {
      "github_search": true,
      "skillsmp": true
    }
  },
  "test_generation": {
    "tests_per_skill": 7,
    "categories": ["basic", "standard", "complex", "edge", "error"]
  },
  "screenshots": {
    "enabled": true,
    "types": ["terminal", "file_preview", "summary_card"]
  },
  "evaluation": {
    "grades": {
      "A": { "min": 90 },
      "B": { "min": 80 },
      "C": { "min": 70 },
      "D": { "min": 60 },
      "F": { "min": 0 }
    }
  }
}
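To show how the `evaluation.grades` section could be consumed, the sketch below parses it and returns the thresholds ordered from highest to lowest. The helper `load_thresholds` is hypothetical, the JSON is an inline copy of only the `evaluation` section, and the check that the lowest grade starts at 0 is an assumption about what a sensible config looks like, not documented behavior.

```python
import json

# Inline copy of the "evaluation" section shown above; a real run would read
# config/swarm-config.json from disk instead.
CONFIG_TEXT = """
{
  "evaluation": {
    "grades": {
      "A": { "min": 90 },
      "B": { "min": 80 },
      "C": { "min": 70 },
      "D": { "min": 60 },
      "F": { "min": 0 }
    }
  }
}
"""


def load_thresholds(config_text: str) -> list[tuple[str, float]]:
    """Return (grade, minimum score) pairs, highest minimum first."""
    grades = json.loads(config_text)["evaluation"]["grades"]
    pairs = [(letter, spec["min"]) for letter, spec in grades.items()]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    # Assumed sanity check: without a floor of 0, some scores would get no grade.
    if pairs[-1][1] != 0:
        raise ValueError("the lowest grade should have a minimum of 0")
    return pairs


thresholds = load_thresholds(CONFIG_TEXT)
print(thresholds)
```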

Sample Output

Test Report

# Skill: pdf (Grade: B - 85.3)

## Tests

| Test | Status | Score | Screenshot |
|------|--------|-------|------------|
| basic | ✅ PASS | 92 | [View] |
| standard | ✅ PASS | 88 | [View] |
| complex | ✅ PASS | 85 | [View] |
| edge | ✅ PASS | 75 | [View] |
| error | ⚠️ WARN | 68 | [View] |

## Screenshots

[Terminal Output] [PDF Preview] [Summary Card]
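The table in the sample report is regular enough to generate from structured results. The following sketch renders rows in the same layout; the `CaseResult` class, the `render_table` function, and the status-to-icon mapping are hypothetical and inferred only from the sample above. How the overall skill score (85.3 in the sample) is derived from the per-test scores is not documented, so the sketch does not attempt it.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    category: str  # e.g. "basic", "edge"
    status: str    # "PASS" or "WARN", the two statuses visible in the sample
    score: int


# Icons as they appear in the sample report; other statuses get no icon.
STATUS_ICONS = {"PASS": "✅", "WARN": "⚠️"}


def render_table(results: list[CaseResult]) -> str:
    """Render results as a Markdown table matching the sample report layout."""
    lines = [
        "| Test | Status | Score | Screenshot |",
        "|------|--------|-------|------------|",
    ]
    for result in results:
        icon = STATUS_ICONS.get(result.status, "")
        status = f"{icon} {result.status}".strip()
        lines.append(f"| {result.category} | {status} | {result.score} | [View] |")
    return "\n".join(lines)


print(render_table([CaseResult("basic", "PASS", 92), CaseResult("error", "WARN", 68)]))
```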

Dashboard Preview

┌──────────────────────────────────────────────────────────────────────────┐
│  🧪 SKILL TESTER DASHBOARD                                               │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   68 Skills    88.2% Pass    78.5 Avg Score    25 Grade A               │
│                                                                          │
│   ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐                         │
│   │ xlsx │ │ pdf  │ │ docx │ │github│ │vercel│                         │
│   │  A   │ │  B   │ │  B   │ │  B   │ │  B   │                         │
│   │ 94.2 │ │ 85.3 │ │ 82.1 │ │ 81.5 │ │ 80.2 │                         │
│   └──────┘ └──────┘ └──────┘ └──────┘ └──────┘                         │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
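The headline figures in the preview (skill count, pass rate, average score, number of grade A skills) are simple aggregates. A sketch of how they could be computed is below. The `summarize` function and the record fields are hypothetical, and the `passed` flag is invented for illustration; the preview does not say whether the pass rate is counted per skill or per test. The scores and grades reuse the five cards shown in the preview.

```python
def summarize(skills: list[dict]) -> dict:
    """Aggregate per-skill records into dashboard headline numbers."""
    total = len(skills)
    if total == 0:
        return {"skills": 0, "pass_rate": 0.0, "avg_score": 0.0, "grade_a": 0}
    passed = sum(1 for skill in skills if skill["passed"])
    return {
        "skills": total,
        "pass_rate": round(100 * passed / total, 1),
        "avg_score": round(sum(skill["score"] for skill in skills) / total, 1),
        "grade_a": sum(1 for skill in skills if skill["grade"] == "A"),
    }


# Names, grades, and scores from the preview cards; "passed" is assumed.
cards = [
    {"name": "xlsx", "grade": "A", "score": 94.2, "passed": True},
    {"name": "pdf", "grade": "B", "score": 85.3, "passed": True},
    {"name": "docx", "grade": "B", "score": 82.1, "passed": True},
    {"name": "github", "grade": "B", "score": 81.5, "passed": True},
    {"name": "vercel", "grade": "B", "score": 80.2, "passed": True},
]
summary = summarize(cards)
print(summary)
```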

Use Cases

  1. Skill Authors - Test your skill before publishing
  2. Marketplace Curators - Validate skill quality
  3. Teams - Ensure internal skills meet standards
  4. CI/CD - Automated regression testing
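For the CI/CD use case, one possible approach is a gate that fails the build when any skill scores below a chosen minimum. The sketch below is hypothetical: the function names, the input shape (skill name mapped to score), and the default minimum of 70 (the lower bound of grade C) are all assumptions rather than features of the swarm.

```python
def failing_skills(scores: dict[str, float], minimum: float = 70.0) -> list[str]:
    """Return the names of skills scoring below the minimum, sorted by name."""
    return sorted(name for name, score in scores.items() if score < minimum)


def exit_code(scores: dict[str, float], minimum: float = 70.0) -> int:
    """Return 1 if any skill is below the minimum, else 0 (shell convention)."""
    return 1 if failing_skills(scores, minimum) else 0


scores = {"xlsx": 94.2, "pdf": 85.3, "draft-skill": 58.0}
print(failing_skills(scores), exit_code(scores))
```

A CI job could pass the result to `sys.exit` so that a regression marks the build as failed.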

License

MIT
