Commit e1427d9

ahpaleus, claude, and dguido authored
Add semgrep-rule-variant-creator skill (trailofbits#17)
* Add semgrep-rule-variant-creator skill

  New skill for porting existing Semgrep rules to target languages with proper applicability analysis and test-driven validation. Each target language goes through an independent 4-phase cycle: applicability analysis, test creation, rule creation, and validation.

  Key features:
  - Dynamic applicability analysis per target language
  - Test-first approach with language-specific test cases
  - AST-based pattern translation guidance
  - Detailed workflow with troubleshooting examples

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Clarify usage examples in README

  - Explicitly mention "Semgrep rule" in usage examples
  - Add more example invocations for skill triggering

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Minor wording fixes in SKILL.md

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add per-language pattern examples link to documentation

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add pattern examples link to language-syntax-guide

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Minor cleanup and wording fixes

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Clarify taint mode documentation in references

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add reference to semgrep-rule-creator as foundational knowledge

  The semgrep-rule-creator skill is the authoritative source for rule creation fundamentals (taint mode, test-first, anti-patterns, etc.). This skill focuses on porting, but the core principles remain the same.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add semgrep-rule-variant-creator to README.md

  Add plugin entry to the Code Auditing section.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Dan Guido <dan@trailofbits.com>
1 parent fd46c1c commit e1427d9

8 files changed

+1403
-0
lines changed

.claude-plugin/marketplace.json

Lines changed: 10 additions & 0 deletions
@@ -125,6 +125,16 @@
       },
       "source": "./plugins/semgrep-rule-creator"
     },
+    {
+      "name": "semgrep-rule-variant-creator",
+      "version": "1.0.0",
+      "description": "Creates language variants of existing Semgrep rules with proper applicability analysis and test-driven validation",
+      "author": {
+        "name": "Maciej Domanski",
+        "email": "opensource@trailofbits.com"
+      },
+      "source": "./plugins/semgrep-rule-variant-creator"
+    },
     {
       "name": "sharp-edges",
       "version": "1.0.0",

README.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ cd /path/to/parent # e.g., if repo is at ~/projects/skills, be in ~/projects
 | [burpsuite-project-parser](plugins/burpsuite-project-parser/) | Search and extract data from Burp Suite project files |
 | [differential-review](plugins/differential-review/) | Security-focused differential review of code changes with git history analysis |
 | [semgrep-rule-creator](plugins/semgrep-rule-creator/) | Create and refine Semgrep rules for custom vulnerability detection |
+| [semgrep-rule-variant-creator](plugins/semgrep-rule-variant-creator/) | Port existing Semgrep rules to new target languages with test-driven validation |
 | [sharp-edges](plugins/sharp-edges/) | Identify error-prone APIs, dangerous configurations, and footgun designs |
 | [static-analysis](plugins/static-analysis/) | Static analysis toolkit with CodeQL, Semgrep, and SARIF parsing |
 | [testing-handbook-skills](plugins/testing-handbook-skills/) | Skills from the [Testing Handbook](https://appsec.guide): fuzzers, static analysis, sanitizers, coverage |
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
+{
+  "name": "semgrep-rule-variant-creator",
+  "version": "1.0.0",
+  "description": "Creates language variants of existing Semgrep rules with proper applicability analysis and test-driven validation",
+  "author": {
+    "name": "Maciej Domanski",
+    "email": "opensource@trailofbits.com"
+  }
+}
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
# Semgrep Rule Variant Creator
2+
3+
A Claude Code skill for porting existing Semgrep rules to new target languages with proper applicability analysis and test-driven validation.
4+
5+
## Overview
6+
7+
This skill takes an existing Semgrep rule and one or more target languages, then generates independent rule variants for each applicable language. Each variant goes through a complete 4-phase cycle:
8+
9+
1. **Applicability Analysis** - Determine if the vulnerability pattern applies to the target language
10+
2. **Test Creation** - Write tests first, covering both vulnerable and safe cases
11+
3. **Rule Creation** - Translate patterns and adapt for target language idioms
12+
4. **Validation** - Ensure all tests pass before proceeding
13+
14+
## Prerequisites
15+
16+
- [Semgrep](https://semgrep.dev/docs/getting-started/) installed and available in PATH
17+
- Existing Semgrep rule to port (in YAML)
18+
- One or more target languages specified
19+
20+
## Usage
21+
22+
Invoke the skill when you want to port an existing Semgrep rule:
23+
24+
```
25+
Port the sql-injection.yaml Semgrep rule to Go and Java
26+
```
27+
28+
```
29+
Create Semgrep rule variants of my-rule.yaml for TypeScript, Rust, and C#
30+
```
31+
32+
```
33+
Create the same Semgrep rule for JavaScript and Ruby
34+
```
35+
36+
```
37+
Port this Semgrep rule to Golang
38+
```
39+
40+
## Output Structure
41+
42+
For each applicable target language, the skill produces:
43+
44+
```
45+
<original-rule-id>-<language>/
46+
├── <original-rule-id>-<language>.yaml # Ported rule
47+
└── <original-rule-id>-<language>.<ext> # Test file
48+
```
49+
50+
## Example
51+
52+
**Input:**
53+
- Rule: `python-command-injection.yaml`
54+
- Target languages: Go, Java
55+
56+
**Output:**
57+
```
58+
python-command-injection-golang/
59+
├── python-command-injection-golang.yaml
60+
└── python-command-injection-golang.go
61+
62+
python-command-injection-java/
63+
├── python-command-injection-java.yaml
64+
└── python-command-injection-java.java
65+
```
66+
67+
## Key Differences from semgrep-rule-creator
68+
69+
| Aspect | semgrep-rule-creator | semgrep-rule-variant-creator |
70+
|--------|---------------------|------------------------------|
71+
| Input | Bug pattern description | Existing rule + target languages |
72+
| Output | Single rule+test | Multiple rule+test directories |
73+
| Workflow | Single creation cycle | Independent cycle per language |
74+
| Phase 1 | Problem analysis | Applicability analysis |
75+
76+
## Skill Files
77+
78+
- `skills/semgrep-rule-variant-creator/SKILL.md` - Main entry point
79+
- `skills/semgrep-rule-variant-creator/references/applicability-analysis.md` - Phase 1 guidance
80+
- `skills/semgrep-rule-variant-creator/references/language-syntax-guide.md` - Pattern translation guidance
81+
- `skills/semgrep-rule-variant-creator/references/workflow.md` - Detailed 4-phase workflow
82+
83+
## Related Skills
84+
85+
- **semgrep-rule-creator** - Create new Semgrep rules from scratch
86+
- **static-analysis** - Run existing Semgrep rules against code
Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
1+
---
2+
name: semgrep-rule-variant-creator
3+
description: Creates language variants of existing Semgrep rules. Use when porting a Semgrep rule to specified target languages. Takes an existing rule and target languages as input, produces independent rule+test directories for each language.
4+
allowed-tools:
5+
- Bash
6+
- Read
7+
- Write
8+
- Edit
9+
- Glob
10+
- Grep
11+
- WebFetch
12+
---
13+
14+
# Semgrep Rule Variant Creator
15+
16+
Port existing Semgrep rules to new target languages with proper applicability analysis and test-driven validation.
17+
18+
## When to Use
19+
20+
**Ideal scenarios:**
21+
- Porting an existing Semgrep rule to one or more target languages
22+
- Creating language-specific variants of a universal vulnerability pattern
23+
- Expanding rule coverage across a polyglot codebase
24+
- Translating rules between languages with equivalent constructs
25+
26+
## When NOT to Use
27+
28+
Do NOT use this skill for:
29+
- Creating a new Semgrep rule from scratch (use `semgrep-rule-creator` instead)
30+
- Running existing rules against code
31+
- Languages where the vulnerability pattern fundamentally doesn't apply
32+
- Minor syntax variations within the same language
33+
34+
## Input Specification
35+
36+
This skill requires:
37+
1. **Existing Semgrep rule** - YAML file path or YAML rule content
38+
2. **Target languages** - One or more languages to port to (e.g., "Golang and Java")
39+
40+
## Output Specification
41+
42+
For each applicable target language, produces:
43+
```
44+
<original-rule-id>-<language>/
45+
├── <original-rule-id>-<language>.yaml # Ported Semgrep rule
46+
└── <original-rule-id>-<language>.<ext> # Test file with annotations
47+
```
48+
49+
Example output for porting `sql-injection` to Go and Java:
50+
```
51+
sql-injection-golang/
52+
├── sql-injection-golang.yaml
53+
└── sql-injection-golang.go
54+
55+
sql-injection-java/
56+
├── sql-injection-java.yaml
57+
└── sql-injection-java.java
58+
```
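The naming scheme above can be sketched in Go. This is a minimal illustration, not part of the skill: `variantPaths` is a hypothetical helper, and the caller supplies both the language suffix (for example `golang`) and the file extension, since the document does not define a mapping between them.

```go
package main

import (
	"fmt"
	"path"
)

// variantPaths returns the directory, rule file, and test file for one
// ported rule, following the <original-rule-id>-<language> layout.
// langSuffix and ext are chosen by the caller (e.g. "golang" and "go");
// this sketch makes no attempt to derive one from the other.
func variantPaths(ruleID, langSuffix, ext string) (dir, rule, test string) {
	name := ruleID + "-" + langSuffix
	dir = name
	rule = path.Join(dir, name+".yaml")
	test = path.Join(dir, name+"."+ext)
	return dir, rule, test
}

func main() {
	dir, rule, test := variantPaths("sql-injection", "golang", "go")
	fmt.Println(dir)
	fmt.Println(rule)
	fmt.Println(test)
}
```

Keeping the rule ID, directory name, and file stem identical means the `ruleid:` and `ok:` annotations in the test file can reuse the same string.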
59+
60+
## Rationalizations to Reject
61+
62+
When porting Semgrep rules, reject these common shortcuts:
63+
64+
| Rationalization | Why It Fails | Correct Approach |
65+
|-----------------|--------------|------------------|
66+
| "Pattern structure is identical" | Different ASTs across languages | Always dump AST for target language |
67+
| "Same vulnerability, same detection" | Data flow differs between languages | Analyze target language idioms |
68+
| "Rule doesn't need tests since original worked" | Language edge cases differ | Write NEW test cases for target |
69+
| "Skip applicability - it obviously applies" | Some patterns are language-specific | Complete applicability analysis first |
70+
| "I'll create all variants then test" | Errors compound, hard to debug | Complete full cycle per language |
71+
| "Library equivalent is close enough" | Surface similarity hides differences | Verify API semantics match |
72+
| "Just translate the syntax 1:1" | Languages have different idioms | Research target language patterns |
73+
74+
## Strictness Level
75+
76+
This workflow is **strict** - do not skip steps:
77+
- **Applicability analysis is mandatory**: Don't assume patterns translate
78+
- **Each language is independent**: Complete full cycle before moving to next
79+
- **Test-first for each variant**: Never write a rule without test cases
80+
- **100% test pass required**: "Most tests pass" is not acceptable
81+
82+
## Overview
83+
84+
This skill guides the creation of language-specific variants of existing Semgrep rules. Each target language goes through an independent 4-phase cycle:
85+
86+
```
87+
FOR EACH target language:
88+
Phase 1: Applicability Analysis → Verdict
89+
Phase 2: Test Creation (Test-First)
90+
Phase 3: Rule Creation
91+
Phase 4: Validation
92+
(Complete full cycle before moving to next language)
93+
```
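The loop above can be read as the following Go sketch. The `phase` type, `portAll`, and `errNotApplicable` are hypothetical names introduced only for illustration; the point is the ordering, where every phase for one language finishes (or the language is skipped) before the next language starts.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotApplicable is a hypothetical sentinel that a Phase 1 callback can
// return to skip a language without failing the whole run.
var errNotApplicable = errors.New("not applicable")

// phase is one named step of the cycle; run is supplied by the caller.
type phase struct {
	name string
	run  func(lang string) error
}

// portAll runs every phase for one language before starting the next.
// It returns the languages that completed the cycle and those skipped,
// and stops at the first real failure.
func portAll(langs []string, phases []phase) (done, skipped []string, err error) {
	for _, lang := range langs {
		skip := false
		for _, p := range phases {
			if e := p.run(lang); e != nil {
				if errors.Is(e, errNotApplicable) {
					skip = true
					break
				}
				return done, skipped, fmt.Errorf("%s: %s: %w", lang, p.name, e)
			}
		}
		if skip {
			skipped = append(skipped, lang)
			continue
		}
		done = append(done, lang)
	}
	return done, skipped, nil
}

func main() {
	ok := func(string) error { return nil }
	phases := []phase{
		{"applicability", func(lang string) error {
			if lang == "rust" { // stand-in verdict for this demo only
				return errNotApplicable
			}
			return nil
		}},
		{"tests", ok},
		{"rule", ok},
		{"validation", ok},
	}
	done, skipped, err := portAll([]string{"go", "rust", "java"}, phases)
	fmt.Println(done, skipped, err)
}
```

Stopping at the first failure, rather than collecting errors across languages, follows the rule that errors should not be allowed to compound between variants.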
94+
95+
## Foundational Knowledge
96+
97+
**The `semgrep-rule-creator` skill is the authoritative reference for Semgrep rule creation fundamentals.** While this skill focuses on porting existing rules to new languages, the core principles of writing quality rules remain the same.
98+
99+
Consult `semgrep-rule-creator` for guidance on:
100+
- **When to use taint mode vs pattern matching** - Choosing the right approach for the vulnerability type
101+
- **Test-first methodology** - Why tests come before rules and how to write effective test cases
102+
- **Anti-patterns to avoid** - Common mistakes like overly broad or overly specific patterns
103+
- **Iterating until tests pass** - The validation loop and debugging techniques
104+
- **Rule optimization** - Removing redundant patterns after tests pass
105+
106+
When porting a rule, you're applying these same principles in a new language context. If uncertain about rule structure or approach, refer to `semgrep-rule-creator` first.
107+
108+
## Four-Phase Workflow
109+
110+
### Phase 1: Applicability Analysis
111+
112+
Before porting, determine if the pattern applies to the target language.
113+
114+
**Analysis criteria:**
115+
1. Does the vulnerability class exist in the target language?
116+
2. Does an equivalent construct exist (function, pattern, library)?
117+
3. Are the semantics similar enough for meaningful detection?
118+
119+
**Verdict options:**
120+
- `APPLICABLE` → Proceed with variant creation
121+
- `APPLICABLE_WITH_ADAPTATION` → Proceed, but significant changes are needed
122+
- `NOT_APPLICABLE` → Skip this language, document why
123+
124+
See [applicability-analysis.md]({baseDir}/references/applicability-analysis.md) for detailed guidance.
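One way to record a Phase 1 result is sketched below in Go. The three boolean fields correspond to the analysis criteria; `NeedsSignificantChanges`, the `Analysis` struct, and the mapping in `Decide` are assumptions of this sketch, since the text lists the verdicts without specifying how the criteria combine.

```go
package main

import "fmt"

// Verdict mirrors the three verdict options.
type Verdict string

const (
	Applicable               Verdict = "APPLICABLE"
	ApplicableWithAdaptation Verdict = "APPLICABLE_WITH_ADAPTATION"
	NotApplicable            Verdict = "NOT_APPLICABLE"
)

// Analysis records the answers to the three criteria, plus one extra
// field (NeedsSignificantChanges) that is an assumption of this sketch.
type Analysis struct {
	VulnClassExists         bool
	EquivalentConstruct     bool
	SemanticsSimilarEnough  bool
	NeedsSignificantChanges bool
	Reason                  string // documented when the language is skipped
}

// Decide maps an Analysis to a Verdict. The mapping is illustrative:
// any failed criterion yields NOT_APPLICABLE.
func Decide(a Analysis) Verdict {
	if !a.VulnClassExists || !a.EquivalentConstruct || !a.SemanticsSimilarEnough {
		return NotApplicable
	}
	if a.NeedsSignificantChanges {
		return ApplicableWithAdaptation
	}
	return Applicable
}

func main() {
	a := Analysis{
		VulnClassExists:         true,
		EquivalentConstruct:     true,
		SemanticsSimilarEnough:  true,
		NeedsSignificantChanges: true,
	}
	fmt.Println(Decide(a)) // APPLICABLE_WITH_ADAPTATION
}
```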
125+
126+
### Phase 2: Test Creation (Test-First)
127+
128+
**Always write tests before the rule.**
129+
130+
Create a test file that uses target-language idioms:
131+
- Minimum 2 vulnerable cases (`ruleid:`)
132+
- Minimum 2 safe cases (`ok:`)
133+
- Include language-specific edge cases
134+
135+
```go
136+
// ruleid: sql-injection-golang
137+
db.Query("SELECT * FROM users WHERE id = " + userInput)
138+
139+
// ok: sql-injection-golang
140+
db.Query("SELECT * FROM users WHERE id = ?", userInput)
141+
```
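A fuller test file meeting the two-vulnerable, two-safe minimum might look like the sketch below. The `queryer` interface and `recorder` type are hypothetical devices so the file compiles and runs without a database; they are not part of the skill. Which lines a ported rule should actually flag depends on the original rule's intent, so treat the annotations as an example rather than a specification.

```go
package main

import (
	"database/sql"
	"fmt"
)

// queryer is the subset of *sql.DB used here, so the file runs without
// a real database. It is a device of this sketch.
type queryer interface {
	Query(query string, args ...any) (*sql.Rows, error)
}

// recorder is a hypothetical stand-in that only remembers the last query.
type recorder struct{ last string }

func (r *recorder) Query(q string, _ ...any) (*sql.Rows, error) {
	r.last = q
	return nil, nil
}

func byIDConcat(db queryer, userInput string) error {
	// ruleid: sql-injection-golang
	_, err := db.Query("SELECT * FROM users WHERE id = " + userInput)
	return err
}

func byNameSprintf(db queryer, userInput string) error {
	query := fmt.Sprintf("SELECT * FROM users WHERE name = '%s'", userInput)
	// ruleid: sql-injection-golang
	_, err := db.Query(query)
	return err
}

func byIDParam(db queryer, userInput string) error {
	// ok: sql-injection-golang
	_, err := db.Query("SELECT * FROM users WHERE id = ?", userInput)
	return err
}

func listAll(db queryer) error {
	// ok: sql-injection-golang
	_, err := db.Query("SELECT * FROM users")
	return err
}

func main() {
	r := &recorder{}
	_ = byIDConcat(r, "1 OR 1=1")
	fmt.Println(r.last) // the input has become part of the SQL text
}
```

The `fmt.Sprintf` case is the Go-specific edge case here: the tainted string reaches `db.Query` through an intermediate variable, which a purely syntactic pattern on the call site could miss.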
142+
143+
### Phase 3: Rule Creation
144+
145+
1. **Analyze AST**: `semgrep --dump-ast -l <lang> test-file`
146+
2. **Translate patterns** to target language syntax
147+
3. **Update metadata**: language key, message, rule ID
148+
4. **Adapt for idioms**: Handle language-specific constructs
149+
150+
See [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md) for translation guidance.
151+
152+
### Phase 4: Validation
153+
154+
```bash
155+
# Validate YAML
156+
semgrep --validate --config rule.yaml
157+
158+
# Run tests
159+
semgrep --test --config rule.yaml test-file
160+
```
161+
162+
**Checkpoint**: Output MUST show `All tests passed`.
163+
164+
For taint rule debugging:
165+
```bash
166+
semgrep --dataflow-traces -f rule.yaml test-file
167+
```
168+
169+
See [workflow.md]({baseDir}/references/workflow.md) for detailed workflow and troubleshooting.
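If the validation step needs to be scripted, a wrapper along these lines is one option. It is a sketch under stated assumptions: the helper names are hypothetical, `semgrep` is assumed to be on PATH, and matching on the literal string `All tests passed` simply follows the checkpoint above rather than any documented output contract.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// validateArgs and testArgs build the argument lists for the two
// commands used in Phase 4.
func validateArgs(rule string) []string {
	return []string{"--validate", "--config", rule}
}

func testArgs(rule, testFile string) []string {
	return []string{"--test", "--config", rule, testFile}
}

// checkpointMet reports whether output contains the success line the
// checkpoint names. Matching on that exact string is an assumption.
func checkpointMet(output string) bool {
	return strings.Contains(output, "All tests passed")
}

// validateVariant runs both commands and enforces the checkpoint.
func validateVariant(rule, testFile string) error {
	if out, err := exec.Command("semgrep", validateArgs(rule)...).CombinedOutput(); err != nil {
		return fmt.Errorf("validate failed: %w\n%s", err, out)
	}
	out, err := exec.Command("semgrep", testArgs(rule, testFile)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("tests failed: %w\n%s", err, out)
	}
	if !checkpointMet(string(out)) {
		return fmt.Errorf("checkpoint not met:\n%s", out)
	}
	return nil
}

func main() {
	// File names follow the output layout used in this document.
	err := validateVariant("sql-injection-golang.yaml", "sql-injection-golang.go")
	fmt.Println(err)
}
```

Running validation before the tests keeps YAML errors from being mistaken for pattern failures.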
170+
171+
## Quick Reference
172+
173+
| Task | Command |
174+
|------|---------|
175+
| Run tests | `semgrep --test --config rule.yaml test-file` |
176+
| Validate YAML | `semgrep --validate --config rule.yaml` |
177+
| Dump AST | `semgrep --dump-ast -l <lang> <file>` |
178+
| Debug taint flow | `semgrep --dataflow-traces -f rule.yaml file` |
179+
180+
181+
## Key Differences from Rule Creation
182+
183+
| Aspect | semgrep-rule-creator | This skill |
184+
|--------|---------------------|------------|
185+
| Input | Bug pattern description | Existing rule + target languages |
186+
| Output | Single rule+test | Multiple rule+test directories |
187+
| Workflow | Single creation cycle | Independent cycle per language |
188+
| Phase 1 | Problem analysis | Applicability analysis per language |
189+
| Library research | Always relevant | Optional (when original uses libraries) |
190+
191+
## Documentation
192+
193+
**REQUIRED**: Before porting rules, read relevant Semgrep documentation:
194+
195+
- [Rule Syntax](https://semgrep.dev/docs/writing-rules/rule-syntax) - YAML structure and operators
196+
- [Pattern Syntax](https://semgrep.dev/docs/writing-rules/pattern-syntax) - Pattern matching and metavariables
197+
- [Pattern Examples](https://semgrep.dev/docs/writing-rules/pattern-examples) - Per-language pattern references
198+
- [Testing Rules](https://semgrep.dev/docs/writing-rules/testing-rules) - Testing annotations
199+
- [Trail of Bits Testing Handbook](https://appsec.guide/docs/static-analysis/semgrep/advanced/) - Advanced patterns
200+
201+
## Next Steps
202+
203+
- For applicability analysis guidance, see [applicability-analysis.md]({baseDir}/references/applicability-analysis.md)
204+
- For language translation guidance, see [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md)
205+
- For detailed workflow and examples, see [workflow.md]({baseDir}/references/workflow.md)
