-
Notifications
You must be signed in to change notification settings - Fork 424
perf: fix +320% regression in CompileComplexWorkflow by eliminating redundant yaml.Unmarshal #40662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
b1d2f01
2ec897a
f09f481
347fb15
14c53a0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| # ADR-40662: Use Targeted Text Scans to Gate yaml.Unmarshal in Template-Injection Validation | ||
|
|
||
| **Date**: 2026-06-21 | ||
| **Status**: Accepted | ||
|
|
||
| ## Context | ||
|
|
||
| `BenchmarkCompileComplexWorkflow` regressed ~320% (from ~3ms/op to ~12.7ms/op). CPU profiling traced the cost to `validateTemplateInjection` Path B — the code path taken whenever `skipValidation=true`, which is the default in `NewCompiler()`. That path used `hasAnyExpressionInRunContent`, a pre-check that matched **any** `${{ }}` expression in a `run:` block, including compiler-owned references such as `${{ runner.temp }}`. Because compiled workflows almost always contain such safe expressions, the pre-check effectively always returned true and unconditionally triggered a full `yaml.Unmarshal` of the ~1200-line compiled YAML on every compilation, even when no injection or regression violation was possible. | ||
|
|
||
| ## Decision | ||
|
|
||
| We will replace the broad any-expression pre-check in `validateTemplateInjection` Path B with two targeted text scans that map one-to-one onto the two downstream sub-validators, and only call `yaml.Unmarshal` when at least one scan signals a potential issue. `needsUnsafeCheck` uses `hasExpressionInRunContent(UnsafeContextPattern)` to detect user-controlled contexts (template-injection risk), and `needsDisallowedCheck` uses the new `hasNonAllowedExpressionInRunContent` to detect expressions outside the compiler-owned allow-list (`allowedRunScriptExpressionRegex`). Both scans share the same raw-YAML `run:` walker, including flow-style and quoted-key support, so they stay aligned with each other and with the parsed-YAML validators. For well-formed workflows both scans return false and the YAML re-parse is skipped entirely. | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| ### Alternative 1: Reuse the schema-validation parsed tree in Path B | ||
| We could enable schema validation (or share its parsed `map[string]any`) so Path B never re-parses. This was rejected because `skipValidation=true` is the intentional default fast path; forcing a parse there would reintroduce the very `yaml.Unmarshal` cost this change eliminates and change compiler semantics for callers that deliberately skip schema checks. | ||
|
|
||
| ### Alternative 2: Narrow `hasAnyExpressionInRunContent` to ignore allowed expressions only | ||
| We could keep a single combined pre-check that skips allow-listed expressions but still gate both validators together. This was rejected because the two validators have genuinely different trigger conditions (unsafe-context vs. non-allowed-expression); collapsing them would run one validator unnecessarily whenever the other's condition is met, and splitting the scans keeps each sub-validator's cost proportional to its own risk signal. | ||
|
|
||
| ## Consequences | ||
|
|
||
| ### Positive | ||
| - `BenchmarkCompileComplexWorkflow` drops from ~6.8ms/op to ~1.64ms/op and allocations from 98,282 to 10,894 per op. | ||
| - `yaml.Unmarshal` is skipped entirely in the common case (well-formed compiled workflows with only compiler-owned expressions). | ||
| - The two scans now correspond directly to the two sub-validators, making the trigger logic easier to reason about. | ||
|
|
||
| ### Negative | ||
| - The scan logic is more complex: the shared raw-YAML `run:` walker must stay consistent with the YAML parser's view of the same structure, including block scalars, flow-style mappings, and quoted keys. | ||
| - Two regexes (`UnsafeContextPattern`, `allowedRunScriptExpressionRegex`) plus block detection now encode security-relevant gating; a bug that wrongly returns false would silently skip a security validator. | ||
|
|
||
| ### Neutral | ||
| - Behavior is unchanged when a violation is actually present — the YAML is still parsed and the same validators run. | ||
| - `allowedRunScriptExpressionRegex` becomes part of the performance-critical fast path; extending the compiler's allow-list now also affects which expressions can skip the parse. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -145,11 +145,9 @@ func (c *Compiler) generateAndValidateYAML(workflowData *WorkflowData, markdownP | |
| // result between the two validators to avoid redundant yaml.Unmarshal calls. | ||
| // | ||
| // Performance note: when schema validation is enabled (needsSchemaCheck=true) the | ||
| // YAML is parsed regardless. hasUnsafeExpressionInRunContent performs an expensive | ||
| // text scan (regex + strings.Split + full line walk) that would be redundant in that | ||
| // path; we skip it and reuse the pre-parsed result for template injection instead. | ||
| // The text scan is only used when schema validation is disabled (skipValidation=true), | ||
| // where it avoids an otherwise unnecessary yaml.Unmarshal call. | ||
| // YAML is parsed regardless. The text scan in validateTemplateInjection is only | ||
| // used when schema validation is disabled (skipValidation=true), where targeted | ||
| // fast-path checks avoid an unnecessary yaml.Unmarshal. | ||
| needsSchemaCheck := !c.skipValidation | ||
|
|
||
| var parsedWorkflow map[string]any | ||
|
|
@@ -169,8 +167,8 @@ func (c *Compiler) generateAndValidateYAML(workflowData *WorkflowData, markdownP | |
| // parsedWorkflow != nil means the YAML was already parsed for schema validation; | ||
| // validateTemplateInjection reuses the pre-parsed tree (inspects only run: block values) | ||
| // rather than re-scanning the full YAML string. When parsedWorkflow is nil (schema | ||
| // validation disabled), the lightweight hasUnsafeExpressionInRunContent text scan is | ||
| // used first to avoid an unnecessary yaml.Unmarshal. | ||
| // validation disabled), the validator uses targeted text scans to avoid an unnecessary | ||
| // yaml.Unmarshal when all run-block expressions are in the compiler-owned allowed list. | ||
| if err := c.validateTemplateInjection(yamlContent, lockFile, markdownPath, parsedWorkflow); err != nil { | ||
| return "", nil, nil, err | ||
| } | ||
|
|
@@ -304,11 +302,20 @@ func (c *Compiler) writeWorkflowOutput(lockFile, yamlContent string, markdownPat | |
| // this function reuses it directly by walking the run: block values in the pre-parsed | ||
| // tree, which is faster than re-scanning the full YAML string with a regex. | ||
| // | ||
| // When parsedWorkflow is nil (schema validation disabled via skipValidation), the | ||
| // function first uses the lightweight hasAnyExpressionInRunContent text scan | ||
| // to avoid an unnecessary yaml.Unmarshal call. When the scan detects expressions | ||
| // in run blocks, the YAML is parsed with github.com/goccy/go-yaml for consistency | ||
| // with the parsed-workflow validators. | ||
| // When parsedWorkflow is nil (schema validation disabled via skipValidation), this | ||
| // function uses two targeted text scans instead of a broad any-expression check: | ||
| // | ||
| // 1. hasExpressionInRunContent(UnsafeContextPattern) — detects user-controlled | ||
| // contexts (github.event.*, steps.*.outputs.*, inputs.*) that represent genuine | ||
| // template-injection risks. | ||
| // | ||
| // 2. hasNonAllowedExpressionInRunContent — detects compiler-regression cases where | ||
| // a ${{ ... }} expression that is NOT in the compiler-owned allow-list slipped | ||
| // into a run: block. | ||
| // | ||
| // For well-formed compiled workflows (the common case) both scans return false, so | ||
| // no yaml.Unmarshal is needed at all. A yaml.Unmarshal is only performed when at | ||
| // least one scan detects a potential issue. | ||
| func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath string, parsedWorkflow map[string]any) error { | ||
| var templateErr error | ||
|
|
||
|
|
@@ -323,10 +330,20 @@ func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath | |
| } | ||
| } else { | ||
| // Path B: schema validation is disabled (parsedWorkflow is nil). | ||
| // Use the text scan to cheaply determine whether any expressions (safe or | ||
| // unsafe) appear inside a run: block before paying the cost of a full | ||
| // yaml.Unmarshal for layered security + regression validation. | ||
| if hasAnyExpressionInRunContent(yamlContent) { | ||
| // | ||
| // Use two targeted text scans with a shared run-block walker so we can skip the | ||
| // expensive yaml.Unmarshal when all run-block expressions are compiler-owned safe | ||
| // references (e.g. ${{ runner.temp }}, ${{ env.FOO }}). | ||
| // | ||
| // needsUnsafeCheck: true when user-controlled contexts appear in run: blocks | ||
| // → triggers validateNoTemplateInjectionFromParsed | ||
| // needsDisallowedCheck: true when any expression outside the compiler allow-list | ||
| // appears in run: blocks | ||
| // → triggers validateNoGitHubExpressionsInRunScriptsFromParsed | ||
| needsUnsafeCheck := hasExpressionInRunContent(yamlContent, UnsafeContextPattern) | ||
| needsDisallowedCheck := hasNonAllowedExpressionInRunContent(yamlContent) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Behavioral regression: disallowed expressions in undetected run blocks can now slip through validation entirely. The old pre-check ( The new pre-check only fires when a disallowed expression appears in a text-detected run block. If a workflow has only allowed expressions in block-style steps plus a disallowed expression in a step the scanner does not detect, 💡 Concrete regression scenario# flow-style step: text scanner misses it
- { run: "echo ${{ github.actor }}" } # disallowed, not in allow-list
# block-style step: scanner detects it, but expression is allowed
- run: node ${{ runner.temp }}/foo.cjsOld code ( New code: Suggested fix: keep if hasExpressionInRunContent(yamlContent, InlineExpressionPattern) {
var reparsed map[string]any
if err := yaml.Unmarshal(...); err == nil && reparsed != nil {
if hasExpressionInRunContent(yamlContent, UnsafeContextPattern) {
templateErr = validateNoTemplateInjectionFromParsed(reparsed)
}
if templateErr == nil && hasNonAllowedExpressionInRunContent(yamlContent) {
templateErr = validateNoGitHubExpressionsInRunScriptsFromParsed(reparsed)
}
}
}This skips |
||
|
|
||
| if needsUnsafeCheck || needsDisallowedCheck { | ||
| workflowLog.Print("Validating for template injection vulnerabilities") | ||
| var reparsed map[string]any | ||
| if err := yaml.Unmarshal([]byte(yamlContent), &reparsed); err != nil { | ||
|
|
@@ -335,8 +352,12 @@ func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath | |
| reparsed = nil | ||
| } | ||
| if reparsed != nil { | ||
| templateErr = validateNoTemplateInjectionFromParsed(reparsed) | ||
| if templateErr == nil { | ||
| // validateNoTemplateInjectionFromParsed only catches user-controlled contexts, | ||
| // so it is intentionally skipped when the UnsafeContextPattern scan found none. | ||
| if needsUnsafeCheck { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/diagnose] 💡 Suggested comment// validateNoTemplateInjectionFromParsed is intentionally skipped when
// needsUnsafeCheck=false: the validator only catches user-controlled contexts,
// and the UnsafeContextPattern scan above has already confirmed none are present.
if needsUnsafeCheck {
templateErr = validateNoTemplateInjectionFromParsed(reparsed)
} |
||
| templateErr = validateNoTemplateInjectionFromParsed(reparsed) | ||
| } | ||
| if templateErr == nil && needsDisallowedCheck { | ||
| templateErr = validateNoGitHubExpressionsInRunScriptsFromParsed(reparsed) | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[/tdd] The new Path B logic (both-scans-false → skip
yaml.Unmarshal) is the core of this performance fix, but there is no compiler-level integration test that exercises this fast path end-to-end. The new unit tests coverhasNonAllowedExpressionInRunContentin isolation; a test at thevalidateTemplateInjectionorCompilelevel that feeds a workflow containing only${{ runner.temp }}and asserts no error would anchor the behaviour and catch future regressions at the seam.💡 Sketch