35 changes: 35 additions & 0 deletions docs/adr/40662-targeted-text-scans-before-yaml-unmarshal.md
@@ -0,0 +1,35 @@
# ADR-40662: Use Targeted Text Scans to Gate yaml.Unmarshal in Template-Injection Validation

**Date**: 2026-06-21
**Status**: Accepted

## Context

`BenchmarkCompileComplexWorkflow` regressed ~320% (from ~3ms/op to ~12.7ms/op). CPU profiling traced the cost to `validateTemplateInjection` Path B — the code path taken whenever `skipValidation=true`, which is the default in `NewCompiler()`. That path used `hasAnyExpressionInRunContent`, a pre-check that matched **any** `${{ }}` expression in a `run:` block, including compiler-owned references such as `${{ runner.temp }}`. Because compiled workflows almost always contain such safe expressions, the pre-check effectively always returned true and unconditionally triggered a full `yaml.Unmarshal` of the ~1200-line compiled YAML on every compilation, even when no injection or regression violation was possible.

## Decision

We will replace the broad any-expression pre-check in `validateTemplateInjection` Path B with two targeted text scans that map one-to-one onto the two downstream sub-validators, and only call `yaml.Unmarshal` when at least one scan signals a potential issue. `needsUnsafeCheck` uses `hasExpressionInRunContent(UnsafeContextPattern)` to detect user-controlled contexts (template-injection risk), and `needsDisallowedCheck` uses the new `hasNonAllowedExpressionInRunContent` to detect expressions outside the compiler-owned allow-list (`allowedRunScriptExpressionRegex`). Both scans share the same raw-YAML `run:` walker, including flow-style and quoted-key support, so they stay aligned with each other and with the parsed-YAML validators. For well-formed workflows both scans return false and the YAML re-parse is skipped entirely.
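
The gating described above can be sketched as a small truth table. This is a minimal illustration rather than the compiler's code: `gate` and `gateResult` are hypothetical names, and the two booleans stand in for the results of `hasExpressionInRunContent(UnsafeContextPattern)` and `hasNonAllowedExpressionInRunContent`.

```go
package main

import "fmt"

// gateResult records what Path B would do for one combination of scan results.
// Hypothetical type, used only for this illustration.
type gateResult struct {
	parsed        bool // yaml.Unmarshal would be called
	ranUnsafe     bool // the template-injection validator would run
	ranDisallowed bool // the allow-list (regression) validator would run
}

// gate mirrors the decision: parse only when at least one scan fires, then
// run only the validator whose scan fired.
func gate(needsUnsafeCheck, needsDisallowedCheck bool) gateResult {
	if !needsUnsafeCheck && !needsDisallowedCheck {
		return gateResult{} // fast path: no parse, no validators
	}
	return gateResult{
		parsed:        true,
		ranUnsafe:     needsUnsafeCheck,
		ranDisallowed: needsDisallowedCheck,
	}
}

func main() {
	for _, needsUnsafe := range []bool{false, true} {
		for _, needsDisallowed := range []bool{false, true} {
			fmt.Printf("unsafe=%-5v disallowed=%-5v -> %+v\n",
				needsUnsafe, needsDisallowed, gate(needsUnsafe, needsDisallowed))
		}
	}
}
```

The sketch leaves out one detail of the real function: the allow-list validator is only reached when the injection validator returned no error.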

## Alternatives Considered

### Alternative 1: Reuse the schema-validation parsed tree in Path B
We could enable schema validation (or share its parsed `map[string]any`) so Path B never re-parses. This was rejected because `skipValidation=true` is the intentional default fast path; forcing a parse there would reintroduce the very `yaml.Unmarshal` cost this change eliminates and change compiler semantics for callers that deliberately skip schema checks.

### Alternative 2: Narrow `hasAnyExpressionInRunContent` to ignore allowed expressions only
We could keep a single combined pre-check that skips allow-listed expressions but still gates both validators together. This was rejected because the two validators have genuinely different trigger conditions (unsafe-context vs. non-allowed-expression); collapsing them would run one validator unnecessarily whenever the other's condition is met, and splitting the scans keeps each sub-validator's cost proportional to its own risk signal.

## Consequences

### Positive
- `BenchmarkCompileComplexWorkflow` drops from ~6.8ms/op to ~1.64ms/op and allocations from 98,282 to 10,894 per op.
- `yaml.Unmarshal` is skipped entirely in the common case (well-formed compiled workflows with only compiler-owned expressions).
- The two scans now correspond directly to the two sub-validators, making the trigger logic easier to reason about.

### Negative
- The scan logic is more complex: the shared raw-YAML `run:` walker must stay consistent with the YAML parser's view of the same structure, including block scalars, flow-style mappings, and quoted keys.
- Two regexes (`UnsafeContextPattern`, `allowedRunScriptExpressionRegex`) plus block detection now encode security-relevant gating; a bug that wrongly returns false would silently skip a security validator.

### Neutral
- Behavior is unchanged when a violation is actually present — the YAML is still parsed and the same validators run.
- `allowedRunScriptExpressionRegex` becomes part of the performance-critical fast path; extending the compiler's allow-list now also affects which expressions can skip the parse.
57 changes: 39 additions & 18 deletions pkg/workflow/compiler.go
@@ -145,11 +145,9 @@ func (c *Compiler) generateAndValidateYAML(workflowData *WorkflowData, markdownP
// result between the two validators to avoid redundant yaml.Unmarshal calls.
//
// Performance note: when schema validation is enabled (needsSchemaCheck=true) the
// YAML is parsed regardless. hasUnsafeExpressionInRunContent performs an expensive
// text scan (regex + strings.Split + full line walk) that would be redundant in that
// path; we skip it and reuse the pre-parsed result for template injection instead.
// The text scan is only used when schema validation is disabled (skipValidation=true),
// where it avoids an otherwise unnecessary yaml.Unmarshal call.
// YAML is parsed regardless. The text scan in validateTemplateInjection is only
// used when schema validation is disabled (skipValidation=true), where targeted
// fast-path checks avoid an unnecessary yaml.Unmarshal.
needsSchemaCheck := !c.skipValidation

var parsedWorkflow map[string]any
@@ -169,8 +167,8 @@ func (c *Compiler) generateAndValidateYAML(workflowData *WorkflowData, markdownP
// parsedWorkflow != nil means the YAML was already parsed for schema validation;
// validateTemplateInjection reuses the pre-parsed tree (inspects only run: block values)
// rather than re-scanning the full YAML string. When parsedWorkflow is nil (schema
// validation disabled), the lightweight hasUnsafeExpressionInRunContent text scan is
// used first to avoid an unnecessary yaml.Unmarshal.
// validation disabled), the validator uses targeted text scans to avoid an unnecessary
// yaml.Unmarshal when all run-block expressions are in the compiler-owned allowed list.
if err := c.validateTemplateInjection(yamlContent, lockFile, markdownPath, parsedWorkflow); err != nil {
return "", nil, nil, err
}
@@ -304,11 +302,20 @@ func (c *Compiler) writeWorkflowOutput(lockFile, yamlContent string, markdownPat
// this function reuses it directly by walking the run: block values in the pre-parsed
// tree, which is faster than re-scanning the full YAML string with a regex.
//
// When parsedWorkflow is nil (schema validation disabled via skipValidation), the
// function first uses the lightweight hasAnyExpressionInRunContent text scan
// to avoid an unnecessary yaml.Unmarshal call. When the scan detects expressions
// in run blocks, the YAML is parsed with github.com/goccy/go-yaml for consistency
// with the parsed-workflow validators.
// When parsedWorkflow is nil (schema validation disabled via skipValidation), this
// function uses two targeted text scans instead of a broad any-expression check:
//
// 1. hasExpressionInRunContent(UnsafeContextPattern) — detects user-controlled
// contexts (github.event.*, steps.*.outputs.*, inputs.*) that represent genuine
// template-injection risks.
//
// 2. hasNonAllowedExpressionInRunContent — detects compiler-regression cases where
// a ${{ ... }} expression that is NOT in the compiler-owned allow-list slipped
// into a run: block.
//
// For well-formed compiled workflows (the common case) both scans return false, so
// no yaml.Unmarshal is needed at all. A yaml.Unmarshal is only performed when at
// least one scan detects a potential issue.
func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath string, parsedWorkflow map[string]any) error {
var templateErr error

@@ -323,10 +330,20 @@ func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath
}
} else {
// Path B: schema validation is disabled (parsedWorkflow is nil).
// Use the text scan to cheaply determine whether any expressions (safe or
// unsafe) appear inside a run: block before paying the cost of a full
// yaml.Unmarshal for layered security + regression validation.
if hasAnyExpressionInRunContent(yamlContent) {
//
// Use two targeted text scans with a shared run-block walker so we can skip the
// expensive yaml.Unmarshal when all run-block expressions are compiler-owned safe
// references (e.g. ${{ runner.temp }}, ${{ env.FOO }}).
//
// needsUnsafeCheck: true when user-controlled contexts appear in run: blocks
// → triggers validateNoTemplateInjectionFromParsed
// needsDisallowedCheck: true when any expression outside the compiler allow-list
// appears in run: blocks
// → triggers validateNoGitHubExpressionsInRunScriptsFromParsed
needsUnsafeCheck := hasExpressionInRunContent(yamlContent, UnsafeContextPattern)

Contributor

[/tdd] The new Path B logic (both-scans-false → skip yaml.Unmarshal) is the core of this performance fix, but there is no compiler-level integration test that exercises this fast path end-to-end. The new unit tests cover hasNonAllowedExpressionInRunContent in isolation; a test at the validateTemplateInjection or Compile level that feeds a workflow containing only ${{ runner.temp }} and asserts no error would anchor the behaviour and catch future regressions at the seam.

💡 Sketch
// Verify that a compiled workflow with only compiler-owned expressions
// in run blocks does not trigger a yaml.Unmarshal error path.
func TestValidateTemplateInjection_AllowedExpressionsOnly(t *testing.T) {
    yamlWithAllowed := `on: push
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - run: node ${{ runner.temp }}/foo.cjs
`
    c := NewCompiler()
    err := c.validateTemplateInjection(yamlWithAllowed, "", "test.md", nil)
    assert.NoError(t, err)
}

needsDisallowedCheck := hasNonAllowedExpressionInRunContent(yamlContent)

Contributor

Behavioral regression: disallowed expressions in undetected run blocks can now slip through validation entirely.

The old pre-check (hasAnyExpressionInRunContent) triggered yaml.Unmarshal + both validators whenever any ${{ }} appeared in a text-detected run block—including allowed ones like ${{ runner.temp }}. That side-effect also caused validateNoGitHubExpressionsInRunScriptsFromParsed to inspect all run blocks from the parsed YAML, including flow-style steps and quoted-key steps that the text scanner misses.

The new pre-check only fires when a disallowed expression appears in a text-detected run block. If a workflow has only allowed expressions in block-style steps plus a disallowed expression in a step the scanner does not detect, needsDisallowedCheck stays false, no yaml.Unmarshal is performed, and validateNoGitHubExpressionsInRunScriptsFromParsed is silently skipped.

💡 Concrete regression scenario
# flow-style step: text scanner misses it
- { run: "echo ${{ github.actor }}" }  # disallowed, not in allow-list
# block-style step: scanner detects it, but expression is allowed
- run: node ${{ runner.temp }}/foo.cjs

Old code (hasAnyExpressionInRunContent): finds ${{ runner.temp }} in the block-style step → true → yaml.Unmarshal → parsed validator sees both steps → flags ${{ github.actor }}.

New code: hasNonAllowedExpressionInRunContent finds ${{ runner.temp }} (allowed, skip); misses the flow-style step entirely → needsDisallowedCheck = false → no yaml.Unmarshal → ${{ github.actor }} passes undetected.

Suggested fix: keep InlineExpressionPattern as the yaml.Unmarshal gate (preserving the original trigger), and use the two targeted flags only to decide which validators to call inside the already-parsed block:

if hasExpressionInRunContent(yamlContent, InlineExpressionPattern) {
    var reparsed map[string]any
    if err := yaml.Unmarshal(...); err == nil && reparsed != nil {
        if hasExpressionInRunContent(yamlContent, UnsafeContextPattern) {
            templateErr = validateNoTemplateInjectionFromParsed(reparsed)
        }
        if templateErr == nil && hasNonAllowedExpressionInRunContent(yamlContent) {
            templateErr = validateNoGitHubExpressionsInRunScriptsFromParsed(reparsed)
        }
    }
}

This skips yaml.Unmarshal when there are no expressions at all in detectable run blocks (the dominant fast path) while preserving the original validation safety net.
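
The detection gap described in this comment can be reproduced on single lines. The sketch below is an illustration under stated assumptions: `prefixDetectsRun` mirrors the `strings.HasPrefix(keyPart, "run:")` check visible in the removed code, `regexDetectsRun` uses the `runKeyPattern` regex as it appears in this PR's diff, and both helper names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// runKeyPattern is copied from the diff of template_injection_validation.go.
var runKeyPattern = regexp.MustCompile(`(?:^|[\s{,])(?:run|["']run["']):`)

// keyPart strips indentation and a leading sequence dash, as the walker does.
func keyPart(line string) string {
	k := strings.TrimLeft(line, " \t")
	if strings.HasPrefix(k, "-") {
		k = strings.TrimSpace(k[1:])
	}
	return k
}

// prefixDetectsRun mirrors the older prefix-only check (hypothetical helper).
func prefixDetectsRun(line string) bool {
	return strings.HasPrefix(keyPart(line), "run:")
}

// regexDetectsRun mirrors the regex-based check (hypothetical helper).
func regexDetectsRun(line string) bool {
	return runKeyPattern.MatchString(keyPart(line))
}

func main() {
	lines := []string{
		`      - run: node ${{ runner.temp }}/foo.cjs`,   // block-style step
		`      - { run: "echo ${{ github.actor }}" }`,    // flow-style step
		`      - "run": echo ${{ github.actor }}`,        // quoted key
	}
	for _, l := range lines {
		fmt.Printf("prefix=%-5v regex=%-5v %s\n", prefixDetectsRun(l), regexDetectsRun(l), strings.TrimSpace(l))
	}
}
```

Under these assumptions the prefix check sees only the block-style step, while the regex also recognizes the flow-style and quoted-key forms.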


if needsUnsafeCheck || needsDisallowedCheck {
workflowLog.Print("Validating for template injection vulnerabilities")
var reparsed map[string]any
if err := yaml.Unmarshal([]byte(yamlContent), &reparsed); err != nil {
@@ -335,8 +352,12 @@ func (c *Compiler) validateTemplateInjection(yamlContent, lockFile, markdownPath
reparsed = nil
}
if reparsed != nil {
templateErr = validateNoTemplateInjectionFromParsed(reparsed)
if templateErr == nil {
// validateNoTemplateInjectionFromParsed only catches user-controlled contexts,
// so it is intentionally skipped when the UnsafeContextPattern scan found none.
if needsUnsafeCheck {

Contributor

[/diagnose] validateNoTemplateInjectionFromParsed is now conditionally skipped when needsUnsafeCheck=false, which is a behaviour change from the old code (which called both validators whenever any expression existed). The reasoning is sound — the injection validator only catches user-controlled contexts, and if UnsafeContextPattern found nothing there's nothing to inject — but this invariant is implicit. A short comment would help future reviewers confirm no security check was accidentally dropped.

💡 Suggested comment
// validateNoTemplateInjectionFromParsed is intentionally skipped when
// needsUnsafeCheck=false: the validator only catches user-controlled contexts,
// and the UnsafeContextPattern scan above has already confirmed none are present.
if needsUnsafeCheck {
    templateErr = validateNoTemplateInjectionFromParsed(reparsed)
}

templateErr = validateNoTemplateInjectionFromParsed(reparsed)
}
if templateErr == nil && needsDisallowedCheck {
templateErr = validateNoGitHubExpressionsInRunScriptsFromParsed(reparsed)
}
}
18 changes: 18 additions & 0 deletions pkg/workflow/compiler_template_injection_both_paths_test.go
@@ -70,6 +70,17 @@ jobs:
run: echo "${{ job.services['redis'].ports['6379'] }}"
`

flowStyleRegressionYAML := `
name: flow-style-regression
on: workflow_dispatch
jobs:
test:
runs-on: ubuntu-latest
steps:
- { run: "echo ${{ github.actor }}" }
- run: node ${{ runner.temp }}/actions/foo.cjs
`

heredocExpressionYAML := `
name: heredoc-expression
on: workflow_dispatch
@@ -159,6 +170,13 @@ jobs:
assert.NoError(t, err, "compiler-owned job.services expression should be allowed")
})

t.Run("Path B - schema disabled - flow-style regression detected alongside allowed expression", func(t *testing.T) {
err := compiler.validateTemplateInjection(flowStyleRegressionYAML, lockFile, markdownPath, nil)
require.Error(t, err, "flow-style run expressions should still be detected in the fallback path")
assert.Contains(t, err.Error(), "compiler regression detected")
assert.Contains(t, err.Error(), "github.actor")
})

t.Run("Path A - schema enabled - heredoc expression passes", func(t *testing.T) {
err := compiler.validateTemplateInjection(heredocExpressionYAML, lockFile, markdownPath, parseYAML(t, heredocExpressionYAML))
assert.NoError(t, err, "expressions inside heredoc content should not be flagged")
93 changes: 58 additions & 35 deletions pkg/workflow/template_injection_validation.go
@@ -59,80 +59,103 @@ var (
// allowedRunScriptExpressionRegex matches trusted compiler-owned expressions that are
// intentionally rendered in generated run scripts and are not user-controlled.
allowedRunScriptExpressionRegex = regexp.MustCompile(`^\$\{\{\s*(env\.[^}]+|vars\.[^}]+|runner\.[^}]+|github\.(repository|run_id|workspace)|steps\.parse-guard-vars\.outputs\.(approval_labels|blocked_users|trusted_users)|job\.services\[[^]]+\]\.ports\[[^]]+\])\s*\}\}$`)
runKeyPattern = regexp.MustCompile(`(?:^|[\s{,])(?:run|["']run["']):`)
)

// hasAnyExpressionInRunContent performs a fast line-by-line text scan to determine
// whether any GitHub Actions expression (${{ ... }}) appears inside a YAML run: block.
// Used by the compiler regression guardrail to detect expressions that should have
// been rewritten to env variables.
func hasAnyExpressionInRunContent(yamlContent string) bool {
return hasExpressionInRunContent(yamlContent, InlineExpressionPattern)
}

func hasExpressionInRunContent(yamlContent string, expressionRegex *regexp.Regexp) bool {
// Fast-path: no matching expressions anywhere → definitely no violation.
if !expressionRegex.MatchString(yamlContent) {
return false
func findRunValue(keyPart string) (string, bool) {
loc := runKeyPattern.FindStringIndex(keyPart)
if loc == nil {
return "", false
}
return strings.TrimSpace(keyPart[loc[1]:]), true
}

// Matching expressions exist somewhere; scan for any that appear inside a run: block
// without doing a full YAML parse.
// Use SplitSeq to iterate over lines lazily, avoiding the up-front allocation of the
// full []string slice that strings.Split would create for large YAML content.
// walkRunBlockLines scans raw YAML text and visits each inline run value or line inside a
// multiline run block. It recognizes plain run: keys as well as quoted and flow-style forms
// so Path B stays aligned with the parsed-YAML validators.
func walkRunBlockLines(yamlContent string, visit func(line string) bool) bool {
inRunBlock := false
runBlockIndent := 0

for line := range strings.SplitSeq(yamlContent, "\n") {
// Compute indentation first; skip blank and all-whitespace lines in one step.
trimmed := strings.TrimLeft(line, " \t")
if trimmed == "" {
// Blank / all-whitespace lines are allowed inside block scalars.
continue
}
indent := len(line) - len(trimmed)

if inRunBlock {
// A non-blank line at the same or lesser indentation ends the block.
if indent <= runBlockIndent {
inRunBlock = false
// Fall through: check whether this line starts a new run: block.
} else {
// Inside run block content — check for matching expressions.
if expressionRegex.MatchString(line) {
if visit(line) {
return true
}
continue
}
}

// Outside a run block: look for a run: key.
// Handle both "run: ..." (map key) and "- run: ..." (inline sequence item).
keyPart := trimmed
if strings.HasPrefix(keyPart, "-") {
keyPart = strings.TrimSpace(keyPart[1:])
}
if !strings.HasPrefix(keyPart, "run:") {

rest, ok := findRunValue(keyPart)
if !ok {
continue
}
rest := strings.TrimSpace(keyPart[4:]) // text after "run:"

if rest == "" {
// Empty run: value is unusual; treat conservatively as if block content follows.
inRunBlock = true
runBlockIndent = indent
} else if rest[0] == '|' || rest[0] == '>' {
// Literal or folded block scalar — content is on subsequent lines.
if rest == "" || rest[0] == '|' || rest[0] == '>' {
inRunBlock = true
runBlockIndent = indent
} else {
// Inline run value, e.g. run: echo "hello ${{ github.event.foo }}".
if expressionRegex.MatchString(rest) {
continue
}

if visit(rest) {
return true
}
}

return false
}

// hasNonAllowedExpressionInRunContent performs a fast line-by-line text scan to
// determine whether any GitHub Actions expression (${{ ... }}) that is NOT in the
// compiler-owned allow-list appears inside a YAML run: block.
//
// Unlike hasAnyExpressionInRunContent, this function skips expressions that match
// allowedRunScriptExpressionRegex (e.g. ${{ runner.temp }}, ${{ env.FOO }}). It is
// used as a fast pre-check for the regression guardrail inside validateTemplateInjection
// (Path B) to avoid a yaml.Unmarshal when every run-block expression is compiler-owned.
func hasNonAllowedExpressionInRunContent(yamlContent string) bool {
// Fast-path: no inline expressions anywhere → nothing to check.
if !InlineExpressionPattern.MatchString(yamlContent) {
return false
}

return walkRunBlockLines(yamlContent, func(line string) bool {
if !InlineExpressionPattern.MatchString(line) {
return false
}
for _, expr := range InlineExpressionPattern.FindAllString(line, -1) {
if !allowedRunScriptExpressionRegex.MatchString(expr) {
return true
}
}
return false
})
}

func hasExpressionInRunContent(yamlContent string, expressionRegex *regexp.Regexp) bool {
// Fast-path: no matching expressions anywhere → definitely no violation.
if !expressionRegex.MatchString(yamlContent) {
return false
}

return false
// Matching expressions exist somewhere; scan for any that appear inside a run: block
// without doing a full YAML parse.
return walkRunBlockLines(yamlContent, expressionRegex.MatchString)
}

// validateNoTemplateInjectionFromParsed checks a pre-parsed workflow map for template
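
The per-line classification that `hasNonAllowedExpressionInRunContent` performs in its callback can be tried in isolation. This is a sketch under assumptions: `allowed` is copied from `allowedRunScriptExpressionRegex` in this diff, but `inlineExpr` is only a guessed stand-in for `InlineExpressionPattern` (its definition is not shown here), and `firstNonAllowed` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// inlineExpr is an ASSUMED approximation of InlineExpressionPattern.
	inlineExpr = regexp.MustCompile(`\$\{\{[^}]*\}\}`)
	// allowed is copied from allowedRunScriptExpressionRegex in this diff.
	allowed = regexp.MustCompile(`^\$\{\{\s*(env\.[^}]+|vars\.[^}]+|runner\.[^}]+|github\.(repository|run_id|workspace)|steps\.parse-guard-vars\.outputs\.(approval_labels|blocked_users|trusted_users)|job\.services\[[^]]+\]\.ports\[[^]]+\])\s*\}\}$`)
)

// firstNonAllowed returns the first expression on the line that is outside
// the allow-list, or false when every expression is compiler-owned.
func firstNonAllowed(line string) (string, bool) {
	for _, expr := range inlineExpr.FindAllString(line, -1) {
		if !allowed.MatchString(expr) {
			return expr, true
		}
	}
	return "", false
}

func main() {
	lines := []string{
		`node ${{ runner.temp }}/actions/foo.cjs`,
		`echo "${{ job.services['redis'].ports['6379'] }}"`,
		`echo "${{ github.actor }}"`,
		`echo ${{ env.FOO }} ${{ github.repository_owner }}`,
	}
	for _, l := range lines {
		expr, found := firstNonAllowed(l)
		fmt.Printf("nonAllowed=%-5v expr=%q line=%q\n", found, expr, l)
	}
}
```

Because the allow-list regex is anchored at both ends, `github.repository_owner` is not accepted merely for starting with the allowed `github.repository`.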