Skip to content

fix(ir): reject cross-pipeline duplicate StepIds; strict de-indent in lower_raw_yaml; document EnvValue::Literal contract#992

Merged
jamesadevine merged 1 commit into
mainfrom
fix-ir-follow-ups
Jun 13, 2026
Merged

fix(ir): reject cross-pipeline duplicate StepIds; strict de-indent in lower_raw_yaml; document EnvValue::Literal contract#992
jamesadevine merged 1 commit into
mainfrom
fix-ir-follow-ups

Conversation

@jamesadevine

Copy link
Copy Markdown
Collaborator

Three follow-ups from the post-merge review of the native IR PR (#960).

1. Bug — cross-pipeline duplicate StepIds silently overwrite step_locations

src/compile/ir/graph.rs::index_job_steps only rejected duplicate StepIds within the same job. Two jobs each carrying a step with the same StepId (e.g. a future extension author accidentally reuses "synthPr" or "marker") would silently overwrite step_locations, and every OutputRef pointing to that StepId would resolve to whichever producer was indexed last.

The IR''s OutputRef carries only StepId (no job qualifier), so the implicit contract is that StepIds must be pipeline-wide unique — but that constraint was neither enforced nor documented.

  • index_job_steps now rejects any duplicate StepId regardless of job / stage, with a typed error citing both previous and current locations.
  • Uniqueness contract is now documented on the StepId module-level doc (src/compile/ir/ids.rs) and on the OutputRef type-level doc (src/compile/ir/output.rs).
  • Two new tests cover the cross-job and cross-stage cases.

2. Suggestion — strict de-indent in lower_raw_yaml

The 2-space de-indent for - prefixed bodies used .unwrap_or(line) as fallback. For inputs from step_to_raw_yaml_string the fallback was dead code, but any future caller with a different indentation assumption would silently produce misaligned YAML — surfacing as a serde_yaml parse failure several stack frames later.

Replaced with strict ok_or_else returning a typed error citing the offending line. Empty / whitespace-only continuation lines stay verbatim (legal between mapping keys, between block scalars, etc.).

Two new tests cover the strict rejection and the blank-continuation-line allowance.

3. Suggestion — document EnvValue::Literal compiler-internal contract

lower_env_value emits Literal(s) verbatim with no sanitisation. All current construction sites use hardcoded strings, but the type accepts impl Into<String> — a future EnvValue::literal(user_supplied_string) would silently embed raw user input in the env: block.

The variant doc-comment now reads "Compiler-internal use only." and lists the safe input categories (hardcoded strings, constants, values pre-routed through reject_pipeline_injection) with pointers to PipelineVar / Secret / AdoMacro as the right choices for runtime-resolved values. The EnvValue::literal constructor doc references the contract.

Validation

  • cargo build clean
  • cargo test1833 passed, +4 new tests
  • cargo clippy --all-targets --all-features clean
  • cargo test --test bash_lint_tests clean
  • cargo run -- compile --force across all 33 fixtures produces zero lock-file drift — the new validation rejects only contract-violating input that the production compile path never emits today.

… lower_raw_yaml; document EnvValue::Literal contract

Three follow-ups from the post-merge review of the native IR PR.

1. Bug: src/compile/ir/graph.rs index_job_steps only rejected
   duplicate StepIds within the SAME job. Two jobs each carrying a
   step with the same StepId (e.g. a future extension author
   accidentally reuses "synthPr" or "marker") would silently
   overwrite step_locations, and every OutputRef pointing to that
   StepId would resolve to whichever producer was indexed last. The
   IR's OutputRef carries only StepId (no job qualifier), so the
   implicit contract is that StepIds must be pipeline-wide unique
   for correct producer resolution - but that constraint was
   neither enforced nor documented.

   - index_job_steps now rejects any duplicate StepId regardless of
     job / stage, with a typed error citing both previous and
     current locations.
   - Uniqueness contract is now documented on the StepId module-level
     doc (src/compile/ir/ids.rs) and on the OutputRef type-level
     doc (src/compile/ir/output.rs).
   - Two new tests: cross_job_duplicate_step_id_is_rejected and
     cross_stage_duplicate_step_id_is_rejected.

2. Suggestion: src/compile/ir/lower.rs lower_raw_yaml's 2-space
   de-indent for `- ` prefixed bodies used .unwrap_or(line) as
   fallback. For inputs from step_to_raw_yaml_string the fallback
   was dead code, but for any future caller passing a string with
   different indentation the fallback would silently produce
   misaligned YAML (deferred parse failure several stack frames
   later in serde_yaml).

   Replaced with strict ok_or_else returning a typed error citing
   the offending line. Empty / whitespace-only continuation lines
   are kept verbatim (legal between mapping keys, between block
   scalars, etc.). Two new tests:
   raw_yaml_dash_prefix_rejects_unindented_continuation and
   raw_yaml_dash_prefix_accepts_blank_continuation_lines.

3. Suggestion: src/compile/ir/env.rs EnvValue::Literal had no
   doc-comment marking the compiler-internal contract. lower_env_value
   emits Literal(s) verbatim with no sanitisation; all current
   construction sites use hardcoded strings, but the type accepts
   any impl Into<String>. A future EnvValue::literal(user_string)
   would silently embed raw user input in the env: block.

   Doc-comment on the variant now says "**Compiler-internal use
   only.**" and lists the safe input categories (hardcoded strings,
   constants, values pre-routed through reject_pipeline_injection)
   with pointers to PipelineVar / Secret / AdoMacro as the right
   choices for runtime-resolved values. Constructor doc updated to
   reference the contract.

Validation: cargo build, cargo test (1833 passed, +4 new), cargo
clippy --all-targets --all-features, cargo test --test
bash_lint_tests, cargo run -- compile --force across all 33 fixtures
produces zero lock-file drift - the new validation rejects only
contract-violating input that the production compile path never
emits today.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jamesadevine jamesadevine merged commit 28403fe into main Jun 13, 2026
11 checks passed
@jamesadevine jamesadevine deleted the fix-ir-follow-ups branch June 13, 2026 12:21
@github-actions

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: looks good — three well-targeted fixes with clear rationale, good error messages, and adequate test coverage.

Findings

✅ What Looks Good

graph.rs — cross-pipeline duplicate StepId rejection is the most impactful change. The old guard (prev.stage == stage && prev.job == job.id) only caught same-job duplicates; two jobs independently holding StepId("synthPr") would silently shadow each other in step_locations. The new code correctly lifts the check to pipeline-wide scope. The error message (previously declared in <stage>.<job>) cites both collision sites, which makes it immediately actionable.

lower.rs — strict de-indent correctly converts the silent-misalignment path into a hard compiler error. The blank-line fast-path (line.trim().is_empty()) handles block-scalar terminators and spacing between mapping keys cleanly. The test for blank continuation lines correctly validates displayName is present in the parsed mapping — a stronger assertion than just checking no error is returned.

Documentation trilogy (ids.rs / output.rs / env.rs) fills the right gaps: the StepId uniqueness contract was an implicit assumption; now it's a first-class invariant with cross-references pointing from the type to the enforcing site.

⚠️ Suggestions

  • src/compile/ir/env.rsEnvValue::Literal is documented as "compiler-internal use only" but remains pub. Since this is a binary crate the blast radius is limited, but a pub(crate) visibility would make the contract mechanically enforced rather than advisory-only. Worth a follow-up if the IR is ever extracted into a library crate.

  • lower.rs — capacity hint (very minor): String::with_capacity(rest.len()) is fine but slightly over-estimated — every stripped continuation line is 2 bytes shorter in the output than in rest. Not a bug, just a note if memory footprint for large YAML blocks ever matters.

  • cross_stage_duplicate_step_id_is_rejected test only asserts one message fragment vs. three in the cross-job counterpart. Both tests are correct, but aligning the assertion depth would make the stage test equally diagnostic on a regression.

Generated by Rust PR Reviewer for issue #992 · sonnet46 939.7K ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant