Specify Copilot SDK bash parser semantics and add language-agnostic conformance vectors#37390
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Implemented in 021f9c0. Updated the specification to remove JavaScript-specific references, added a language-neutral type contract, defined a required minimal conformance suite (15 core tests), and added explicit testing strategies for deterministic vectors, metamorphic checks, coverage tracking, typed oracles, robustness testing, and differential replay across implementations. |
There was a problem hiding this comment.
Pull request overview
This PR formalizes the existing Copilot SDK bash command parser behavior as a portable, language-agnostic specification and adds machine-readable conformance vectors (plus a Vitest runner) so equivalent test suites can be generated and verified across implementations.
Changes:
- Added a W3C-style normative spec describing splitter/extractor/pipeline semantics and driver-integration behavior.
- Added a versioned JSON corpus of deterministic vectors plus metamorphic relation cases.
- Added a Vitest test runner that executes the vector corpus against the current JS reference parser.
Show a summary per file
| File | Description |
|---|---|
| docs/src/content/docs/specs/bash-command-parser-specification.md | Introduces the normative spec, grammar, and conformance guidance for the parser. |
| actions/setup/js/bash_command_parser_spec_vectors.json | Adds portable, versioned conformance vectors and metamorphic relations. |
| actions/setup/js/bash_command_parser_spec_vectors.test.cjs | Executes the vectors/relations against the reference JS implementation via Vitest. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 3/3 changed files
- Comments generated: 6
| env_assign = ident , "=" , nonspace* ; | ||
| ident = ("_" | letter) , { "_" | letter | digit } ; | ||
|
|
||
| core = negation | brace | keyword | redirection | word | empty ; |
| redirection = ("<" | ">") , nonspace* | ||
| | digits , ("<" | ">" | "&") , nonspace* ; | ||
| word = nonspace , nonspace* ; |
|
|
||
| - `StringLike` includes any runtime value accepted by the host language. | ||
| - `Segment` and `CommandName` are textual values. | ||
| - `Option<T>` is either `None` (no command identified) or `Some(T)`. |
| 3. Operators inside single quotes, double quotes, or `$(` `)` regions MUST NOT split. | ||
| 4. Output segments MUST be trimmed. | ||
| 5. Empty segments MUST be removed. | ||
| 6. The function SHOULD be non-throwing for malformed input. |
| 1. Non-string or blank segment MUST return `null`. | ||
| 2. Leading environment assignments (`IDENTIFIER=\S*`) MUST be stripped repeatedly. | ||
| 3. If the first token is redirection (`^[<>]` or `^\d+[<>&]`), return `null`. | ||
| 4. If the first token is `!`, `{`, or `}`, extraction MUST recurse on the remainder. | ||
| 5. If the first token is a shell keyword (`then`, `else`, `elif`, `fi`, `do`, `done`, `esac`, `in`, `function`, `time`, `coproc`), return `null`. |
| 1. Non-string or blank input MUST return `[]`. | ||
| 2. Input MUST be split using §4.1. | ||
| 3. Each segment MUST be extracted using §4.2. | ||
| 4. Null extraction results MUST be ignored. |
The bash command parser used by
copilot_sdk_driver.cjshad behavior encoded in JS/tests but no portable spec. This PR codifies the parser as a formal spec plus reusable test-vector artifacts so equivalent suites can be generated in any language.W3C-style normative specification
docs/src/content/docs/specs/bash-command-parser-specification.md.Formal grammar and executable semantics
&&,||,|,;)!/braces)Model-based + verification-based test generation method
TopLevel, quote states, subshell depth) to generate vectors systematically.Minimal conformance suite definition
Portable conformance artifacts
actions/setup/js/bash_command_parser_spec_vectors.jsonas machine-readable vectors with stable IDs and provenance (model-based,verification).actions/setup/js/bash_command_parser_spec_vectors.test.cjsto execute vectors and metamorphic relations against current parser behavior.{ "id": "BP-MR-004", "function": "extractCommandNamesFromPipeline", "relation": "duplicate-collapse", "left": "echo a && cat b && echo c", "right": "echo a && cat b", "expected": ["echo", "cat"] }