Skip to content

Specify Copilot SDK bash parser semantics and add language-agnostic conformance vectors#37390

Merged
pelikhan merged 6 commits into
mainfrom
copilot/create-bash-command-parser-specification
Jun 6, 2026
Merged

Specify Copilot SDK bash parser semantics and add language-agnostic conformance vectors#37390
pelikhan merged 6 commits into
mainfrom
copilot/create-bash-command-parser-specification

Conversation

Copilot AI commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

The bash command parser used by copilot_sdk_driver.cjs had behavior encoded in JS/tests but no portable spec. This PR codifies the parser as a formal spec plus reusable test-vector artifacts so equivalent suites can be generated in any language.

  • W3C-style normative specification

    • Added docs/src/content/docs/specs/bash-command-parser-specification.md.
    • Defines conformance classes, RFC2119 requirements, and integration semantics used by driver fallback permission logic.
    • Captures parser boundaries explicitly (lightweight recognizer, not a full POSIX shell parser).
    • Removes implementation-language references from normative sections to keep the spec language-agnostic.
    • Adds a language-neutral type contract for parser APIs (total functions, return-shape guarantees, non-throwing behavior).
  • Formal grammar and executable semantics

    • Added EBNF for:
      • operator-aware splitting (&&, ||, |, ;)
      • quote/subshell shielding
      • command-name extraction (env assignments, redirection, keywords, recursion for !/braces)
    • Documents malformed-input tolerance and default-deny implications at integration layer.
  • Model-based + verification-based test generation method

    • Specifies a state-model approach (TopLevel, quote states, subshell depth) to generate vectors systematically.
    • Specifies metamorphic/property families (whitespace, env-prefix, redirection-suffix, duplicate-collapse, quoted-operator shielding) for verification-driven expansion.
    • Adds explicit testing strategies for deterministic vectors, metamorphic verification, state-space coverage tracking, typed oracle validation, robustness testing, and differential replay across implementations.
  • Minimal conformance suite definition

    • Defines a required minimal suite with core splitter, extractor, pipeline, and robustness categories.
    • Requires 15 baseline conformance tests for cross-language implementations.
  • Portable conformance artifacts

    • Added actions/setup/js/bash_command_parser_spec_vectors.json as machine-readable vectors with stable IDs and provenance (model-based, verification).
    • Added actions/setup/js/bash_command_parser_spec_vectors.test.cjs to execute vectors and metamorphic relations against current parser behavior.
{
  "id": "BP-MR-004",
  "function": "extractCommandNamesFromPipeline",
  "relation": "duplicate-collapse",
  "left": "echo a && cat b && echo c",
  "right": "echo a && cat b",
  "expected": ["echo", "cat"]
}

Copilot AI and others added 4 commits June 6, 2026 19:59
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title Add W3C bash parser specification and conformance vectors Specify Copilot SDK bash parser semantics and add language-agnostic conformance vectors Jun 6, 2026
Copilot AI requested a review from pelikhan June 6, 2026 20:02
@pelikhan

pelikhan commented Jun 6, 2026

Copy link
Copy Markdown
Collaborator

@copilot

  • remove javascript references from the specification
  • review and improve specification as a Parser generator expert / Type System specialist. Define a minimal test suite. Define testing strategies.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 6, 2026

Copy link
Copy Markdown
Contributor Author

@copilot

  • remove javascript references from the specification
  • review and improve specification as a Parser generator expert / Type System specialist. Define a minimal test suite. Define testing ...

Implemented in 021f9c0.

Updated the specification to remove JavaScript-specific references, added a language-neutral type contract, defined a required minimal conformance suite (15 core tests), and added explicit testing strategies for deterministic vectors, metamorphic checks, coverage tracking, typed oracles, robustness testing, and differential replay across implementations.

@pelikhan pelikhan marked this pull request as ready for review June 6, 2026 20:26
Copilot AI review requested due to automatic review settings June 6, 2026 20:26
@pelikhan pelikhan merged commit 69916a5 into main Jun 6, 2026
4 of 8 checks passed
@pelikhan pelikhan deleted the copilot/create-bash-command-parser-specification branch June 6, 2026 20:26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR formalizes the existing Copilot SDK bash command parser behavior as a portable, language-agnostic specification and adds machine-readable conformance vectors (plus a Vitest runner) so equivalent test suites can be generated and verified across implementations.

Changes:

  • Added a W3C-style normative spec describing splitter/extractor/pipeline semantics and driver-integration behavior.
  • Added a versioned JSON corpus of deterministic vectors plus metamorphic relation cases.
  • Added a Vitest test runner that executes the vector corpus against the current JS reference parser.
Show a summary per file
File Description
docs/src/content/docs/specs/bash-command-parser-specification.md Introduces the normative spec, grammar, and conformance guidance for the parser.
actions/setup/js/bash_command_parser_spec_vectors.json Adds portable, versioned conformance vectors and metamorphic relations.
actions/setup/js/bash_command_parser_spec_vectors.test.cjs Executes the vectors/relations against the reference JS implementation via Vitest.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 6

Comment on lines +127 to +130
env_assign = ident , "=" , nonspace* ;
ident = ("_" | letter) , { "_" | letter | digit } ;

core = negation | brace | keyword | redirection | word | empty ;
Comment on lines +135 to +137
redirection = ("<" | ">") , nonspace*
| digits , ("<" | ">" | "&") , nonspace* ;
word = nonspace , nonspace* ;

- `StringLike` includes any runtime value accepted by the host language.
- `Segment` and `CommandName` are textual values.
- `Option<T>` is either `None` (no command identified) or `Some(T)`.
3. Operators inside single quotes, double quotes, or `$(` `)` regions MUST NOT split.
4. Output segments MUST be trimmed.
5. Empty segments MUST be removed.
6. The function SHOULD be non-throwing for malformed input.
Comment on lines +165 to +169
1. Non-string or blank segment MUST return `null`.
2. Leading environment assignments (`IDENTIFIER=\S*`) MUST be stripped repeatedly.
3. If the first token is redirection (`^[<>]` or `^\d+[<>&]`), return `null`.
4. If the first token is `!`, `{`, or `}`, extraction MUST recurse on the remainder.
5. If the first token is a shell keyword (`then`, `else`, `elif`, `fi`, `do`, `done`, `esac`, `in`, `function`, `time`, `coproc`), return `null`.
1. Non-string or blank input MUST return `[]`.
2. Input MUST be split using §4.1.
3. Each segment MUST be extracted using §4.2.
4. Null extraction results MUST be ignored.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants