Skip to content

[cli-consistency] Normalize CLI help text and setup docs across run/upgrade/secrets/mcp-server#26515

Merged
pelikhan merged 7 commits into
mainfrom
copilot/cli-consistency-issues-fix
Apr 15, 2026
Merged

[cli-consistency] Normalize CLI help text and setup docs across run/upgrade/secrets/mcp-server#26515
pelikhan merged 7 commits into
mainfrom
copilot/cli-consistency-issues-fix

Conversation

Copilot AI commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

CLI consistency inspection reported 7 user-facing mismatches in command help and setup docs (grammar, incomplete engine list, inconsistent --approve text, missing MCP tool in docs, ambiguous phrasing, and capitalization drift). This PR aligns those surfaces so command UX and documentation read consistently.

  • CLI help text normalization

    • upgrade: corrected long-description grammar (“for” → “to”).
    • run: clarified interactive-mode wording (“this command enters interactive mode and shows…”).
    • run --approve and upgrade --approve: aligned to the same full safe-update/strict-mode semantics used by compile --approve.
    • secrets bootstrap --engine: updated the advertised engine list to include gemini.
    • mcp add --transport: normalized casing to Docker in help text.
  • Documentation parity (docs/src/content/docs/setup/cli.md)

    • secrets bootstrap options now include gemini engine.
    • mcp-server available tools list now includes checks.
  • Guardrails to prevent regression

    • Added focused tests to assert help/flag text invariants:
      • cmd/gh-aw/main_help_text_test.go
      • pkg/cli/upgrade_command_test.go
      • updates in pkg/cli/secrets_command_test.go
      • updates in pkg/cli/mcp_add_test.go

Example of the normalized --approve text used across commands:

"Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement"

Copilot AI and others added 3 commits April 15, 2026 23:33
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/a10343a0-9495-4c9d-a3e7-b955ffde414c

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
This reverts commit f9f1777.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/a10343a0-9495-4c9d-a3e7-b955ffde414c

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix CLI consistency issues in command help text [cli-consistency] Normalize CLI help text and setup docs across run/upgrade/secrets/mcp-server Apr 15, 2026
Copilot AI requested a review from pelikhan April 15, 2026 23:37
Comment thread docs/src/content/docs/setup/cli.md Outdated
```

**Options:** `--engine` (copilot, claude, codex), `--non-interactive`, `--repo`
**Options:** `--engine` (copilot, claude, codex, custom), `--non-interactive`, `--repo`

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot custom -> Gemini

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 4d63ba4: changed the docs engine list from custom to gemini in /docs/src/content/docs/setup/cli.md.

Comment thread pkg/cli/tokens_bootstrap.go Outdated

cmd.Flags().BoolVar(&nonInteractiveFlag, "non-interactive", false, "Check secrets without prompting (display-only mode)")
cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex)")
cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex, custom)")

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot custom -> gemini

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 4d63ba4: changed secrets bootstrap --engine help text from custom to gemini in /pkg/cli/tokens_bootstrap.go, and adjusted the related test.

@pelikhan pelikhan marked this pull request as ready for review April 15, 2026 23:46
Copilot AI review requested due to automatic review settings April 15, 2026 23:46
@github-actions github-actions Bot mentioned this pull request Apr 15, 2026
Copilot AI and others added 2 commits April 15, 2026 23:50
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/91a00b5c-7527-4a62-ad3d-9019c58347c5

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
This reverts commit b028adc.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 60/100

⚠️ Acceptable — with suggestions

Metric Value
New/modified tests analyzed 4
✅ Design tests (behavioral contracts) 4 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 0 (0%)
Duplicate test clusters 0
Test inflation detected Yes (contextually expected — see note)
🚨 Coding-guideline violations None

Test Classification Details

View all 4 tests
Test File Classification Issues Detected
TestRunCommandHelpTextConsistency cmd/gh-aw/main_help_text_test.go:12 ✅ Design No error/edge cases; all assertions have messages ✅
TestMCPAddTransportFlagDescriptionUsesDockerCapitalization pkg/cli/mcp_add_test.go:158 ✅ Design No error/edge cases; uses stdlib t.Fatal/t.Fatalf
TestSecretsBootstrapEngineFlagIncludesCustom pkg/cli/secrets_command_test.go:67 ✅ Design No error/edge cases; all assertions have messages ✅
TestUpgradeCommandHelpTextConsistency pkg/cli/upgrade_command_test.go:12 ✅ Design No error/edge cases; all assertions have messages ✅

Flagged Tests — Suggestions for Improvement

All four tests are design tests enforcing user-visible behavioral contracts (CLI flag names, help text content, flag description consistency). No tests are implementation tests. No coding-guideline violations.

One area for improvement: zero tests include error paths or edge cases.

⚠️ Missing Edge Cases (All 4 Tests)

These tests only verify that flags exist with specific strings in their Usage field — a happy-path contract. Consider adding coverage for:

  • Negative/typo guard: assert the flag description does not contain outdated or incorrect text (e.g., the wrong capitalization that this PR fixed)
  • Cross-command consistency: TestRunCommandHelpTextConsistency already does this for --approve — consider extending the pattern to additional flags that must stay in sync

These tests are intentionally narrow (CLI metadata verification), so zero error paths is understandable. This is a suggestion, not a blocker.


Test Inflation Note

All test files show a >2:1 ratio of test lines added vs. production lines changed (e.g., upgrade_command_test.go adds 21 lines against 2 production lines modified). Per the scoring formula this applies the 10-point penalty — but the context here is normal: this PR specifically adds regression coverage for previously untested CLI help-text contracts, so the production changes are intentionally minimal (string fixes only). The inflation metric is designed to catch filler tests; these tests are genuinely high-value.


Score Breakdown

Component Score
Behavioral Coverage (4/4 design tests × 40) 40/40
Error/Edge Case Coverage (0/4 × 30) 0/30
Low Duplication (0 clusters) 20/20
Proportional Growth (inflation detected) 0/10
Total 60/100

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 4 tests — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). All 4 tests verify user-observable CLI behavioral contracts. The 60-point score reflects only the lack of error/edge-case coverage, which is expected for CLI help-text regression tests.


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24484213320

🧪 Test quality analysis by Test Quality Sentinel · ● 627.5K ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 60/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 4 tests enforce user-visible CLI behavioral contracts (help text consistency, flag descriptions). Edge-case coverage is low (0/4 tests) but expected for CLI metadata regression tests.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR standardizes CLI help text and setup documentation to improve consistency across run, upgrade, secrets bootstrap, and MCP-related commands, and adds regression tests to prevent future drift.

Changes:

  • Normalized user-facing help text (grammar fixes, aligned --approve semantics, added custom to the advertised --engine list, and adjusted MCP transport casing).
  • Updated setup docs to match CLI surfaces (secrets bootstrap engine list and MCP server available tools).
  • Added/updated tests asserting help/flag text invariants for run, upgrade, secrets bootstrap, and mcp add.
Show a summary per file
File Description
pkg/cli/upgrade_command.go Updates upgrade help grammar and aligns --approve usage text.
pkg/cli/upgrade_command_test.go Adds a focused test to lock in upgrade help/flag text invariants.
cmd/gh-aw/main.go Clarifies run interactive-mode wording and aligns run --approve usage with compile.
cmd/gh-aw/main_help_text_test.go Adds a test to ensure run long help and --approve usage matches compile.
pkg/cli/tokens_bootstrap.go Updates secrets bootstrap --engine help text to include custom.
pkg/cli/secrets_command_test.go Adds a test to ensure --engine help includes custom on secrets bootstrap.
pkg/cli/mcp_add.go Changes --transport help text casing to Docker.
pkg/cli/mcp_add_test.go Adds a test asserting the Docker capitalization in --transport usage.
docs/src/content/docs/setup/cli.md Updates setup docs for custom engine and adds checks to MCP server tools list.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 135/135 changed files
  • Comments generated: 6

Comment thread docs/src/content/docs/setup/cli.md Outdated
Comment on lines 223 to 224
**Options:** `--engine` (copilot, claude, codex, custom), `--non-interactive`, `--repo`

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup docs list --engine values as (copilot, claude, codex, custom), but the CLI supports additional engines (e.g., "gemini"). This makes the docs incomplete for supported usage; consider adding "gemini" here (or documenting the authoritative source of supported engine IDs).

Copilot uses AI. Check for mistakes.
Comment thread cmd/gh-aw/main.go
Comment on lines 736 to 739
runCmd.Flags().Bool("dry-run", false, "Validate workflow without actually triggering execution on GitHub Actions")
runCmd.Flags().BoolP("json", "j", false, "Output results in JSON format")
runCmd.Flags().Bool("approve", false, "Approve all safe update changes during compilation (skip safe update enforcement)")
runCmd.Flags().Bool("approve", false, "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement")
// Register completions for run command

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --approve usage string is duplicated verbatim across multiple commands (compile/run here, and upgrade in pkg/cli). To reduce future drift (and make the new consistency tests less brittle), consider extracting this text into a shared constant/helper and reusing it for all --approve flags.

Copilot uses AI. Check for mistakes.
Comment on lines 113 to 115
cmd.Flags().Bool("audit", false, "Check dependency health without performing upgrades")
cmd.Flags().Bool("approve", false, "Approve all safe update changes during compilation (skip safe update enforcement)")
cmd.Flags().Bool("approve", false, "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement")
cmd.Flags().Bool("skip-extension-upgrade", false, "Skip automatic extension upgrade (used internally to prevent recursion after upgrade)")

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --approve flag usage text is a long string duplicated from cmd/gh-aw/main.go (compile/run). Consider extracting this into a shared constant/helper (e.g., in pkg/cli or pkg/constants) and reusing it across commands to prevent future help-text drift.

Copilot uses AI. Check for mistakes.
Comment thread pkg/cli/mcp_add.go
Comment on lines 363 to 365
cmd.Flags().StringVar(&registryURL, "registry", "", "MCP registry URL (default: https://api.mcp.github.com/v0.1)")
cmd.Flags().StringVar(&transportType, "transport", "", "Preferred transport type (stdio, http, docker)")
cmd.Flags().StringVar(&transportType, "transport", "", "Preferred transport type (stdio, http, Docker)")
cmd.Flags().StringVar(&customToolID, "tool-id", "", "Custom tool ID to use in the workflow (default: uses server ID)")

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --transport help text lists "Docker" with a capital D, but the command only accepts lowercase values (createMCPToolConfig validates "stdio", "http", "docker" exactly and returns an error otherwise). This is user-facing misleading and will cause failures for users who try "--transport Docker". Either change the help text to list lowercase "docker", or normalize the flag value (e.g., strings.ToLower) so "Docker" is accepted, and keep the help/error message consistent with the accepted inputs.

Copilot uses AI. Check for mistakes.
Comment thread pkg/cli/mcp_add_test.go
Comment on lines +160 to +169
func TestMCPAddTransportFlagDescriptionUsesDockerCapitalization(t *testing.T) {
cmd := NewMCPAddSubcommand()
transportFlag := cmd.Flags().Lookup("transport")
if transportFlag == nil {
t.Fatal("expected --transport flag to exist")
}

if !strings.Contains(transportFlag.Usage, "Docker") {
t.Fatalf("expected --transport usage to include Docker, got: %s", transportFlag.Usage)
}

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test locks in "Docker" capitalization in the --transport usage string, but the actual accepted flag value is lowercase "docker" (and passing "Docker" currently fails validation). Consider updating the assertion to match the canonical/accepted input values (e.g., assert the usage mentions "docker"), or add coverage that the implementation accepts case-insensitive inputs if you want to display "Docker" in help output.

Copilot uses AI. Check for mistakes.
Comment on lines 42 to 44
cmd.Flags().BoolVar(&nonInteractiveFlag, "non-interactive", false, "Check secrets without prompting (display-only mode)")
cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex)")
cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex, custom)")
addRepoFlag(cmd)

Copilot AI Apr 15, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --engine flag help lists (copilot, claude, codex, custom) but the codebase supports additional engines (e.g., constants.EngineOptions includes "gemini"). Since secrets bootstrap uses constants.GetEngineOption(engineFilter), omitting "gemini" makes the help text incomplete/misleading for a supported engine. Consider including "gemini" in this list (and keeping it in sync with supported engines).

Copilot uses AI. Check for mistakes.
Copilot AI requested a review from pelikhan April 15, 2026 23:54
@pelikhan pelikhan merged commit 48e626d into main Apr 15, 2026
@pelikhan pelikhan deleted the copilot/cli-consistency-issues-fix branch April 15, 2026 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[cli-consistency] CLI Consistency Issues - 2026-04-15

3 participants