[cli-consistency] Normalize CLI help text and setup docs across run/upgrade/secrets/mcp-server by Copilot · Pull Request #26515 · github/gh-aw

Copilot · 2026-04-15T23:26:47Z

The CLI consistency inspection reported 7 user-facing mismatches in command help and setup docs (grammar, an incomplete engine list, inconsistent --approve text, a missing MCP tool in the docs, ambiguous phrasing, and capitalization drift). This PR aligns those surfaces so that command help and documentation read consistently.

CLI help text normalization
- upgrade: corrected long-description grammar (“for” → “to”).
- run: clarified interactive-mode wording (“this command enters interactive mode and shows…”).
- run --approve and upgrade --approve: aligned to the same full safe-update/strict-mode semantics used by compile --approve.
- secrets bootstrap --engine: updated the advertised engine list to include gemini.
- mcp add --transport: normalized casing to Docker in help text.

Documentation parity (docs/src/content/docs/setup/cli.md)
- secrets bootstrap options now include gemini engine.
- mcp-server available tools list now includes checks.

Guardrails to prevent regression
- Added focused tests to assert help/flag text invariants:
  - cmd/gh-aw/main_help_text_test.go
  - pkg/cli/upgrade_command_test.go
  - updates in pkg/cli/secrets_command_test.go
  - updates in pkg/cli/mcp_add_test.go

Example of the normalized --approve text used across commands:

"Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement"

pelikhan · 2026-04-15T23:46:02Z

-**Options:** `--engine` (copilot, claude, codex), `--non-interactive`, `--repo`
+**Options:** `--engine` (copilot, claude, codex, custom), `--non-interactive`, `--repo`


@copilot custom -> Gemini

Updated in 4d63ba4: changed the docs engine list from custom to gemini in /docs/src/content/docs/setup/cli.md.

pelikhan · 2026-04-15T23:46:26Z


 	cmd.Flags().BoolVar(&nonInteractiveFlag, "non-interactive", false, "Check secrets without prompting (display-only mode)")
-	cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex)")
+	cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex, custom)")


@copilot custom -> gemini

Updated in 4d63ba4: changed secrets bootstrap --engine help text from custom to gemini in /pkg/cli/tokens_bootstrap.go, and adjusted the related test.

github-actions · 2026-04-15T23:51:12Z

🧪 Test Quality Sentinel Report

Test Quality Score: 60/100

⚠️ Acceptable — with suggestions

Metric	Value
New/modified tests analyzed	4
✅ Design tests (behavioral contracts)	4 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	0 (0%)
Duplicate test clusters	0
Test inflation detected	Yes (contextually expected — see note)
🚨 Coding-guideline violations	None

Test Classification Details

Test	File	Classification	Issues Detected
`TestRunCommandHelpTextConsistency`	`cmd/gh-aw/main_help_text_test.go:12`	✅ Design	No error/edge cases; all assertions have messages ✅
`TestMCPAddTransportFlagDescriptionUsesDockerCapitalization`	`pkg/cli/mcp_add_test.go:158`	✅ Design	No error/edge cases; uses stdlib `t.Fatal`/`t.Fatalf` ✅
`TestSecretsBootstrapEngineFlagIncludesCustom`	`pkg/cli/secrets_command_test.go:67`	✅ Design	No error/edge cases; all assertions have messages ✅
`TestUpgradeCommandHelpTextConsistency`	`pkg/cli/upgrade_command_test.go:12`	✅ Design	No error/edge cases; all assertions have messages ✅

Flagged Tests — Suggestions for Improvement

All four tests are design tests enforcing user-visible behavioral contracts (CLI flag names, help text content, flag description consistency). No tests are implementation tests. No coding-guideline violations.

One area for improvement: zero tests include error paths or edge cases.

⚠️ Missing Edge Cases (All 4 Tests)

These tests only verify that flags exist with specific strings in their Usage field, which is a happy-path contract. Consider adding coverage for:

- Negative/typo guard: assert that the flag description does not contain outdated or incorrect text (e.g., the wrong capitalization that this PR fixed).
- Cross-command consistency: TestRunCommandHelpTextConsistency already does this for --approve; consider extending the pattern to additional flags that must stay in sync.
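
A negative guard of the kind suggested above might look roughly like the following sketch. `checkUsage` is a hypothetical helper, not code from this PR; the two sample strings are the old and new `--approve` usage texts quoted in this PR's diff, and a real test would read the string from the flag's Usage field instead.

```go
package main

import (
	"fmt"
	"strings"
)

// checkUsage is a hypothetical helper: it returns an error when the usage
// text is missing a required substring or still contains outdated wording.
func checkUsage(usage string, required, forbidden []string) error {
	for _, want := range required {
		if !strings.Contains(usage, want) {
			return fmt.Errorf("usage %q is missing %q", usage, want)
		}
	}
	for _, stale := range forbidden {
		if strings.Contains(usage, stale) {
			return fmt.Errorf("usage %q still contains outdated text %q", usage, stale)
		}
	}
	return nil
}

func main() {
	oldUsage := "Approve all safe update changes during compilation (skip safe update enforcement)"
	newUsage := "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement"

	required := []string{"When strict mode is active"}
	forbidden := []string{"during compilation"}

	fmt.Println("new text:", checkUsage(newUsage, required, forbidden))
	fmt.Println("old text:", checkUsage(oldUsage, required, forbidden))
}
```

Pairing a positive and a negative assertion means the test fails both when the new wording is dropped and when the old wording is reintroduced.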

These tests are intentionally narrow (CLI metadata verification), so zero error paths is understandable. This is a suggestion, not a blocker.

Test Inflation Note

All test files show a >2:1 ratio of test lines added vs. production lines changed (e.g., upgrade_command_test.go adds 21 lines against 2 production lines modified). Per the scoring formula this applies the 10-point penalty — but the context here is normal: this PR specifically adds regression coverage for previously untested CLI help-text contracts, so the production changes are intentionally minimal (string fixes only). The inflation metric is designed to catch filler tests; these tests are genuinely high-value.

Score Breakdown

Component	Score
Behavioral Coverage (4/4 design tests × 40)	40/40
Error/Edge Case Coverage (0/4 × 30)	0/30
Low Duplication (0 clusters)	20/20
Proportional Growth (inflation detected)	0/10
Total	60/100
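
The arithmetic behind this breakdown can be sketched as below. This is an illustration only: `sentinelScore` is a hypothetical function, and it assumes the duplication and proportional-growth components are all-or-nothing, since the report only shows the zero-cluster and inflation-detected cases. Returning 0 for an empty test set is also an assumption.

```go
package main

import "fmt"

// sentinelScore reproduces the component weights shown in the breakdown:
// 40 for behavioral coverage, 30 for error/edge-case coverage,
// 20 for low duplication, and 10 for proportional growth.
// Assumption: the last two components are awarded in full or not at all.
func sentinelScore(total, design, withEdgeCases, duplicateClusters int, inflation bool) int {
	if total == 0 {
		return 0 // assumption: nothing to score
	}
	score := design * 40 / total        // behavioral coverage
	score += withEdgeCases * 30 / total // error/edge-case coverage
	if duplicateClusters == 0 {
		score += 20 // low duplication
	}
	if !inflation {
		score += 10 // proportional growth
	}
	return score
}

func main() {
	// Values from this report: 4 tests, 4 design tests, 0 with edge cases,
	// 0 duplicate clusters, inflation detected.
	fmt.Println(sentinelScore(4, 4, 0, 0, true)) // prints 60
}
```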

Language Support

Tests analyzed:

- 🐹 Go (*_test.go): 4 tests, all unit (//go:build !integration)
- 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 4 tests verify user-observable CLI behavioral contracts. The 60-point score reflects only the lack of error/edge-case coverage, which is expected for CLI help-text regression tests.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24484213320

🧪 Test quality analysis by Test Quality Sentinel · ● 627.5K

github-actions

✅ Test Quality Sentinel: 60/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 4 tests enforce user-visible CLI behavioral contracts (help text consistency, flag descriptions). Edge-case coverage is low (0/4 tests) but expected for CLI metadata regression tests.

Copilot

Pull request overview

This PR standardizes CLI help text and setup documentation to improve consistency across run, upgrade, secrets bootstrap, and MCP-related commands, and adds regression tests to prevent future drift.

Changes:

- Normalized user-facing help text (grammar fixes, aligned --approve semantics, added custom to the advertised --engine list, and adjusted MCP transport casing).
- Updated setup docs to match CLI surfaces (secrets bootstrap engine list and MCP server available tools).
- Added/updated tests asserting help/flag text invariants for run, upgrade, secrets bootstrap, and mcp add.

File	Description
`pkg/cli/upgrade_command.go`	Updates upgrade help grammar and aligns `--approve` usage text.
`pkg/cli/upgrade_command_test.go`	Adds a focused test to lock in upgrade help/flag text invariants.
`cmd/gh-aw/main.go`	Clarifies `run` interactive-mode wording and aligns `run --approve` usage with `compile`.
`cmd/gh-aw/main_help_text_test.go`	Adds a test to ensure `run` long help and `--approve` usage matches `compile`.
`pkg/cli/tokens_bootstrap.go`	Updates `secrets bootstrap --engine` help text to include `custom`.
`pkg/cli/secrets_command_test.go`	Adds a test to ensure `--engine` help includes `custom` on `secrets bootstrap`.
`pkg/cli/mcp_add.go`	Changes `--transport` help text casing to `Docker`.
`pkg/cli/mcp_add_test.go`	Adds a test asserting the `Docker` capitalization in `--transport` usage.
`docs/src/content/docs/setup/cli.md`	Updates setup docs for `custom` engine and adds `checks` to MCP server tools list.

Copilot's findings

Files reviewed: 135/135 changed files
Comments generated: 6

Copilot · 2026-04-15T23:51:31Z

+**Options:** `--engine` (copilot, claude, codex, custom), `--non-interactive`, `--repo`



The setup docs list --engine values as (copilot, claude, codex, custom), but the CLI supports additional engines (e.g., "gemini"). This makes the docs incomplete for supported usage; consider adding "gemini" here (or documenting the authoritative source of supported engine IDs).

Copilot · 2026-04-15T23:51:32Z

 	runCmd.Flags().Bool("dry-run", false, "Validate workflow without actually triggering execution on GitHub Actions")
 	runCmd.Flags().BoolP("json", "j", false, "Output results in JSON format")
-	runCmd.Flags().Bool("approve", false, "Approve all safe update changes during compilation (skip safe update enforcement)")
+	runCmd.Flags().Bool("approve", false, "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement")
 	// Register completions for run command


The --approve usage string is duplicated verbatim across multiple commands (compile/run here, and upgrade in pkg/cli). To reduce future drift (and make the new consistency tests less brittle), consider extracting this text into a shared constant/helper and reusing it for all --approve flags.
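
One possible shape for that extraction is sketched below. To keep the example runnable without dependencies it uses the standard library `flag` package as a stand-in for the flag sets the commands actually use; `approveFlagUsage`, `addApproveFlag`, and `newFlagSet` are hypothetical names, not identifiers from the repository.

```go
package main

import (
	"flag"
	"fmt"
)

// approveFlagUsage is a hypothetical shared constant holding the normalized
// --approve text, so every command reads it from one place.
const approveFlagUsage = "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement"

// addApproveFlag is a hypothetical helper that registers --approve with the
// shared usage text on any flag set.
func addApproveFlag(fs *flag.FlagSet) {
	fs.Bool("approve", false, approveFlagUsage)
}

// newFlagSet builds a flag set for the named command (stand-in for the
// real command constructors).
func newFlagSet(name string) *flag.FlagSet {
	fs := flag.NewFlagSet(name, flag.ContinueOnError)
	addApproveFlag(fs)
	return fs
}

func main() {
	for _, name := range []string{"compile", "run", "upgrade"} {
		usage := newFlagSet(name).Lookup("approve").Usage
		fmt.Printf("%s uses shared text: %v\n", name, usage == approveFlagUsage)
	}
}
```

With a single source for the string, the consistency tests could compare each command's usage against the constant rather than against a second copy of the literal.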

Copilot · 2026-04-15T23:51:32Z

 	cmd.Flags().Bool("audit", false, "Check dependency health without performing upgrades")
-	cmd.Flags().Bool("approve", false, "Approve all safe update changes during compilation (skip safe update enforcement)")
+	cmd.Flags().Bool("approve", false, "Approve all safe update changes. When strict mode is active (the default), the compiler emits warnings for new restricted secrets or unapproved action additions/removals not present in the existing gh-aw-manifest. Use this flag to approve and skip safe update enforcement")
 	cmd.Flags().Bool("skip-extension-upgrade", false, "Skip automatic extension upgrade (used internally to prevent recursion after upgrade)")


The --approve flag usage text is a long string duplicated from cmd/gh-aw/main.go (compile/run). Consider extracting this into a shared constant/helper (e.g., in pkg/cli or pkg/constants) and reusing it across commands to prevent future help-text drift.

Copilot · 2026-04-15T23:51:32Z

 	cmd.Flags().StringVar(&registryURL, "registry", "", "MCP registry URL (default: https://api.mcp.github.com/v0.1)")
-	cmd.Flags().StringVar(&transportType, "transport", "", "Preferred transport type (stdio, http, docker)")
+	cmd.Flags().StringVar(&transportType, "transport", "", "Preferred transport type (stdio, http, Docker)")
 	cmd.Flags().StringVar(&customToolID, "tool-id", "", "Custom tool ID to use in the workflow (default: uses server ID)")


The --transport help text lists "Docker" with a capital D, but the command only accepts lowercase values (createMCPToolConfig validates "stdio", "http", "docker" exactly and returns an error otherwise). This is misleading to users and will cause failures for anyone who tries "--transport Docker". Either change the help text to list lowercase "docker", or normalize the flag value (e.g., strings.ToLower) so that "Docker" is accepted, and keep the help and error messages consistent with the accepted inputs.
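
The normalization option could be sketched as follows. `normalizeTransport` is a hypothetical helper and does not reflect how createMCPToolConfig is actually written; the accepted values are the three named in the comment above. How the empty default should be treated is not shown in this PR, so the sketch simply rejects it.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeTransport is a hypothetical helper: it trims and lowercases the
// flag value, then validates it against the accepted transport names.
func normalizeTransport(value string) (string, error) {
	normalized := strings.ToLower(strings.TrimSpace(value))
	switch normalized {
	case "stdio", "http", "docker":
		return normalized, nil
	default:
		return "", fmt.Errorf("unsupported transport %q (expected stdio, http, or docker)", value)
	}
}

func main() {
	for _, v := range []string{"Docker", "http", "grpc"} {
		got, err := normalizeTransport(v)
		fmt.Printf("input=%q normalized=%q err=%v\n", v, got, err)
	}
}
```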

Copilot · 2026-04-15T23:51:33Z

+func TestMCPAddTransportFlagDescriptionUsesDockerCapitalization(t *testing.T) {
+	cmd := NewMCPAddSubcommand()
+	transportFlag := cmd.Flags().Lookup("transport")
+	if transportFlag == nil {
+		t.Fatal("expected --transport flag to exist")
+	}
+
+	if !strings.Contains(transportFlag.Usage, "Docker") {
+		t.Fatalf("expected --transport usage to include Docker, got: %s", transportFlag.Usage)
+	}


This test locks in "Docker" capitalization in the --transport usage string, but the actual accepted flag value is lowercase "docker" (and passing "Docker" currently fails validation). Consider updating the assertion to match the canonical/accepted input values (e.g., assert the usage mentions "docker"), or add coverage that the implementation accepts case-insensitive inputs if you want to display "Docker" in help output.
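
An assertion tied to the accepted values, rather than to display capitalization, might look like this sketch. `acceptedTransports` and `missingFromUsage` are hypothetical names; the check is a plain case-sensitive substring match, which is crude (for example, "http" would also match "https") but enough to illustrate the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// acceptedTransports mirrors the values the review comment says are validated.
var acceptedTransports = []string{"stdio", "http", "docker"}

// missingFromUsage is a hypothetical helper returning the accepted values
// that do not appear verbatim (case-sensitively) in the usage text.
func missingFromUsage(usage string, accepted []string) []string {
	var missing []string
	for _, v := range accepted {
		if !strings.Contains(usage, v) {
			missing = append(missing, v)
		}
	}
	return missing
}

func main() {
	capitalized := "Preferred transport type (stdio, http, Docker)"
	lowercase := "Preferred transport type (stdio, http, docker)"

	fmt.Println("capitalized help is missing:", missingFromUsage(capitalized, acceptedTransports))
	fmt.Println("lowercase help is missing:", missingFromUsage(lowercase, acceptedTransports))
}
```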

Copilot · 2026-04-15T23:51:33Z

 	cmd.Flags().BoolVar(&nonInteractiveFlag, "non-interactive", false, "Check secrets without prompting (display-only mode)")
-	cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex)")
+	cmd.Flags().StringVarP(&engineFlag, "engine", "e", "", "Check tokens for specific engine (copilot, claude, codex, custom)")
 	addRepoFlag(cmd)


The --engine flag help lists (copilot, claude, codex, custom) but the codebase supports additional engines (e.g., constants.EngineOptions includes "gemini"). Since secrets bootstrap uses constants.GetEngineOption(engineFilter), omitting "gemini" makes the help text incomplete/misleading for a supported engine. Consider including "gemini" in this list (and keeping it in sync with supported engines).
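
One way to keep the help text in sync is to build it from the engine list instead of hard-coding it. The sketch below is an assumption-laden illustration: `supportedEngines` is a stand-in slice (the real type and contents of constants.EngineOptions are not shown in this PR), and `engineFlagUsage` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedEngines is a stand-in for the project's real engine list; its
// contents here are assumed for illustration.
var supportedEngines = []string{"copilot", "claude", "codex", "gemini"}

// engineFlagUsage is a hypothetical helper that derives the --engine help
// text from the engine list, so the two cannot drift apart.
func engineFlagUsage(engines []string) string {
	return fmt.Sprintf("Check tokens for specific engine (%s)", strings.Join(engines, ", "))
}

func main() {
	fmt.Println(engineFlagUsage(supportedEngines))
}
```

The same generated list could feed the setup docs check, so that adding an engine only requires a change in one place.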

Initial plan

b676f68

Copilot AI assigned Copilot and pelikhan Apr 15, 2026

Copilot started work on behalf of pelikhan April 15, 2026 23:26

Copilot AI linked an issue Apr 15, 2026 that may be closed by this pull request

[cli-consistency] CLI Consistency Issues - 2026-04-15 #26446

Closed

Copilot AI and others added 3 commits April 15, 2026 23:33

chore: plan CLI consistency fixes

f9f1777

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/a10343a0-9495-4c9d-a3e7-b955ffde414c Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Revert "chore: plan CLI consistency fixes"

89793c0

This reverts commit f9f1777. Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

fix: align CLI help text and docs consistency

36970af

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/a10343a0-9495-4c9d-a3e7-b955ffde414c Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix CLI consistency issues in command help text~~ [cli-consistency] Normalize CLI help text and setup docs across run/upgrade/secrets/mcp-server Apr 15, 2026

Copilot AI requested a review from pelikhan April 15, 2026 23:37

Copilot finished work on behalf of pelikhan April 15, 2026 23:37

pelikhan reviewed Apr 15, 2026

pelikhan marked this pull request as ready for review April 15, 2026 23:46

Copilot AI review requested due to automatic review settings April 15, 2026 23:46

Copilot started work on behalf of pelikhan April 15, 2026 23:46

Copilot started reviewing on behalf of pelikhan April 15, 2026 23:47

github-actions Bot mentioned this pull request Apr 15, 2026

[aw] No-Op Runs #25214

Closed

Copilot AI and others added 2 commits April 15, 2026 23:50

chore: plan review comment updates

b028adc

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/91a00b5c-7527-4a62-ad3d-9019c58347c5 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Revert "chore: plan review comment updates"

bf8118b

This reverts commit b028adc. Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions Bot approved these changes Apr 15, 2026

Copilot AI reviewed Apr 15, 2026

fix: use gemini in secrets bootstrap engine list

4d63ba4

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/91a00b5c-7527-4a62-ad3d-9019c58347c5 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan April 15, 2026 23:54

Copilot AI requested a review from pelikhan April 15, 2026 23:54

pelikhan merged commit 48e626d into main Apr 15, 2026

pelikhan deleted the copilot/cli-consistency-issues-fix branch April 15, 2026 23:58

github-actions Bot mentioned this pull request Apr 16, 2026

Smoke Test: Claude - 24484587539 #26525

Closed


3 participants
