20 changes: 8 additions & 12 deletions .github/workflows/smoke-codex.lock.yml

Some generated files are not rendered by default.

22 changes: 13 additions & 9 deletions .github/workflows/smoke-codex.md
@@ -96,18 +96,22 @@ checkout:

## Output

Add a **very brief** comment (max 5-10 lines) to the current pull request with:
- PR titles only (no descriptions)
- ✅ or ❌ for each test result
- Overall status: PASS or FAIL
**Check `${{ github.event_name }}` to determine the correct output action:**

If all tests pass:
- Use the `add_labels` safe-output tool to add the label `smoke-codex` to the pull request
- Use the `remove_labels` safe-output tool to remove the label `smoke` from the pull request
- Use the `unassign_from_user` safe-output tool to unassign the user `githubactionagent` from the pull request (this is a fictitious user used for testing)
- **If `${{ github.event_name }}` is `pull_request`**: Use the `add_comment` safe-output tool to add a **very brief** comment (max 5-10 lines) to the triggering pull request, specifying `item_number: ${{ github.event.pull_request.number }}` (use this exact number — do NOT search GitHub for a PR):
  - PR titles only (no descriptions)
  - ✅ or ❌ for each test result
  - Overall status: PASS or FAIL

- **If `${{ github.event_name }}` is `schedule` or `workflow_dispatch`**: Call the `noop` safe-output tool with a brief summary of the test results (there is no pull request to comment on for scheduled or manually triggered runs).
Comment on lines +99 to +106

Copilot AI Mar 20, 2026


`github.event_name` is not currently evaluated/interpolated into the prompt (the runtime expression evaluator only exposes a limited `github.*` context, and `event_name` isn’t included). As written, the agent will likely see the literal string `${{ github.event_name }}` and won’t be able to reliably branch between `pull_request` and `schedule`/`workflow_dispatch`. Consider switching these checks to an expression that is resolvable (e.g., `${{ env.GH_AW_GITHUB_EVENT_NAME }}`, now that the lock workflow exports it), or include the event name in the rendered `<github-context>` section and instruct the agent to branch on that value.

See below for a potential fix:

**Check `${{ env.GH_AW_GITHUB_EVENT_NAME }}` (the GitHub event name) to determine the correct output action:**

- **If `${{ env.GH_AW_GITHUB_EVENT_NAME }}` is `pull_request`**: Use the `add_comment` safe-output tool to add a **very brief** comment (max 5-10 lines) to the triggering pull request, specifying `item_number: ${{ github.event.pull_request.number }}` (use this exact number — do NOT search GitHub for a PR):
  - PR titles only (no descriptions)
  - ✅ or ❌ for each test result
  - Overall status: PASS or FAIL

- **If `${{ env.GH_AW_GITHUB_EVENT_NAME }}` is `schedule` or `workflow_dispatch`**: Call the `noop` safe-output tool with a brief summary of the test results (there is no pull request to comment on for scheduled or manually triggered runs).

If all tests pass and `${{ env.GH_AW_GITHUB_EVENT_NAME }}` is `pull_request`:

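The concern above (an expression that is never interpolated reaches the agent as a literal string) can be checked mechanically. The following is a minimal sketch, not part of this PR: the function name `find_unresolved_expressions` and the sample prompt are hypothetical, and it assumes the rendered prompt is available as a plain string.

```python
import re

# Matches GitHub Actions style expressions such as ${{ github.event_name }}
# and captures the text between the braces, without surrounding whitespace.
EXPRESSION_RE = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")


def find_unresolved_expressions(rendered_prompt: str) -> list[str]:
    """Return the inner text of every ${{ ... }} expression still present.

    Hypothetical helper: a non-empty result means the agent would see a
    literal placeholder instead of a resolved value.
    """
    return [match.group(1) for match in EXPRESSION_RE.finditer(rendered_prompt)]


if __name__ == "__main__":
    # Hypothetical rendered prompt in which the PR number was interpolated
    # but the event name was not.
    prompt = (
        "Check `${{ github.event_name }}` to determine the correct output "
        "action. Use item_number: 123."
    )
    leftovers = find_unresolved_expressions(prompt)
    if leftovers:
        print("Unresolved expressions:", ", ".join(leftovers))
```

A check like this could run over the rendered prompt before the agent starts, so that a placeholder such as `github.event_name` is flagged early rather than silently breaking the branching instructions.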

If all tests pass and this workflow was triggered by a `pull_request` event:
- Use the `add_labels` safe-output tool to add the label `smoke-codex` to the pull request (use `item_number: ${{ github.event.pull_request.number }}`)
- Use the `remove_labels` safe-output tool to remove the label `smoke` from the pull request (use `item_number: ${{ github.event.pull_request.number }}`)
- Use the `unassign_from_user` safe-output tool to unassign the user `githubactionagent` from the pull request (this is a fictitious user used for testing; use `item_number: ${{ github.event.pull_request.number }}`)
- Use the `add_smoked_label` safe-output action tool to add the label `smoked` to the pull request (call it with `{"labels": "smoked", "number": "${{ github.event.pull_request.number }}"}`)

**Important**: If no action is needed after completing your analysis, you **MUST** call the `noop` safe-output tool with a brief explanation. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.
**Important**: You **MUST** always call exactly one safe-output tool. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.

Copilot AI Mar 20, 2026


The Output section asks for multiple safe-output actions on success (`add_comment` + `add_labels`/`remove_labels`/`unassign_from_user`/`add_smoked_label`), but then states: “You MUST always call exactly one safe-output tool.” This is contradictory and will cause the agent to choose only one action (or skip label cleanup) to satisfy the “exactly one” rule. Align this with the established safe-outputs convention in other workflows (e.g., `smoke-copilot.md:174`) by requiring at least one safe-output tool call, and using `noop` only when no other actions are needed.

Suggested change
**Important**: You **MUST** always call exactly one safe-output tool. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.
**Important**: You **MUST** always call at least one safe-output tool, and you may call multiple safe-output tools in a single response when instructed (for example, `add_comment` plus label updates). Use `noop` only when no other safe-output actions are needed. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.

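To make the "at least one, `noop` only when nothing else applies" rule concrete, here is a small sketch of the intended decision logic. It is an illustration only: the tool names come from the prompt in this diff, but the functions `plan_safe_outputs` and `check_safe_outputs` and the dictionary shape of each call are assumptions, not the real safe-outputs API.

```python
from typing import Optional


def plan_safe_outputs(
    event_name: str,
    pr_number: Optional[int],
    all_passed: bool,
    summary: str,
) -> list[dict]:
    """Return the safe-output calls the prompt asks for (hypothetical shape)."""
    # Scheduled or manually triggered runs have no pull request to act on.
    if event_name != "pull_request" or pr_number is None:
        return [{"tool": "noop", "message": summary}]

    # Pull request runs always get the brief comment.
    calls = [{"tool": "add_comment", "item_number": pr_number, "body": summary}]
    if all_passed:
        # Label and assignee updates only happen when every test passed.
        calls += [
            {"tool": "add_labels", "item_number": pr_number, "labels": ["smoke-codex"]},
            {"tool": "remove_labels", "item_number": pr_number, "labels": ["smoke"]},
            {"tool": "unassign_from_user", "item_number": pr_number,
             "assignee": "githubactionagent"},
            {"tool": "add_smoked_label", "labels": "smoked", "number": str(pr_number)},
        ]
    return calls


def check_safe_outputs(calls: list[dict]) -> None:
    """Raise ValueError if the plan breaks the 'at least one' convention."""
    if not calls:
        raise ValueError("at least one safe-output call is required")
    names = [call["tool"] for call in calls]
    if "noop" in names and len(names) > 1:
        raise ValueError("noop should be used only when no other action is needed")


if __name__ == "__main__":
    plan = plan_safe_outputs("pull_request", 42, True, "Overall status: PASS")
    check_safe_outputs(plan)
    print([call["tool"] for call in plan])
```

Under an "exactly one" rule the successful pull request plan above would be invalid, which is the contradiction the review comment points out.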

```json
{"noop": {"message": "No action needed: [brief explanation of what was analyzed and why]"}}
```