Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions .github/workflows/agentics-maintenance.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,13 +23,13 @@
#
# This file defines the generated agentic maintenance workflow for this repository.
# It runs scheduled cleanup for expiring safe outputs and supports manual maintenance operations.
#
#
# This workflow is generated automatically when workflows use expiring safe outputs
# or when repository maintenance features are enabled in .github/workflows/aw.json.
#
#
# To disable maintenance workflow generation, set in .github/workflows/aw.json:
# {"maintenance": false}
#
#
# Agentic maintenance docs:
# https://github.github.com/gh-aw/reference/ephemerals/#manual-maintenance-operations
#
Expand Down
30 changes: 15 additions & 15 deletions .github/workflows/issue-monster.lock.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions .github/workflows/issue-monster.md
Original file line number Diff line number Diff line change
Expand Up @@ -430,6 +430,7 @@ safe-outputs:
max: 3
target: "*" # Requires explicit issue_number in agent output
allowed: [copilot] # Only allow copilot agent
ignore-if-error: true # Don't fail the workflow if copilot is temporarily unavailable

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore-if-error: true permanently silences ALL availability errors — including permanent misconfiguration — not just transient ones.

💡 Details

The inline comment reads "temporarily unavailable", but the flag has no retry budget, expiry, or transience check. Any of these permanent conditions will be silently swallowed on every 30-minute schedule run forever:

  • Copilot never enabled for this repo/org
  • Billing lapse disabling the Copilot plan
  • Token lacking agent-tasks:write (if that check happens at assignment time)

Operators will only see ⏭️ Skipped in workflow summaries with no actionable signal that assignments have been broken indefinitely. The original behaviour (failure issue creation) was noisy, but this pendulum swings all the way to silent — there is no middle ground here.

Consider pairing this with a periodic health-check step (e.g. a separate job that calls getAvailableAgentLogins and creates a single de-duplicated alert issue when the result is empty for N consecutive runs), or at minimum document the operational consequence more explicitly.

add-comment:
max: 3
target: "*"
Expand Down
10 changes: 6 additions & 4 deletions actions/setup/js/assign_to_agent.cjs
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ async function main(config = {}) {
if (configuredBaseBranch) core.info(`Configured base branch: ${configuredBaseBranch}`);
core.info(`Target configuration: ${targetConfig}`);
core.info(`Max count: ${maxCount}`);
if (ignoreIfError) core.info("Ignore-if-error mode enabled: Will not fail if agent assignment encounters auth errors");
if (ignoreIfError) core.info("Ignore-if-error mode enabled: Will not fail if agent assignment encounters auth or availability errors");
if (allowedAgents) core.info(`Allowed agents: ${allowedAgents.join(", ")}`);
core.info(`Default target repo: ${defaultTargetRepo}`);
if (allowedRepos.size > 0) core.info(`Allowed repos: ${[...allowedRepos].join(", ")}`);
Expand Down Expand Up @@ -393,15 +393,17 @@ async function main(config = {}) {
let errorMessage = getErrorMessage(error);

const isAuthError = ["Bad credentials", "Not Authenticated", "Resource not accessible", "Insufficient permissions", "requires authentication"].some(msg => errorMessage.includes(msg));
const isAvailabilityError = errorMessage.includes("coding agent is not available for this repository");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] The string "coding agent is not available for this repository" appears in two places in this file: here as a matcher (line 396) and at line ~335 as the synthesised error message. If one drifts, the availability guard silently stops firing.

💡 Suggested fix: extract to a module-level constant
// near top of file, alongside other constants
const AGENT_UNAVAILABLE_ERROR_FRAGMENT = "coding agent is not available for this repository";

// line ~335 (throw site)
if (!agentId) {
  throw new Error(`${agentName} ${AGENT_UNAVAILABLE_ERROR_FRAGMENT}`);
}

// line 396 (guard site)
const isAvailabilityError = errorMessage.includes(AGENT_UNAVAILABLE_ERROR_FRAGMENT);

This removes the implicit string contract between the two sites and makes future edits safe.


if (ignoreIfError && isAuthError) {
core.warning(`Agent assignment failed for ${agentName} on ${type} #${number} due to authentication/permission error. Skipping due to ignore-if-error=true.`);
if (ignoreIfError && (isAuthError || isAvailabilityError)) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graceful-skip path still emits core.error(...) from findAgent, creating misleading error-level workflow annotations.

💡 Details and suggested fix

When findAgent returns null because checkUserCanBeAssigned throws 404, it logs:

core.error("Failed to find copilot agent: Not Found")

before returning null. Control only reaches the new ignoreIfError && isAvailabilityError guard here after that error-level log has already been emitted.

Operators reviewing workflow runs will see a red error annotation even though the operation was intentionally skipped — the opposite of "graceful". If any monitoring alerts on core.error output, this will fire on every scheduled issue-monster run when Copilot is unavailable.

The cleanest fix is to downgrade the core.error call in findAgent (in assign_agent_helpers.cjs) to core.debug or core.warning for non-auth errors, and let assign_to_agent.cjs decide how to escalate based on ignoreIfError. Alternatively, pass an options flag to findAgent to suppress the error log when the caller knows it will handle the null gracefully.

const errorType = isAuthError ? "authentication/permission" : "agent availability";
core.warning(`Agent assignment failed for ${agentName} on ${type} #${number} due to ${errorType} error. Skipping due to ignore-if-error=true.`);
core.info(`Error details: ${errorMessage}`);
_allResults.push({ issue_number: issueNumber, pull_number: pullNumber, agent: agentName, owner: effectiveOwner, repo: effectiveRepo, pull_request_repo: effectivePullRequestRepoSlug, success: true, skipped: true });
return { success: true, skipped: true };
}

if (errorMessage.includes("coding agent is not available for this repository")) {
if (isAvailabilityError) {
try {
const available = await getAvailableAgentLogins(effectiveOwner, effectiveRepo, githubClient);
if (available.length > 0) errorMessage += ` (available agents: ${available.join(", ")})`;
Expand Down
41 changes: 37 additions & 4 deletions actions/setup/js/assign_to_agent.test.cjs
Original file line number Diff line number Diff line change
Expand Up @@ -1032,7 +1032,7 @@ describe("assign_to_agent", () => {
await eval(`(async () => { ${assignToAgentScript}; ${STANDALONE_RUNNER} })()`);

// Should log that ignore-if-error is enabled
expect(mockCore.info).toHaveBeenCalledWith(expect.stringContaining("Ignore-if-error mode enabled: Will not fail if agent assignment encounters auth errors"));
expect(mockCore.info).toHaveBeenCalledWith(expect.stringContaining("Ignore-if-error mode enabled: Will not fail if agent assignment encounters auth or availability errors"));

// Should warn about skipping but not fail
expect(mockCore.warning).toHaveBeenCalledWith(expect.stringContaining("Agent assignment failed"));
Expand Down Expand Up @@ -1109,8 +1109,41 @@ describe("assign_to_agent", () => {
expect(mockCore.setFailed).not.toHaveBeenCalled();
});

it("should skip assignment and not fail when ignore-if-error is true and agent is not available for the repository", async () => {
process.env.GH_AW_AGENT_IGNORE_IF_ERROR = "true";
setAgentOutput({
items: [
{
type: "assign_to_agent",
issue_number: 42,
agent: "copilot",
},
],
errors: [],
});

// Simulate agent not available (checkUserCanBeAssigned returns 404)
const notFoundError = new Error("Not Found");
notFoundError.status = 404;
mockGithub.rest.issues.checkUserCanBeAssigned.mockRejectedValueOnce(notFoundError);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The comment says "Simulate agent not available (checkUserCanBeAssigned returns 404)" but the actual isAvailabilityError trigger is the new Error(\${agentName} coding agent is not available...`)thrown bymainafterfindAgent` returns null. The 404 is correct as the upstream cause, but the indirection makes the test harder to follow.

💡 Suggested comment improvement
// Simulate agent not available:
// checkUserCanBeAssigned returns 404 → findAgent returns null
// → main throws "copilot coding agent is not available for this repository"
// → isAvailabilityError = true → guard fires
const notFoundError = new Error("Not Found");
notFoundError.status = 404;

Also note: if the thrown error message at line ~335 is ever changed, this test would still pass (because it mocks at the API layer), masking the regression. A complementary unit test for findAgent returning null → correct error message could provide an extra safety net.

// getAvailableAgentLogins also calls checkUserCanBeAssigned - return 404 for that too
mockGithub.rest.issues.checkUserCanBeAssigned.mockRejectedValue(notFoundError);

await eval(`(async () => { ${assignToAgentScript}; ${STANDALONE_RUNNER} })()`);

// Should warn about skipping due to agent availability error, but not fail
expect(mockCore.warning).toHaveBeenCalledWith(expect.stringContaining("agent availability"));
expect(mockCore.warning).toHaveBeenCalledWith(expect.stringContaining("ignore-if-error=true"));
expect(mockCore.setFailed).not.toHaveBeenCalled();

// Summary should show skipped assignments
expect(mockCore.summary.addRaw).toHaveBeenCalled();
const summaryCall = mockCore.summary.addRaw.mock.calls[0][0];
expect(summaryCall).toContain("⏭️ Skipped");
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The new test verifies that availability errors are silently skipped when ignoreIfError=true. The complementary path — availability error without ignoreIfError set — has no dedicated test. The existing failure test (line ~1145) uses IGNORE_IF_MISSING and a "Network timeout" error, so the enrichment branch at lines 406–413 (listing available agents in the error message) is also untested.

💡 Suggested additional test case
it("should fail and enrich error message when ignoreIfError is false and agent is unavailable", async () => {
  // GH_AW_AGENT_IGNORE_IF_ERROR is NOT set
  setAgentOutput({ items: [{ type: "assign_to_agent", issue_number: 42, agent: "copilot" }], errors: [] });

  // findAgent returns null → main throws availability error
  const notFoundError = new Error("Not Found");
  notFoundError.status = 404;
  mockGithub.rest.issues.checkUserCanBeAssigned.mockRejectedValue(notFoundError);

  await eval(`(async () => { ${assignToAgentScript}; ${STANDALONE_RUNNER} })()`);

  expect(mockCore.error).toHaveBeenCalledWith(expect.stringContaining("Failed to assign agent"));
  expect(mockCore.setFailed).toHaveBeenCalled();
});


it("should still fail on non-auth errors even with ignore-if-error enabled", async () => {
process.env.GH_AW_AGENT_IGNORE_IF_MISSING = "true";
process.env.GH_AW_AGENT_IGNORE_IF_ERROR = "true";
setAgentOutput({
Comment thread
Copilot marked this conversation as resolved.
items: [
{
Expand All @@ -1122,9 +1155,9 @@ describe("assign_to_agent", () => {
errors: [],
});

// Simulate a different error (not auth-related)
// Simulate a different error (not auth-related) during assignment
const otherError = new Error("Network timeout");
mockGithub.rest.issues.checkUserCanBeAssigned.mockRejectedValue(otherError);
mockGithub.request.mockRejectedValue(otherError);

await eval(`(async () => { ${assignToAgentScript}; ${STANDALONE_RUNNER} })()`);

Expand Down
Loading