diff --git a/.agents/skills/grill-with-docs/ADR-FORMAT.md b/.agents/skills/grill-with-docs/ADR-FORMAT.md index da7e78e..dc90176 100644 --- a/.agents/skills/grill-with-docs/ADR-FORMAT.md +++ b/.agents/skills/grill-with-docs/ADR-FORMAT.md @@ -12,7 +12,7 @@ Create the `docs/adr/` directory lazily — only when the first ADR is needed. {1-3 sentences: what's the context, what did we decide, and why.} ``` -That's it. An ADR can be a single paragraph. The value is in recording *that* a decision was made and *why* — not in filling out sections. +That's it. An ADR can be a single paragraph. The value is in recording _that_ a decision was made and _why_ — not in filling out sections. ## Optional sections diff --git a/.agents/skills/improve-codebase-architecture/DEEPENING.md b/.agents/skills/improve-codebase-architecture/DEEPENING.md index ecaf5d7..c52fdfd 100644 --- a/.agents/skills/improve-codebase-architecture/DEEPENING.md +++ b/.agents/skills/improve-codebase-architecture/DEEPENING.md @@ -18,7 +18,7 @@ Dependencies that have local test stand-ins (PGLite for Postgres, in-memory file Your own services across a network boundary (microservices, internal APIs). Define a **port** (interface) at the seam. The deep module owns the logic; the transport is injected as an **adapter**. Tests use an in-memory adapter. Production uses an HTTP/gRPC/queue adapter. -Recommendation shape: *"Define a port at the seam, implement an HTTP adapter for production and an in-memory adapter for testing, so the logic sits in one deep module even though it's deployed across a network."* +Recommendation shape: _"Define a port at the seam, implement an HTTP adapter for production and an in-memory adapter for testing, so the logic sits in one deep module even though it's deployed across a network."_ ### 4. True external (Mock) diff --git a/.agents/skills/improve-codebase-architecture/LANGUAGE.md b/.agents/skills/improve-codebase-architecture/LANGUAGE.md index 530c276..dd9b60f 100644 --- a/.agents/skills/improve-codebase-architecture/LANGUAGE.md +++ b/.agents/skills/improve-codebase-architecture/LANGUAGE.md @@ -19,11 +19,11 @@ What's inside a module — its body of code. Distinct from **Adapter**: a thing Leverage at the interface — the amount of behaviour a caller (or test) can exercise per unit of interface they have to learn. A module is **deep** when a large amount of behaviour sits behind a small interface. A module is **shallow** when the interface is nearly as complex as the implementation. **Seam** _(from Michael Feathers)_ -A place where you can alter behaviour without editing in that place. The *location* at which a module's interface lives. Choosing where to put the seam is its own design decision, distinct from what goes behind it. +A place where you can alter behaviour without editing in that place. The _location_ at which a module's interface lives. Choosing where to put the seam is its own design decision, distinct from what goes behind it. _Avoid_: boundary (overloaded with DDD's bounded context). **Adapter** -A concrete thing that satisfies an interface at a seam. Describes *role* (what slot it fills), not substance (what's inside). +A concrete thing that satisfies an interface at a seam. Describes _role_ (what slot it fills), not substance (what's inside). **Leverage** What callers get from depth. More capability per unit of interface they have to learn. One implementation pays back across N call sites and M tests. @@ -35,7 +35,7 @@ What maintainers get from depth. Change, bugs, knowledge, and verification conce - **Depth is a property of the interface, not the implementation.** A deep module can be internally composed of small, mockable, swappable parts — they just aren't part of the interface. A module can have **internal seams** (private to its implementation, used by its own tests) as well as the **external seam** at its interface. - **The deletion test.** Imagine deleting the module. If complexity vanishes, the module wasn't hiding anything (it was a pass-through). If complexity reappears across N callers, the module was earning its keep. -- **The interface is the test surface.** Callers and tests cross the same seam. If you want to test *past* the interface, the module is probably the wrong shape. +- **The interface is the test surface.** Callers and tests cross the same seam. If you want to test _past_ the interface, the module is probably the wrong shape. - **One adapter means a hypothetical seam. Two adapters means a real one.** Don't introduce a seam unless something actually varies across it. ## Relationships diff --git a/.agents/skills/tdd/mocking.md b/.agents/skills/tdd/mocking.md index 71cbfee..9c1a403 100644 --- a/.agents/skills/tdd/mocking.md +++ b/.agents/skills/tdd/mocking.md @@ -53,6 +53,7 @@ const api = { ``` The SDK approach means: + - Each mock returns one specific shape - No conditional logic in test setup - Easier to see which endpoints a test exercises diff --git a/.agents/skills/tdd/tests.md b/.agents/skills/tdd/tests.md index ff22f80..eb781ed 100644 --- a/.agents/skills/tdd/tests.md +++ b/.agents/skills/tdd/tests.md @@ -6,11 +6,11 @@ ```typescript // GOOD: Tests observable behavior -test("user can checkout with valid cart", async () => { +test('user can checkout with valid cart', async () => { const cart = createCart(); cart.add(product); const result = await checkout(cart, paymentMethod); - expect(result.status).toBe("confirmed"); + expect(result.status).toBe('confirmed'); }); ``` @@ -28,7 +28,7 @@ Characteristics: ```typescript // BAD: Tests implementation details -test("checkout calls paymentService.process", async () => { +test('checkout calls paymentService.process', async () => { const mockPayment = jest.mock(paymentService); await checkout(cart, payment); expect(mockPayment.process).toHaveBeenCalledWith(cart.total); @@ -46,16 +46,16 @@ Red flags: ```typescript // BAD: Bypasses interface to verify -test("createUser saves to database", async () => { - await createUser({ name: "Alice" }); - const row = await db.query("SELECT * FROM users WHERE name = ?", ["Alice"]); +test('createUser saves to database', async () => { + await createUser({ name: 'Alice' }); + const row = await db.query('SELECT * FROM users WHERE name = ?', ['Alice']); expect(row).toBeDefined(); }); // GOOD: Verifies through interface -test("createUser makes user retrievable", async () => { - const user = await createUser({ name: "Alice" }); +test('createUser makes user retrievable', async () => { + const user = await createUser({ name: 'Alice' }); const retrieved = await getUser(user.id); - expect(retrieved.name).toBe("Alice"); + expect(retrieved.name).toBe('Alice'); }); ``` diff --git a/.agents/skills/triage/AGENT-BRIEF.md b/.agents/skills/triage/AGENT-BRIEF.md index 2efecdf..780a2d8 100644 --- a/.agents/skills/triage/AGENT-BRIEF.md +++ b/.agents/skills/triage/AGENT-BRIEF.md @@ -51,16 +51,19 @@ Describe what should happen after the agent's work is complete. Be specific about edge cases and error conditions. **Key interfaces:** + - `TypeName` — what needs to change and why - `functionName()` return type — what it currently returns vs what it should return - Config shape — any new configuration options needed **Acceptance criteria:** + - [ ] Specific, testable criterion 1 - [ ] Specific, testable criterion 2 - [ ] Specific, testable criterion 3 **Out of scope:** + - Thing that should NOT be changed or addressed in this issue - Adjacent feature that might seem related but is separate ``` @@ -85,12 +88,14 @@ Truncation should break at the last word boundary before 1024 characters and append "..." to indicate truncation. **Key interfaces:** + - The `SkillMetadata` type's `description` field — no type change needed, but the validation/processing logic that populates it needs to respect word boundaries - Any function that reads SKILL.md frontmatter and extracts the description **Acceptance criteria:** + - [ ] Descriptions under 1024 chars are unchanged - [ ] Descriptions over 1024 chars are truncated at the last word boundary before 1024 chars @@ -98,6 +103,7 @@ and append "..." to indicate truncation. - [ ] The total length including "..." does not exceed 1024 chars **Out of scope:** + - Changing the 1024 char limit itself - Multi-line description support ``` @@ -123,6 +129,7 @@ requested the feature. When triaging new issues, these files should be checked for matches. **Key interfaces:** + - Markdown file format in `.out-of-scope/` — each file should have a `# Concept Name` heading, a `**Decision:**` line, a `**Reason:**` line, and a `**Prior requests:**` list with issue links @@ -130,6 +137,7 @@ checked for matches. and match incoming issues against them by concept similarity **Acceptance criteria:** + - [ ] Closing a feature as wontfix creates/updates a file in `.out-of-scope/` - [ ] The file includes the decision, reasoning, and link to the closed issue - [ ] If a matching `.out-of-scope/` file already exists, the new issue is @@ -138,6 +146,7 @@ checked for matches. when a new issue matches a prior rejection **Out of scope:** + - Automated matching (human confirms the match) - Reopening previously rejected features - Bug reports (only enhancement rejections go to `.out-of-scope/`) @@ -155,11 +164,13 @@ The triage thing is broken. Look at the main file and fix it. The function around line 150 has the issue. **Files to change:** + - src/triage/handler.ts (line 150) - src/types.ts (line 42) ``` This is bad because: + - No category - Vague description ("the triage thing is broken") - References file paths and line numbers that will go stale diff --git a/.agents/skills/triage/OUT-OF-SCOPE.md b/.agents/skills/triage/OUT-OF-SCOPE.md index cc8ea25..df6aaa4 100644 --- a/.agents/skills/triage/OUT-OF-SCOPE.md +++ b/.agents/skills/triage/OUT-OF-SCOPE.md @@ -20,7 +20,7 @@ One file per **concept**, not per issue. Multiple issues requesting the same thi The file should be written in a relaxed, readable style — more like a short design document than a database entry. Use paragraphs, code samples, and examples to make the reasoning clear and useful to someone encountering it for the first time. -```markdown +````markdown # Dark Mode This project does not support dark mode or user-facing theming. @@ -45,13 +45,13 @@ interface ThemeConfig { fonts: FontStack; } ``` +```` ## Prior requests - #42 — "Add dark mode support" - #87 — "Night theme for accessibility" - #134 — "Dark theme option" -``` ### Naming the file diff --git a/.sandcastle/lib/afkIdentity.ts b/.sandcastle/lib/afkIdentity.ts new file mode 100644 index 0000000..493f45e --- /dev/null +++ b/.sandcastle/lib/afkIdentity.ts @@ -0,0 +1,149 @@ +import { invariant } from '../../src/util/assert.js'; + +// ============================================================================= +// Issue numbers +// ============================================================================= + +export function assertIssueNumber(value: unknown): number { + invariant( + typeof value === 'number' && + Number.isFinite(value) && + Number.isInteger(value) && + value > 0, + 'issue number must be a positive integer', + ); + + return value; +} + +// ============================================================================= +// Run IDs +// ============================================================================= + +/** Compact UTC run ID: YYYYMMDDTHHMMSSZ. */ +export const RUN_ID_PATTERN = /^\d{8}T\d{6}Z$/u; + +export function assertRunId(value: unknown): string { + invariant( + typeof value === 'string' && RUN_ID_PATTERN.test(value), + 'run ID must use compact UTC format YYYYMMDDTHHMMSSZ', + ); + + return value; +} + +export function createRunId(date = new Date()): string { + const runId = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad( + date.getUTCDate(), + )}T${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad( + date.getUTCSeconds(), + )}Z`; + + return assertRunId(runId); +} + +function pad(value: number): string { + return String(value).padStart(2, '0'); +} + +// ============================================================================= +// Branch and Coder workspace names +// ============================================================================= + +const BRANCH_NAME_PATTERN = /^afk-triage\/\d+-\d{8}T\d{6}Z$/u; +const WORKSPACE_NAME_PATTERN = /^agent-tty-triage-\d+$/u; + +export function branchNameForIssue(issueNumber: number, runId: string): string { + const checkedIssueNumber = assertIssueNumber(issueNumber); + const checkedRunId = assertRunId(runId); + const branchName = `afk-triage/${checkedIssueNumber}-${checkedRunId}`; + + invariant( + BRANCH_NAME_PATTERN.test(branchName), + 'branch name must match the AFK triage naming convention', + ); + + return branchName; +} + +export function workspaceNameForIssue(issueNumber: number): string { + const checkedIssueNumber = assertIssueNumber(issueNumber); + const workspaceName = `agent-tty-triage-${checkedIssueNumber}`; + + invariant( + WORKSPACE_NAME_PATTERN.test(workspaceName), + 'workspace name must match the AFK triage naming convention', + ); + + return workspaceName; +} + +// ============================================================================= +// Triage states +// ============================================================================= + +export const TRIAGE_STATES = [ + 'needs-info', + 'ready-for-agent', + 'ready-for-human', + 'wontfix', + 'needs-triage', +] as const; + +export type TriageState = (typeof TRIAGE_STATES)[number]; + +export function isTriageState(value: string): value is TriageState { + return TRIAGE_STATES.includes(value as TriageState); +} + +// ============================================================================= +// AFK markers +// ============================================================================= + +const MARKER_PATTERN = + //u; + +export interface AfkMarker { + readonly issue: number; + readonly outcome: TriageState; + readonly run: string; +} + +export function formatAfkMarker(input: AfkMarker): string { + const issue = assertIssueNumber(input.issue); + invariant(isTriageState(input.outcome), 'AFK marker outcome is invalid'); + const run = assertRunId(input.run); + + return ``; +} + +export function parseAfkMarker(comment: string): AfkMarker | null { + const match = MARKER_PATTERN.exec(comment); + if (match === null) { + return null; + } + + const [, issueRaw, outcomeRaw, runRaw] = match; + invariant(issueRaw !== undefined, 'AFK marker issue capture must exist'); + invariant(outcomeRaw !== undefined, 'AFK marker outcome capture must exist'); + invariant(runRaw !== undefined, 'AFK marker run capture must exist'); + + const issue = Number(issueRaw); + if (!Number.isInteger(issue) || issue <= 0) { + return null; + } + + if (!isTriageState(outcomeRaw)) { + return null; + } + + if (!RUN_ID_PATTERN.test(runRaw)) { + return null; + } + + return { + issue, + outcome: outcomeRaw, + run: runRaw, + }; +} diff --git a/.sandcastle/lib/afkMarker.ts b/.sandcastle/lib/afkMarker.ts deleted file mode 100644 index fd9024a..0000000 --- a/.sandcastle/lib/afkMarker.ts +++ /dev/null @@ -1,79 +0,0 @@ -import { invariant } from '../../src/util/assert.js'; -import { assertRunId } from './branchName.js'; -import { assertIssueNumber } from './issueNumber.js'; - -export type TriageState = - | 'needs-info' - | 'ready-for-agent' - | 'ready-for-human' - | 'wontfix' - | 'needs-triage'; - -export const TRIAGE_STATES = new Set([ - 'needs-info', - 'ready-for-agent', - 'ready-for-human', - 'wontfix', - 'needs-triage', -]); - -const MARKER_PATTERN = - //u; - -export interface AfkMarker { - readonly issue: number; - readonly outcome: TriageState; - readonly run: string; -} - -export function isTriageState(value: string): value is TriageState { - return TRIAGE_STATES.has(value as TriageState); -} - -export function formatAfkMarker(input: AfkMarker): string { - const issue = assertIssueNumber(input.issue); - invariant(isTriageState(input.outcome), 'AFK marker outcome is invalid'); - const run = assertRunId(input.run); - - return ``; -} - -export function parseAfkMarker(comment: string): AfkMarker | null { - const match = MARKER_PATTERN.exec(comment); - if (match === null) { - return null; - } - - const [, issueRaw, outcomeRaw, runRaw] = match; - invariant(issueRaw !== undefined, 'AFK marker issue capture must exist'); - invariant(outcomeRaw !== undefined, 'AFK marker outcome capture must exist'); - invariant(runRaw !== undefined, 'AFK marker run capture must exist'); - - const issue = Number(issueRaw); - if (!Number.isInteger(issue) || issue <= 0) { - return null; - } - - if (!isTriageState(outcomeRaw)) { - return null; - } - - if (!isValidRunId(runRaw)) { - return null; - } - - return { - issue, - outcome: outcomeRaw, - run: runRaw, - }; -} - -function isValidRunId(value: string): boolean { - try { - assertRunId(value); - return true; - } catch { - return false; - } -} diff --git a/.sandcastle/lib/batchRunner.ts b/.sandcastle/lib/batchRunner.ts new file mode 100644 index 0000000..1a294ba --- /dev/null +++ b/.sandcastle/lib/batchRunner.ts @@ -0,0 +1,275 @@ +import { invariant } from '../../src/util/assert.js'; +import { branchNameForIssue, workspaceNameForIssue } from './afkIdentity.js'; +import { + createCoderProvisioner, + type CoderProvisioner, + type ProvisionedAgent, +} from './coderProvisioner.js'; +import type { TriageIssue } from './eligibility.js'; +import { conciseErrorMessage, isLockError } from './errorMessage.js'; +import { pLimit } from './pLimit.js'; + +export type TriageIssueStatus = 'success' | 'locked' | 'failed' | 'skipped'; + +export interface TriageIssueSummary { + readonly issueNumber: number; + readonly status: TriageIssueStatus; + readonly message?: string; +} + +export interface BatchRunnerDeps { + readonly coderProvisioner?: CoderProvisioner; + readonly errorLogger?: (...args: readonly unknown[]) => void; +} + +export interface RunBatchOptions { + readonly runId: string; + readonly parallelism: number; +} + +export interface BatchRunnerStatus { + readonly active: number; + readonly pending: number; +} + +export interface TriageBatchRunner { + run( + eligibleIssues: readonly TriageIssue[], + options: RunBatchOptions, + ): Promise; + + /** + * Idempotent. Sets the shutdown flag synchronously so newly dequeued tasks + * stop spawning new Coder workspaces, then closes active sandboxes and + * reaps any in-flight workspace creates. + */ + requestShutdown(): Promise; + + readonly status: BatchRunnerStatus; +} + +export function createTriageBatchRunner( + deps: BatchRunnerDeps = {}, +): TriageBatchRunner { + const provisioner = deps.coderProvisioner ?? createCoderProvisioner(); + const errorLogger = + deps.errorLogger ?? + ((...args: readonly unknown[]) => console.error(...args)); + + const activeAgents = new Map(); + const pendingWorkspaceNames = new Map(); + + // Set synchronously before cleanup awaits so newly dequeued tasks do not + // create workspaces outside the cleanup snapshot. + let shutdownRequested = false; + + async function runOne( + issue: TriageIssue, + runId: string, + ): Promise { + let workspaceName: string | undefined; + let agent: ProvisionedAgent | undefined; + let result: TriageIssueSummary | undefined; + + if (shutdownRequested) { + return { + issueNumber: issue.number, + status: 'skipped', + message: 'shutdown requested', + }; + } + + try { + workspaceName = workspaceNameForIssue(issue.number); + const branchName = branchNameForIssue(issue.number, runId); + + pendingWorkspaceNames.set(workspaceName, issue.number); + try { + agent = await provisioner.provision({ + issue, + runId, + workspaceName, + branchName, + }); + } finally { + pendingWorkspaceNames.delete(workspaceName); + } + + // Re-check shutdownRequested after provision() resolves. SIGINT can + // arrive while provisioning a Coder workspace (which takes minutes); + // requestShutdown() runs synchronously, snapshots + // pendingWorkspaceNames (which still includes our entry — pending is + // populated before the await above), and dispatches + // deleteWorkspace() against our workspace name. Without this + // re-check, the just-resolved agent would land in the cleared + // activeAgents map, agent.run() would race against the workspace + // delete, and a graceful skip would be misclassified as 'failed'. + // + // Do NOT call agent.close() here: the workspace is already in the + // queue requestShutdown owns. Calling close() in parallel would race + // the dispatched deleteWorkspace and cause a spurious "workspace may + // be stranded" log line in the second-to-finish call. Returning + // 'skipped' without close lets requestShutdown's reap path log the + // single, correct outcome. + // + // oxlint cannot see across the await boundary that shutdownRequested + // is mutated by requestShutdown, so it narrows it to `false` after + // the entry-time check and flags this as `no-unnecessary-condition`. + // The check is semantically required (without it, DEREM-3 + // reintroduces the race); suppress the warning rather than delete it. + // oxlint-disable-next-line typescript/no-unnecessary-condition + if (shutdownRequested) { + return { + issueNumber: issue.number, + status: 'skipped', + message: 'shutdown requested', + }; + } + + activeAgents.set(agent, issue.number); + + await agent.run(); + + result = { + issueNumber: issue.number, + status: 'success', + }; + } catch (error) { + errorLogger(`[issue ${issue.number}]`, error); + result = { + issueNumber: issue.number, + status: + workspaceName !== undefined && isLockError(error, workspaceName) + ? 'locked' + : 'failed', + message: conciseErrorMessage(error), + }; + } finally { + // Skip close when the agent was already cleaned up by the signal + // handler (requestShutdown). Otherwise we would call close() twice on + // the same Coder workspace; the second call fails because the + // workspace is gone, the catch fires, and a successful triage gets + // silently overwritten with a misleading `close failed: workspace + // not found` message. Membership in `activeAgents` is the + // single-owner protocol that disambiguates "still ours to close" + // from "signal handler took it". + if (agent !== undefined && activeAgents.has(agent)) { + activeAgents.delete(agent); + try { + await agent.close(); + } catch (closeError) { + errorLogger(`[issue ${issue.number}] close failed`, closeError); + // Agent is in activeAgents only if provision() resolved, which + // means runOne already reached either the success-result + // assignment or the catch block (both of which assign result). + // Assert rather than branch. + invariant( + result !== undefined, + 'result must be defined when closing an active agent', + ); + const original = + result.message ?? `original status: ${result.status}`; + result = { + issueNumber: issue.number, + status: 'failed', + message: `${original}; close failed: ${conciseErrorMessage(closeError)}`, + }; + } + } + } + + return result; + } + + async function run( + eligibleIssues: readonly TriageIssue[], + options: RunBatchOptions, + ): Promise { + const limit = pLimit(options.parallelism); + return Promise.all( + eligibleIssues.map((issue) => limit(() => runOne(issue, options.runId))), + ); + } + + async function requestShutdown(): Promise { + shutdownRequested = true; + + const agentEntries = Array.from(activeAgents.entries()); + const pendingEntries = Array.from(pendingWorkspaceNames.entries()); + activeAgents.clear(); + pendingWorkspaceNames.clear(); + + const closeResults = await Promise.allSettled( + agentEntries.map(async ([agent]) => { + await agent.close(); + }), + ); + + for (const [index, settled] of closeResults.entries()) { + const entry = agentEntries[index]; + if (entry === undefined) { + continue; + } + const [, issueNumber] = entry; + if (settled.status === 'fulfilled') { + errorLogger(`[issue ${issueNumber}] sandbox closed during shutdown`); + } else { + errorLogger( + `[issue ${issueNumber}] sandbox close failed during shutdown — workspace may be stranded:`, + settled.reason, + ); + } + } + + // Reap any workspaces whose `provision()` registered them as in-flight + // but never resolved a ProvisionedAgent for us to close via the normal + // path. The provisioner's deleteWorkspace mirrors the + // `onClose: 'delete'` semantics for the in-flight case. + const pendingResults = await Promise.allSettled( + pendingEntries.map(([workspaceName]) => + provisioner.deleteWorkspace(workspaceName), + ), + ); + + for (const [index, settled] of pendingResults.entries()) { + const entry = pendingEntries[index]; + if (entry === undefined) { + continue; + } + const [workspaceName, issueNumber] = entry; + if (settled.status === 'rejected') { + // Defensive: the production CoderProvisioner.deleteWorkspace + // always resolves (spawn errors map to outcome:'failed'), so this + // branch is unreachable for the real provisioner. Kept so that + // an injected provisioner that violates the resolve-only contract + // produces a useful log line instead of an unhandled rejection. + errorLogger( + `[issue ${issueNumber}] in-flight workspace ${workspaceName} delete threw during shutdown — workspace may be stranded:`, + settled.reason, + ); + continue; + } + const deleteResult = settled.value; + if (deleteResult.outcome === 'deleted') { + errorLogger( + `[issue ${issueNumber}] in-flight workspace ${workspaceName} deleted during shutdown`, + ); + } else { + errorLogger( + `[issue ${issueNumber}] in-flight workspace ${workspaceName} delete failed during shutdown (status ${deleteResult.status}) — workspace may be stranded: ${deleteResult.stderr.trim()}`, + ); + } + } + } + + return { + run, + requestShutdown, + get status(): BatchRunnerStatus { + return { + active: activeAgents.size, + pending: pendingWorkspaceNames.size, + }; + }, + }; +} diff --git a/.sandcastle/lib/branchName.ts b/.sandcastle/lib/branchName.ts deleted file mode 100644 index b1a685e..0000000 --- a/.sandcastle/lib/branchName.ts +++ /dev/null @@ -1,29 +0,0 @@ -import { invariant } from '../../src/util/assert.js'; -import { assertIssueNumber } from './issueNumber.js'; - -/** Compact UTC run ID: YYYYMMDDTHHMMSSZ. */ -export const RUN_ID_PATTERN = /^\d{8}T\d{6}Z$/u; - -const BRANCH_NAME_PATTERN = /^afk-triage\/\d+-\d{8}T\d{6}Z$/u; - -export function assertRunId(value: unknown): string { - invariant( - typeof value === 'string' && RUN_ID_PATTERN.test(value), - 'run ID must use compact UTC format YYYYMMDDTHHMMSSZ', - ); - - return value; -} - -export function branchNameForIssue(issueNumber: number, runId: string): string { - const checkedIssueNumber = assertIssueNumber(issueNumber); - const checkedRunId = assertRunId(runId); - const branchName = `afk-triage/${checkedIssueNumber}-${checkedRunId}`; - - invariant( - BRANCH_NAME_PATTERN.test(branchName), - 'branch name must match the AFK triage naming convention', - ); - - return branchName; -} diff --git a/.sandcastle/lib/coderProvisioner.ts b/.sandcastle/lib/coderProvisioner.ts new file mode 100644 index 0000000..c2c86d4 --- /dev/null +++ b/.sandcastle/lib/coderProvisioner.ts @@ -0,0 +1,159 @@ +import { coder, type CoderOptions } from '../vendor/sandcastle-coder/coder.js'; +import type { TriageIssue } from './eligibility.js'; +import { runCoderAsync } from './gh.js'; + +// Production knobs for the AFK Triage Coder workspace and Triage Agent. +// These are intentionally hard-coded; if a future deployment needs to vary +// any of them, expose them through CoderProvisionerDeps. +const TRIAGE_AGENT_IDLE_TIMEOUT_SECONDS = 1800; +const CODER_TEMPLATE = 'coder'; +const CODER_PRESET = 'Falkenstein'; +const TRIAGE_AGENT_MODEL = 'claude-opus-4-6'; +const TRIAGE_PROMPT_FILE = '.sandcastle/triage-prompt.md'; + +const SANDBOX_READY_HOOKS = [ + { command: 'gh auth status' }, + // Sandcastle syncs git-tracked files only; install deps before triage. + { command: 'npm ci' }, + { command: 'npm install -g @anthropic-ai/claude-code' }, +] as const; + +/** + * The minimal subset of `@ai-hero/sandcastle` the provisioner pulls in via + * dynamic import. Tests fake this; production callers let it default to a + * real import. + */ +export type SandcastleImports = Pick< + typeof import('@ai-hero/sandcastle'), + 'createSandbox' | 'claudeCode' +>; + +export interface ProvisionContext { + readonly issue: TriageIssue; + readonly runId: string; + readonly workspaceName: string; + readonly branchName: string; +} + +export interface ProvisionedAgent { + /** Run the Triage Agent prompt against the prepared Coder workspace. */ + run(): Promise; + /** Tear down the Coder workspace (best-effort onClose: 'delete'). */ + close(): Promise; +} + +export type WorkspaceDeleteResult = + | { readonly outcome: 'deleted' } + | { + readonly outcome: 'failed'; + readonly status: number; + readonly stderr: string; + }; + +export interface CoderProvisioner { + /** + * Create a Coder workspace and prepare the Triage Agent that will run + * inside it. Throws on workspace-name conflict so callers can map the + * error to 'locked' via isLockError. + */ + provision(ctx: ProvisionContext): Promise; + + /** + * Reap a workspace whose `provision()` never resolved a ProvisionedAgent + * (in-flight workspace creates during shutdown). Always resolves: spawn + * errors and non-zero exits both surface as + * `{ outcome: 'failed', status, stderr }` (the underlying + * `runCoderAsync` captures spawn errors via `child.on('error')` and + * resolves on `child.on('close')`, so the returned promise itself + * never rejects). + */ + deleteWorkspace(workspaceName: string): Promise; +} + +export type CoderFactory = typeof coder; + +export interface CoderProvisionerDeps { + readonly importSandcastle?: () => Promise; + readonly runCoderAsync?: typeof runCoderAsync; + /** + * Override the vendored `coder()` provider factory. Tests use this to + * pin the production knobs (template, preset, onClose, workspaceName) + * by asserting on the call args instead of inspecting the opaque + * `IsolatedSandboxProvider` returned by the real factory. + */ + readonly coderFactory?: CoderFactory; +} + +export function createCoderProvisioner( + deps: CoderProvisionerDeps = {}, +): CoderProvisioner { + const importSandcastle = + deps.importSandcastle ?? + (() => import('@ai-hero/sandcastle') as Promise); + const runCoderAsyncImpl = deps.runCoderAsync ?? runCoderAsync; + const coderFactoryImpl = deps.coderFactory ?? coder; + + return { + async provision(ctx: ProvisionContext): Promise { + const { createSandbox, claudeCode } = await importSandcastle(); + + const coderOptions: CoderOptions = { + template: CODER_TEMPLATE, + preset: CODER_PRESET, + workspaceName: ctx.workspaceName, + onClose: 'delete', + }; + + // Use sandcastle's HEAD default for the base branch so AFK triage sees this checkout. + const sandbox = await createSandbox({ + branch: ctx.branchName, + sandbox: coderFactoryImpl(coderOptions), + hooks: { + sandbox: { + onSandboxReady: [...SANDBOX_READY_HOOKS], + }, + }, + }); + + return { + run: async (): Promise => { + await sandbox.run({ + agent: claudeCode(TRIAGE_AGENT_MODEL), + promptFile: TRIAGE_PROMPT_FILE, + promptArgs: { + ISSUE_NUMBER: String(ctx.issue.number), + }, + idleTimeoutSeconds: TRIAGE_AGENT_IDLE_TIMEOUT_SECONDS, + }); + }, + close: async (): Promise => { + await sandbox.close(); + }, + }; + }, + + async deleteWorkspace( + workspaceName: string, + ): Promise { + // Use the async runCoderAsync (spawn) variant rather than the sync + // runCoder (spawnSync) one: the synchronous variant blocks the event + // loop, which would prevent a second SIGINT from being delivered to + // the force-exit branch of the signal handler while a hung + // `coder delete` is in progress. The async variant yields between + // chunks so that escape hatch keeps working. + const result = await runCoderAsyncImpl([ + 'delete', + workspaceName, + '--yes', + ]); + if (result.status === 0) { + return { outcome: 'deleted' }; + } + return { + outcome: 'failed', + status: result.status, + stderr: result.stderr, + }; + }, + }; +} diff --git a/.sandcastle/lib/commentActivity.ts b/.sandcastle/lib/commentActivity.ts new file mode 100644 index 0000000..70af777 --- /dev/null +++ b/.sandcastle/lib/commentActivity.ts @@ -0,0 +1,97 @@ +import { parseAfkMarker } from './afkIdentity.js'; +import type { TriageComment } from './eligibility.js'; + +// Unset trusts any marker; set this for bot-only idempotency. +const TRUSTED_MARKER_AUTHORS_ENV = 'AFK_TRIAGE_TRUSTED_MARKER_AUTHORS'; + +export interface ActivityFilters { + readonly trustedMarkerAuthors?: readonly string[]; +} + +/** + * Parse the comma-separated `AFK_TRIAGE_TRUSTED_MARKER_AUTHORS` allow-list. + * Returns `undefined` when unset or empty so callers can default to "trust + * any author" (the v1 behavior). The function never reads `process.env` + * itself; callers pass an env object so tests do not need to mutate globals. + * + * A non-empty env value that yields no usable entries (for example + * `AFK_TRIAGE_TRUSTED_MARKER_AUTHORS=","`) returns the empty allow-list + * `[]` rather than `undefined`. This matches the pre-refactor behavior: + * empty allow-list means "trust no author", which is the secure default + * for an opt-in allow-list. + */ +export function loadTrustedMarkerAuthors( + env: NodeJS.ProcessEnv, +): readonly string[] | undefined { + const raw = env[TRUSTED_MARKER_AUTHORS_ENV]?.trim(); + if (raw === undefined || raw === '') { + return undefined; + } + return raw + .split(',') + .map((entry) => entry.trim()) + .filter((entry) => entry.length > 0); +} + +/** + * Use the marker comment timestamp, not the embedded batch-start run ID, + * because a long-running Triage Batch can post a marker minutes after the + * run started. + */ +export function latestAfkMarkerCreatedAt( + issueNumber: number, + comments: readonly TriageComment[], + filters: ActivityFilters, +): number | undefined { + const trusted = filters.trustedMarkerAuthors; + const markerTimestamps = comments + .filter((comment) => { + const marker = parseAfkMarker(comment.body); + if (marker === null || marker.issue !== issueNumber) { + return false; + } + if (trusted === undefined) { + return true; + } + const login = comment.author?.login; + return typeof login === 'string' && trusted.includes(login); + }) + .map((comment) => Date.parse(comment.createdAt)) + .filter((timestamp) => Number.isFinite(timestamp)) + .sort((left, right) => left - right); + + return markerTimestamps.at(-1); +} + +/** + * Latest human-side activity on the issue, returned as a millisecond + * timestamp. "Human" here means any non-bot author whose comment is not + * an AFK marker for *this* issue: the issue reporter, the maintainer, and + * any third-party participant all count. AFK markers for *other* issue + * numbers also count, because they are not the local AFK noise the + * eligibility check is trying to filter out. + */ +export function latestHumanActivityAt( + issueNumber: number, + comments: readonly TriageComment[], +): number | undefined { + const timestamps = comments + .filter((comment) => { + if (isBotComment(comment)) { + return false; + } + const marker = parseAfkMarker(comment.body); + return marker === null || marker.issue !== issueNumber; + }) + .map((comment) => Date.parse(comment.createdAt)) + .filter((timestamp) => Number.isFinite(timestamp)) + .sort((left, right) => left - right); + + return timestamps.at(-1); +} + +// GitHub App noise must not re-eligibilize needs-info issues. +function isBotComment(comment: TriageComment): boolean { + const login = comment.author?.login; + return typeof login === 'string' && login.endsWith('[bot]'); +} diff --git a/.sandcastle/lib/eligibility.ts b/.sandcastle/lib/eligibility.ts index 42454fc..f4a7cef 100644 --- a/.sandcastle/lib/eligibility.ts +++ b/.sandcastle/lib/eligibility.ts @@ -1,20 +1,11 @@ import { invariant } from '../../src/util/assert.js'; -import { parseAfkMarker } from './afkMarker.js'; -import { assertIssueNumber } from './issueNumber.js'; - -// Unset trusts any marker; set this for bot-only idempotency. -const TRUSTED_MARKER_AUTHORS_ENV = 'AFK_TRIAGE_TRUSTED_MARKER_AUTHORS'; - -function trustedMarkerAuthors(): readonly string[] | undefined { - const raw = process.env[TRUSTED_MARKER_AUTHORS_ENV]?.trim(); - if (raw === undefined || raw === '') { - return undefined; - } - return raw - .split(',') - .map((entry) => entry.trim()) - .filter((entry) => entry.length > 0); -} +import { assertIssueNumber } from './afkIdentity.js'; +import { + latestAfkMarkerCreatedAt, + latestHumanActivityAt, + loadTrustedMarkerAuthors, + type ActivityFilters, +} from './commentActivity.js'; export interface TriageComment { readonly body: string; @@ -40,7 +31,10 @@ export type IssueEligibility = readonly reason: string; }; -export function classifyIssueForTriage(issue: TriageIssue): IssueEligibility { +export function classifyIssueForTriage( + issue: TriageIssue, + env: NodeJS.ProcessEnv = process.env, +): IssueEligibility { const issueNumber = assertIssueNumber(issue.number); invariant(Array.isArray(issue.labels), 'issue labels must be an array'); invariant(Array.isArray(issue.comments), 'issue comments must be an array'); @@ -58,15 +52,20 @@ export function classifyIssueForTriage(issue: TriageIssue): IssueEligibility { }; } + const trusted = loadTrustedMarkerAuthors(env); + const filters: ActivityFilters = + trusted === undefined ? {} : { trustedMarkerAuthors: trusted }; + const latestMarkerCreatedAt = latestAfkMarkerCreatedAt( issueNumber, issue.comments, + filters, ); if (latestMarkerCreatedAt === undefined) { return { eligible: true, reason: 'needs-info-with-new-activity' }; } - const latestActivity = latestNonAfkActivity(issueNumber, issue.comments); + const latestActivity = latestHumanActivityAt(issueNumber, issue.comments); if (latestActivity !== undefined && latestActivity > latestMarkerCreatedAt) { return { eligible: true, reason: 'needs-info-with-new-activity' }; } @@ -76,53 +75,3 @@ export function classifyIssueForTriage(issue: TriageIssue): IssueEligibility { reason: 'needs-info has no activity newer than the latest AFK marker', }; } - -// Use the marker comment timestamp, not the embedded batch-start run ID. -function latestAfkMarkerCreatedAt( - issueNumber: number, - comments: readonly TriageComment[], -): number | undefined { - const trusted = trustedMarkerAuthors(); - const markerTimestamps = comments - .filter((comment) => { - const marker = parseAfkMarker(comment.body); - if (marker === null || marker.issue !== issueNumber) { - return false; - } - if (trusted === undefined) { - return true; - } - const login = comment.author?.login; - return typeof login === 'string' && trusted.includes(login); - }) - .map((comment) => Date.parse(comment.createdAt)) - .filter((timestamp) => Number.isFinite(timestamp)) - .sort((left, right) => left - right); - - return markerTimestamps.at(-1); -} - -// GitHub App noise must not re-eligibilize needs-info issues. -function isBotComment(comment: TriageComment): boolean { - const login = comment.author?.login; - return typeof login === 'string' && login.endsWith('[bot]'); -} - -function latestNonAfkActivity( - issueNumber: number, - comments: readonly TriageComment[], -): number | undefined { - const timestamps = comments - .filter((comment) => { - if (isBotComment(comment)) { - return false; - } - const marker = parseAfkMarker(comment.body); - return marker === null || marker.issue !== issueNumber; - }) - .map((comment) => Date.parse(comment.createdAt)) - .filter((timestamp) => Number.isFinite(timestamp)) - .sort((left, right) => left - right); - - return timestamps.at(-1); -} diff --git a/.sandcastle/lib/issueNumber.ts b/.sandcastle/lib/issueNumber.ts deleted file mode 100644 index 8aa217a..0000000 --- a/.sandcastle/lib/issueNumber.ts +++ /dev/null @@ -1,13 +0,0 @@ -import { invariant } from '../../src/util/assert.js'; - -export function assertIssueNumber(value: unknown): number { - invariant( - typeof value === 'number' && - Number.isFinite(value) && - Number.isInteger(value) && - value > 0, - 'issue number must be a positive integer', - ); - - return value; -} diff --git a/.sandcastle/lib/issueSource.ts b/.sandcastle/lib/issueSource.ts new file mode 100644 index 0000000..2e83294 --- /dev/null +++ b/.sandcastle/lib/issueSource.ts @@ -0,0 +1,115 @@ +import { z } from 'zod'; + +import type { TriageComment, TriageIssue } from './eligibility.js'; +import { runGh, runJson, type CommandRunner } from './gh.js'; + +const GH_ISSUE_LIST_LIMIT = 500; +// Avoid cwd-based repo inference when running from CI or a sandcastle worktree. +const GH_REPO_ARGS: readonly string[] = ['--repo', 'coder/agent-tty']; + +const ghLabelSchema = z.looseObject({ + name: z.string(), +}); + +const ghAuthorSchema = z.looseObject({ + login: z.string().optional(), +}); + +export const ghCommentSchema = z.looseObject({ + body: z.string(), + createdAt: z.string(), + // `gh --json` returns `"author": null` for deleted / ghost accounts, so + // accept both null and undefined. Downstream code already uses optional + // chaining (`comment.author?.login`) so the runtime path tolerates both. + author: ghAuthorSchema.nullish(), +}); + +export const ghIssueSchema = z.looseObject({ + number: z.number(), + labels: z.array(ghLabelSchema), + comments: z.array(ghCommentSchema).default([]), + // Same as ghCommentSchema.author: GitHub returns null for deleted users. + author: ghAuthorSchema.nullish(), + createdAt: z.string().optional(), +}); + +const ghIssueListSchema = z.array(ghIssueSchema); + +type GhIssue = z.infer; + +/** + * List candidate issues for a Triage Batch and normalize them into the + * orchestrator-facing TriageIssue shape. Always queries `needs-triage`; + * also queries `needs-info` when `includeNeedsInfo` is true. The runner + * parameter exists so unit tests can inject canned `gh` output without + * spawning real processes; production callers should let it default to + * `runGh`. + */ +export function listCandidateIssues( + includeNeedsInfo: boolean, + runner: CommandRunner = runGh, +): TriageIssue[] { + const issues = [listIssuesByLabel('needs-triage', runner)]; + if (includeNeedsInfo) { + issues.push(listIssuesByLabel('needs-info', runner)); + } + + return issues.flat().map(normalizeGhIssue).filter(uniqueIssueFilter()); +} + +function listIssuesByLabel(label: string, runner: CommandRunner): GhIssue[] { + return runJson( + 'gh', + [ + 'issue', + 'list', + ...GH_REPO_ARGS, + '--label', + label, + '--state', + 'open', + '--limit', + String(GH_ISSUE_LIST_LIMIT), + '--json', + 'number,labels,comments,author,createdAt', + ], + ghIssueListSchema, + runner, + ); +} + +function normalizeGhIssue(issue: GhIssue): TriageIssue { + return { + number: issue.number, + labels: issue.labels.map((label) => label.name), + comments: issue.comments.map(normalizeGhComment), + }; +} + +function normalizeGhComment( + comment: z.infer, +): TriageComment { + const author = + comment.author?.login === undefined + ? undefined + : { login: comment.author.login }; + + return { + body: comment.body, + createdAt: comment.createdAt, + ...(author === undefined ? {} : { author }), + }; +} + +function uniqueIssueFilter(): (issue: TriageIssue) => boolean { + const seen = new Set(); + + return (issue) => { + if (seen.has(issue.number)) { + return false; + } + + seen.add(issue.number); + return true; + }; +} diff --git a/.sandcastle/lib/labelInvariants.ts b/.sandcastle/lib/labelInvariants.ts deleted file mode 100644 index a0b939c..0000000 --- a/.sandcastle/lib/labelInvariants.ts +++ /dev/null @@ -1,120 +0,0 @@ -import { invariant } from '../../src/util/assert.js'; -import type { TriageState } from './afkMarker.js'; -import { isTriageState, TRIAGE_STATES } from './afkMarker.js'; - -export type TriageCategory = 'bug' | 'enhancement'; - -const CATEGORY_LABELS = new Set(['bug', 'enhancement']); -// Reuse afkMarker.ts's TRIAGE_STATES so the validator and the type guard -// cannot drift if a state is added in only one of the two places. -const STATE_LABELS: ReadonlySet = TRIAGE_STATES; - -export type LabelTransitionResult = - | { - readonly ok: true; - readonly addLabels: readonly string[]; - readonly removeLabels: readonly string[]; - } - | { - readonly ok: false; - readonly reason: string; - }; - -export interface LabelTransitionInput { - readonly currentLabels: readonly string[]; - readonly nextCategory: TriageCategory | null; - readonly nextState: TriageState; -} - -export function validateTriageLabelTransition( - input: LabelTransitionInput, -): LabelTransitionResult { - invariant( - Array.isArray(input.currentLabels), - 'current labels must be an array', - ); - invariant( - isTriageState(input.nextState), - 'next state must be a triage state', - ); - - const current = new Set(input.currentLabels); - const currentCategories = input.currentLabels.filter(isCategoryLabel); - const currentStates = input.currentLabels.filter(isStateLabel); - - if (currentCategories.length > 1) { - return { - ok: false, - reason: `conflicting category labels: ${currentCategories.sort().join(', ')}`, - }; - } - - if (currentStates.length > 1) { - return { - ok: false, - reason: `conflicting state labels: ${currentStates.sort().join(', ')}`, - }; - } - - if (input.nextCategory === null && currentCategories.length !== 1) { - return { - ok: false, - reason: - 'exactly one current category label is required when nextCategory is null', - }; - } - - const next = new Set(current); - for (const category of CATEGORY_LABELS) { - next.delete(category); - } - for (const state of STATE_LABELS) { - next.delete(state); - } - - const afterCategory = input.nextCategory ?? currentCategories[0]; - invariant( - afterCategory !== undefined, - 'category must exist after nextCategory null validation', - ); - - next.add(afterCategory); - next.add(input.nextState); - - const afterLabels = Array.from(next); - const afterCategories = afterLabels.filter(isCategoryLabel); - const afterStates = afterLabels.filter(isStateLabel); - - if (afterCategories.length !== 1 || afterStates.length !== 1) { - return { - ok: false, - reason: - 'label transition must leave exactly one category and one state label', - }; - } - - const addLabels = afterLabels - .filter((label) => !current.has(label)) - .sort(compareLabel); - const removeLabels = input.currentLabels - .filter((label) => !next.has(label)) - .sort(compareLabel); - - return { - ok: true, - addLabels, - removeLabels, - }; -} - -function isCategoryLabel(label: string): label is TriageCategory { - return CATEGORY_LABELS.has(label as TriageCategory); -} - -function isStateLabel(label: string): label is TriageState { - return STATE_LABELS.has(label as TriageState); -} - -function compareLabel(left: string, right: string): number { - return left.localeCompare(right); -} diff --git a/.sandcastle/lib/pLimit.ts b/.sandcastle/lib/pLimit.ts new file mode 100644 index 0000000..bb5406c --- /dev/null +++ b/.sandcastle/lib/pLimit.ts @@ -0,0 +1,54 @@ +import { invariant } from '../../src/util/assert.js'; + +export function pLimit(concurrency: number) { + invariant( + Number.isInteger(concurrency) && concurrency > 0, + 'concurrency must be a positive integer', + ); + + let active = 0; + const queue: Array<() => void> = []; + + function runNext(): void { + if (active >= concurrency) { + return; + } + + const next = queue.shift(); + if (next === undefined) { + return; + } + + active += 1; + next(); + } + + return function limit(task: () => Promise): Promise { + invariant(typeof task === 'function', 'limited task must be a function'); + + return new Promise((resolve, reject) => { + const run = (): void => { + // Wrap `task()` in `Promise.resolve().then(...)` so a synchronous + // throw inside a non-async caller still becomes a rejection. Without + // this, a sync throw would skip the `.finally()` decrement, leaking + // a concurrency slot permanently and stalling the batch after + // `concurrency` such failures. + Promise.resolve() + .then(task) + .then(resolve, reject) + .finally(() => { + active -= 1; + runNext(); + }); + }; + + if (active < concurrency) { + active += 1; + run(); + return; + } + + queue.push(run); + }); + }; +} diff --git a/.sandcastle/lib/workspaceName.ts b/.sandcastle/lib/workspaceName.ts deleted file mode 100644 index 13a6c07..0000000 --- a/.sandcastle/lib/workspaceName.ts +++ /dev/null @@ -1,16 +0,0 @@ -import { invariant } from '../../src/util/assert.js'; -import { assertIssueNumber } from './issueNumber.js'; - -const WORKSPACE_NAME_PATTERN = /^agent-tty-triage-\d+$/u; - -export function workspaceNameForIssue(issueNumber: number): string { - const checkedIssueNumber = assertIssueNumber(issueNumber); - const workspaceName = `agent-tty-triage-${checkedIssueNumber}`; - - invariant( - WORKSPACE_NAME_PATTERN.test(workspaceName), - 'workspace name must match the AFK triage naming convention', - ); - - return workspaceName; -} diff --git a/.sandcastle/main.ts b/.sandcastle/main.ts index ed7feb8..c28125c 100644 --- a/.sandcastle/main.ts +++ b/.sandcastle/main.ts @@ -3,33 +3,18 @@ import { pathToFileURL } from 'node:url'; import { Command, CommanderError } from 'commander'; import { z } from 'zod'; -import type { Sandbox } from '@ai-hero/sandcastle'; - -import { invariant } from '../src/util/assert.js'; -import { coder, type CoderOptions } from './vendor/sandcastle-coder/coder.js'; -import { branchNameForIssue, assertRunId } from './lib/branchName.js'; +import { assertRunId, createRunId } from './lib/afkIdentity.js'; import { - classifyIssueForTriage, - type TriageComment, - type TriageIssue, -} from './lib/eligibility.js'; -import { conciseErrorMessage, isLockError } from './lib/errorMessage.js'; -import { runCoder, runCoderAsync, runGh, runJson } from './lib/gh.js'; + createTriageBatchRunner, + type TriageBatchRunner, + type TriageIssueStatus, + type TriageIssueSummary, +} from './lib/batchRunner.js'; +import { classifyIssueForTriage, type TriageIssue } from './lib/eligibility.js'; +import { conciseErrorMessage } from './lib/errorMessage.js'; +import { runCoder, runJson } from './lib/gh.js'; +import { listCandidateIssues } from './lib/issueSource.js'; import { parseParallelism } from './lib/parallelism.js'; -import { workspaceNameForIssue } from './lib/workspaceName.js'; - -const TRIAGE_AGENT_IDLE_TIMEOUT_SECONDS = 1800; -const GH_ISSUE_LIST_LIMIT = 500; -// Avoid cwd-based repo inference when running from CI or a sandcastle worktree. -const GH_REPO_ARGS: readonly string[] = ['--repo', 'coder/agent-tty']; - -export type TriageIssueStatus = 'success' | 'locked' | 'failed' | 'skipped'; - -export interface TriageIssueSummary { - readonly issueNumber: number; - readonly status: TriageIssueStatus; - readonly message?: string; -} export interface TriageBatchSummary { readonly runId: string; @@ -44,89 +29,6 @@ interface RunnerArgs { readonly dryRun: boolean; } -const ghLabelSchema = z.looseObject({ - name: z.string(), -}); - -const ghAuthorSchema = z.looseObject({ - login: z.string().optional(), -}); - -export const ghCommentSchema = z.looseObject({ - body: z.string(), - createdAt: z.string(), - // `gh --json` returns `"author": null` for deleted / ghost accounts, so - // accept both null and undefined. Downstream code already uses optional - // chaining (`comment.author?.login`) so the runtime path tolerates both. - author: ghAuthorSchema.nullish(), -}); - -export const ghIssueSchema = z.looseObject({ - number: z.number(), - labels: z.array(ghLabelSchema), - comments: z.array(ghCommentSchema).default([]), - // Same as ghCommentSchema.author: GitHub returns null for deleted users. - author: ghAuthorSchema.nullish(), - createdAt: z.string().optional(), -}); - -const ghIssueListSchema = z.array(ghIssueSchema); - -type GhIssue = z.infer; - -export function pLimit(concurrency: number) { - invariant( - Number.isInteger(concurrency) && concurrency > 0, - 'concurrency must be a positive integer', - ); - - let active = 0; - const queue: Array<() => void> = []; - - function runNext(): void { - if (active >= concurrency) { - return; - } - - const next = queue.shift(); - if (next === undefined) { - return; - } - - active += 1; - next(); - } - - return function limit(task: () => Promise): Promise { - invariant(typeof task === 'function', 'limited task must be a function'); - - return new Promise((resolve, reject) => { - const run = (): void => { - // Wrap `task()` in `Promise.resolve().then(...)` so a synchronous - // throw inside a non-async caller still becomes a rejection. Without - // this, a sync throw would skip the `.finally()` decrement, leaking - // a concurrency slot permanently and stalling the batch after - // `concurrency` such failures. - Promise.resolve() - .then(task) - .then(resolve, reject) - .finally(() => { - active -= 1; - runNext(); - }); - }; - - if (active < concurrency) { - active += 1; - run(); - return; - } - - queue.push(run); - }); - }; -} - export function buildTriageBatchSummary( runId: string, perIssue: readonly TriageIssueSummary[], @@ -202,16 +104,17 @@ export function parseRunnerArgs( }; } -async function runBatch(args: RunnerArgs): Promise { +async function runBatch( + args: RunnerArgs, + batchRunner: TriageBatchRunner, +): Promise { const runId = createRunId(); if (!args.dryRun) { preflightCoder(); } - const issues = listCandidateIssues(args.includeNeedsInfo) - .map(normalizeGhIssue) - .filter(uniqueIssueFilter()); + const issues = listCandidateIssues(args.includeNeedsInfo); const perIssue: TriageIssueSummary[] = []; const eligibleIssues: TriageIssue[] = []; @@ -239,10 +142,10 @@ async function runBatch(args: RunnerArgs): Promise { eligibleIssues.push(issue); } - const limit = pLimit(args.parallelism); - const completed = await Promise.all( - eligibleIssues.map((issue) => limit(() => runTriageAgent(issue, runId))), - ); + const completed = await batchRunner.run(eligibleIssues, { + runId, + parallelism: args.parallelism, + }); perIssue.push(...completed); return buildTriageBatchSummary(runId, perIssue); @@ -252,287 +155,11 @@ function preflightCoder(): void { runJson('coder', ['whoami', '-o', 'json'], z.unknown(), runCoder); } -function listCandidateIssues(includeNeedsInfo: boolean): GhIssue[] { - const issues = [listIssuesByLabel('needs-triage')]; - if (includeNeedsInfo) { - issues.push(listIssuesByLabel('needs-info')); - } - - return issues.flat(); -} - -function listIssuesByLabel(label: string): GhIssue[] { - return runJson( - 'gh', - [ - 'issue', - 'list', - ...GH_REPO_ARGS, - '--label', - label, - '--state', - 'open', - '--limit', - String(GH_ISSUE_LIST_LIMIT), - '--json', - 'number,labels,comments,author,createdAt', - ], - ghIssueListSchema, - runGh, - ); -} - -function normalizeGhIssue(issue: GhIssue): TriageIssue { - return { - number: issue.number, - labels: issue.labels.map((label) => label.name), - comments: issue.comments.map(normalizeGhComment), - }; -} - -function normalizeGhComment( - comment: z.infer, -): TriageComment { - const author = - comment.author?.login === undefined - ? undefined - : { login: comment.author.login }; - - return { - body: comment.body, - createdAt: comment.createdAt, - ...(author === undefined ? {} : { author }), - }; -} - -function uniqueIssueFilter(): (issue: TriageIssue) => boolean { - const seen = new Set(); - - return (issue) => { - if (seen.has(issue.number)) { - return false; - } - - seen.add(issue.number); - return true; - }; -} - -const activeSandboxes = new Map(); -const pendingWorkspaceNames = new Map(); - -// Set synchronously before cleanup awaits so newly dequeued tasks do not -// create workspaces outside the cleanup snapshot. -let shutdownRequested = false; - -async function runTriageAgent( - issue: TriageIssue, - runId: string, -): Promise { - let workspaceName: string | undefined; - let sandbox: Sandbox | undefined; - let result: TriageIssueSummary | undefined; - - if (shutdownRequested) { - return { - issueNumber: issue.number, - status: 'skipped', - message: 'shutdown requested', - }; - } - - try { - workspaceName = workspaceNameForIssue(issue.number); - const { createSandbox, claudeCode } = await import('@ai-hero/sandcastle'); - const coderOptions: CoderOptions = { - template: 'coder', - preset: 'Falkenstein', - workspaceName, - onClose: 'delete', - }; - - // Re-check the shutdown flag after the `await import(...)` yield. - // Even on a module-cache hit, dynamic import yields once to the - // microtask queue; a SIGINT during that microtask sets - // `shutdownRequested = true` and snapshots empty maps. Without this - // re-check the resumed task would still call createSandbox below - // and orphan the resulting workspace when `process.exit()` fires, - // because there are no further await points between the - // pendingWorkspaceNames.set() call and the createSandbox call. - if (shutdownRequested) { - return { - issueNumber: issue.number, - status: 'skipped', - message: 'shutdown requested', - }; - } - - pendingWorkspaceNames.set(workspaceName, issue.number); - try { - // Use sandcastle's HEAD default for the base branch so AFK triage sees this checkout. - sandbox = await createSandbox({ - branch: branchNameForIssue(issue.number, runId), - sandbox: coder(coderOptions), - hooks: { - sandbox: { - onSandboxReady: [ - { command: 'gh auth status' }, - // Sandcastle syncs git-tracked files only; install deps before triage. - { command: 'npm ci' }, - { command: 'npm install -g @anthropic-ai/claude-code' }, - ], - }, - }, - }); - activeSandboxes.set(sandbox, issue.number); - } finally { - pendingWorkspaceNames.delete(workspaceName); - } - - await sandbox.run({ - agent: claudeCode('claude-opus-4-6'), - promptFile: '.sandcastle/triage-prompt.md', - promptArgs: { - ISSUE_NUMBER: String(issue.number), - }, - idleTimeoutSeconds: TRIAGE_AGENT_IDLE_TIMEOUT_SECONDS, - }); - - result = { - issueNumber: issue.number, - status: 'success', - }; - } catch (error) { - console.error(`[issue ${issue.number}]`, error); - result = { - issueNumber: issue.number, - status: - workspaceName !== undefined && isLockError(error, workspaceName) - ? 'locked' - : 'failed', - message: conciseErrorMessage(error), - }; - } finally { - // Skip close when the sandbox was already cleaned up by the signal - // handler (closeActiveSandboxes). Otherwise we would call `coder - // delete` twice on the same workspace; the second call fails because - // the workspace is gone, the catch fires, and a successful triage - // gets silently overwritten with a misleading `close failed: workspace - // not found` message. Membership in `activeSandboxes` is the - // single-owner protocol that disambiguates "still ours to close" - // from "signal handler took it". - if (sandbox !== undefined && activeSandboxes.has(sandbox)) { - activeSandboxes.delete(sandbox); - try { - await sandbox.close(); - } catch (closeError) { - console.error(`[issue ${issue.number}] close failed`, closeError); - const closeMessage = `close failed: ${conciseErrorMessage(closeError)}`; - result = { - issueNumber: issue.number, - status: 'failed', - message: - result === undefined - ? closeMessage - : `${result.message ?? `original status: ${result.status}`}; ${closeMessage}`, - }; - } - } - } - - return result; -} - -function createRunId(date = new Date()): string { - const runId = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad( - date.getUTCDate(), - )}T${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad( - date.getUTCSeconds(), - )}Z`; - - return assertRunId(runId); -} - -function pad(value: number): string { - return String(value).padStart(2, '0'); -} - function signalExitCode(signal: NodeJS.Signals): number { return signal === 'SIGINT' ? 130 : 143; } -async function closeActiveSandboxes(): Promise { - const sandboxEntries = Array.from(activeSandboxes.entries()); - const pendingEntries = Array.from(pendingWorkspaceNames.entries()); - activeSandboxes.clear(); - pendingWorkspaceNames.clear(); - - const sandboxResults = await Promise.allSettled( - sandboxEntries.map(async ([sandbox]) => { - await sandbox.close(); - }), - ); - - for (const [index, settled] of sandboxResults.entries()) { - const entry = sandboxEntries[index]; - if (entry === undefined) { - continue; - } - const [, issueNumber] = entry; - if (settled.status === 'fulfilled') { - console.error(`[issue ${issueNumber}] sandbox closed during shutdown`); - } else { - console.error( - `[issue ${issueNumber}] sandbox close failed during shutdown — workspace may be stranded:`, - settled.reason, - ); - } - } - - // Reap any workspaces whose `coder create` returned a workspace on the - // control plane but whose `createSandbox` did not yet resolve a Sandbox - // instance for us to close via the normal path. Direct - // `coder delete --yes` is the closest equivalent to the - // `onClose: 'delete'` semantics for the `Sandbox.close()` path. - // - // Use the async runCoderAsync (spawn) variant rather than the sync - // runCoder (spawnSync) one: the synchronous variant blocks the event - // loop, which would prevent a second SIGINT from being delivered to - // the force-exit branch of installSignalHandlers while a hung - // `coder delete` is in progress. The async variant yields between - // chunks so that escape hatch keeps working. - const pendingResults = await Promise.allSettled( - pendingEntries.map(([workspaceName]) => - runCoderAsync(['delete', workspaceName, '--yes']), - ), - ); - - for (const [index, settled] of pendingResults.entries()) { - const entry = pendingEntries[index]; - if (entry === undefined) { - continue; - } - const [workspaceName, issueNumber] = entry; - if (settled.status === 'rejected') { - console.error( - `[issue ${issueNumber}] in-flight workspace ${workspaceName} delete threw during shutdown — workspace may be stranded:`, - settled.reason, - ); - continue; - } - if (settled.value.status === 0) { - console.error( - `[issue ${issueNumber}] in-flight workspace ${workspaceName} deleted during shutdown`, - ); - } else { - console.error( - `[issue ${issueNumber}] in-flight workspace ${workspaceName} delete failed during shutdown (status ${settled.value.status}) — workspace may be stranded: ${settled.value.stderr.trim()}`, - ); - } - } -} - -function installSignalHandlers(): void { +function installSignalHandlers(batchRunner: TriageBatchRunner): void { let signalled = false; const handle = (signal: NodeJS.Signals): void => { // Let a second signal escape hung workspace cleanup. @@ -543,11 +170,15 @@ function installSignalHandlers(): void { process.exit(signalExitCode(signal)); } signalled = true; - shutdownRequested = true; + const status = batchRunner.status; console.error( - `[afk-triage] received ${signal}; closing ${activeSandboxes.size} active sandboxes and ${pendingWorkspaceNames.size} in-flight workspaces (send ${signal} again to force-exit)`, + `[afk-triage] received ${signal}; closing ${status.active} active sandboxes and ${status.pending} in-flight workspaces (send ${signal} again to force-exit)`, ); - closeActiveSandboxes().finally(() => { + // requestShutdown sets the runner's shutdown flag synchronously before + // any await, then resolves once cleanup finishes. The .finally() exit + // ensures the process terminates with the right signal-based code + // regardless of whether cleanup succeeded. + batchRunner.requestShutdown().finally(() => { process.exit(signalExitCode(signal)); }); }; @@ -558,11 +189,12 @@ function installSignalHandlers(): void { async function main(): Promise { process.env['CODER_WORKSPACE_USE_PARAMETER_DEFAULTS'] = 'true'; - installSignalHandlers(); + const batchRunner = createTriageBatchRunner(); + installSignalHandlers(batchRunner); try { const args = parseRunnerArgs(process.argv.slice(2), process.env); - const summary = await runBatch(args); + const summary = await runBatch(args, batchRunner); printSummary(summary); } catch (error) { if (error instanceof CommanderError) { diff --git a/test/unit/sandcastle/afkIdentity.test.ts b/test/unit/sandcastle/afkIdentity.test.ts new file mode 100644 index 0000000..77cec54 --- /dev/null +++ b/test/unit/sandcastle/afkIdentity.test.ts @@ -0,0 +1,159 @@ +import { describe, expect, it } from 'vitest'; + +import { + assertIssueNumber, + assertRunId, + branchNameForIssue, + createRunId, + formatAfkMarker, + parseAfkMarker, + workspaceNameForIssue, +} from '../../../.sandcastle/lib/afkIdentity.js'; + +describe('assertIssueNumber', () => { + it('returns a valid issue number', () => { + expect(assertIssueNumber(42)).toBe(42); + }); + + it('rejects zero and negative numbers', () => { + expect(() => assertIssueNumber(0)).toThrow(/positive integer/u); + expect(() => assertIssueNumber(-1)).toThrow(/positive integer/u); + }); + + it('rejects fractional and non-finite numbers', () => { + expect(() => assertIssueNumber(1.5)).toThrow(/positive integer/u); + expect(() => assertIssueNumber(Number.NaN)).toThrow(/positive integer/u); + expect(() => assertIssueNumber(Number.POSITIVE_INFINITY)).toThrow( + /positive integer/u, + ); + }); +}); + +describe('assertRunId', () => { + it('accepts a compact UTC run ID', () => { + expect(assertRunId('20260430T141500Z')).toBe('20260430T141500Z'); + }); + + it('rejects extended ISO-8601 run IDs', () => { + expect(() => assertRunId('2026-04-30T14:15:00Z')).toThrow(/compact UTC/u); + }); + + it('rejects non-string values', () => { + expect(() => assertRunId(20260430)).toThrow(/compact UTC/u); + }); +}); + +describe('createRunId', () => { + it('formats a fixed Date as a compact UTC run ID', () => { + expect(createRunId(new Date(Date.UTC(2026, 3, 30, 14, 15, 0)))).toBe( + '20260430T141500Z', + ); + }); + + it('zero-pads single-digit fields', () => { + expect(createRunId(new Date(Date.UTC(2026, 0, 1, 1, 2, 3)))).toBe( + '20260101T010203Z', + ); + }); + + it('returned value round-trips through assertRunId', () => { + const runId = createRunId(new Date(Date.UTC(2026, 11, 31, 23, 59, 59))); + expect(assertRunId(runId)).toBe(runId); + }); +}); + +describe('branchNameForIssue', () => { + it('builds the triage branch name for an issue and run ID', () => { + expect(branchNameForIssue(123, '20260430T141500Z')).toBe( + 'afk-triage/123-20260430T141500Z', + ); + }); + + it('rejects an invalid run ID', () => { + expect(() => branchNameForIssue(123, '2026-04-30T14:15:00Z')).toThrow( + /compact UTC/u, + ); + }); + + it('rejects an invalid issue number', () => { + expect(() => branchNameForIssue(0, '20260430T141500Z')).toThrow( + /positive integer/u, + ); + }); +}); + +describe('workspaceNameForIssue', () => { + it('builds the Coder workspace name for an issue', () => { + expect(workspaceNameForIssue(123)).toBe('agent-tty-triage-123'); + }); + + it('rejects an invalid issue number', () => { + expect(() => workspaceNameForIssue(0)).toThrow(/positive integer/u); + }); +}); + +describe('AFK marker helpers', () => { + it('formats and parses a valid marker', () => { + const marker = formatAfkMarker({ + issue: 123, + outcome: 'ready-for-agent', + run: '20260430T141500Z', + }); + + expect(marker).toBe( + '', + ); + expect(parseAfkMarker(marker)).toEqual({ + issue: 123, + outcome: 'ready-for-agent', + run: '20260430T141500Z', + }); + }); + + it('parses a marker embedded after the disclaimer text', () => { + expect( + parseAfkMarker(`> *This was generated by AI during triage.* + + +Needs reporter input.`), + ).toEqual({ + issue: 456, + outcome: 'needs-info', + run: '20260430T141500Z', + }); + }); + + it('returns null for an invalid outcome', () => { + expect( + parseAfkMarker( + '', + ), + ).toBeNull(); + }); + + it('returns null when the issue capture is zero', () => { + expect( + parseAfkMarker( + '', + ), + ).toBeNull(); + }); + + it('returns null when the run ID is not compact UTC format', () => { + expect( + parseAfkMarker( + '', + ), + ).toBeNull(); + }); + + it('rejects an invalid run ID when formatting', () => { + expect(() => + formatAfkMarker({ + issue: 123, + outcome: 'ready-for-agent', + run: '2026-04-30T14:15:00Z', + }), + ).toThrow(/compact UTC/u); + }); +}); diff --git a/test/unit/sandcastle/afkMarker.test.ts b/test/unit/sandcastle/afkMarker.test.ts deleted file mode 100644 index 386de16..0000000 --- a/test/unit/sandcastle/afkMarker.test.ts +++ /dev/null @@ -1,56 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { - formatAfkMarker, - parseAfkMarker, -} from '../../../.sandcastle/lib/afkMarker.js'; - -describe('AFK marker helpers', () => { - it('formats and parses a valid marker', () => { - const marker = formatAfkMarker({ - issue: 123, - outcome: 'ready-for-agent', - run: '20260430T141500Z', - }); - - expect(marker).toBe( - '', - ); - expect(parseAfkMarker(marker)).toEqual({ - issue: 123, - outcome: 'ready-for-agent', - run: '20260430T141500Z', - }); - }); - - it('parses a marker embedded after the disclaimer text', () => { - expect( - parseAfkMarker(`> *This was generated by AI during triage.* - - -Needs reporter input.`), - ).toEqual({ - issue: 456, - outcome: 'needs-info', - run: '20260430T141500Z', - }); - }); - - it('returns null for an invalid outcome', () => { - expect( - parseAfkMarker( - '', - ), - ).toBeNull(); - }); - - it('rejects an invalid run ID when formatting', () => { - expect(() => - formatAfkMarker({ - issue: 123, - outcome: 'ready-for-agent', - run: '2026-04-30T14:15:00Z', - }), - ).toThrow(/compact UTC/u); - }); -}); diff --git a/test/unit/sandcastle/batchRunner.test.ts b/test/unit/sandcastle/batchRunner.test.ts new file mode 100644 index 0000000..9aaf7f4 --- /dev/null +++ b/test/unit/sandcastle/batchRunner.test.ts @@ -0,0 +1,566 @@ +import { describe, expect, it, vi } from 'vitest'; + +import { createTriageBatchRunner } from '../../../.sandcastle/lib/batchRunner.js'; +import type { + CoderProvisioner, + ProvisionContext, + ProvisionedAgent, + WorkspaceDeleteResult, +} from '../../../.sandcastle/lib/coderProvisioner.js'; +import type { TriageIssue } from '../../../.sandcastle/lib/eligibility.js'; + +const RUN_ID = '20260430T141500Z'; + +interface FakeAgent { + readonly id: number; + readonly run: ReturnType; + readonly close: ReturnType; +} + +interface FakeProvisioner { + readonly provisioner: CoderProvisioner; + readonly provision: ReturnType; + readonly deleteWorkspace: ReturnType; + readonly agents: Map; + rejectProvision: (error: unknown) => void; +} + +interface FakeProvisionerConfig { + /** When true, provision() returns a controllable promise instead of resolving immediately. */ + readonly hangProvision?: boolean; + /** Throw before any agent is constructed. */ + readonly provisionError?: Error; + /** Override the FakeAgent produced for an issue (e.g. with a throwing run/close). */ + readonly agentFactory?: (issueNumber: number) => FakeAgent; +} + +function defaultAgent(issueNumber: number): FakeAgent { + return { + id: issueNumber, + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.resolve()), + }; +} + +function asProvisionedAgent(agent: FakeAgent): ProvisionedAgent { + return agent as unknown as ProvisionedAgent; +} + +function makeFakeProvisioner( + config: FakeProvisionerConfig = {}, +): FakeProvisioner { + // Stores one rejector per hung provision() call so multi-issue tests + // using `hangProvision: true` can settle every issue. A single mutable + // slot would silently overwrite earlier rejectors and Promise.all + // would hang on the unsettled ones. + const pendingRejects: ((error: unknown) => void)[] = []; + + const agentFactory = config.agentFactory ?? defaultAgent; + const agents = new Map(); + + const provision = vi.fn( + (ctx: ProvisionContext): Promise => { + if (config.provisionError !== undefined) { + return Promise.reject(config.provisionError); + } + const issueNumber = ctx.issue.number; + if (config.hangProvision === true) { + return new Promise((_resolve, reject) => { + pendingRejects.push(reject); + }); + } + const agent = agentFactory(issueNumber); + agents.set(issueNumber, agent); + return Promise.resolve(asProvisionedAgent(agent)); + }, + ); + + const deleteWorkspace = vi.fn( + (_workspaceName: string): Promise => + Promise.resolve({ outcome: 'deleted' }), + ); + + const provisioner: CoderProvisioner = { + provision: (ctx) => provision(ctx), + deleteWorkspace: (name) => deleteWorkspace(name), + }; + + return { + provisioner, + provision, + deleteWorkspace, + agents, + rejectProvision: (error) => { + if (pendingRejects.length === 0) { + throw new Error('provision is not pending'); + } + // Drain all so multi-issue hung tests can settle every issue. + for (const reject of pendingRejects.splice(0)) { + reject(error); + } + }, + }; +} + +function makeIssue(number: number): TriageIssue { + return { number, labels: [], comments: [] }; +} + +/** + * Spin until `predicate()` is truthy or `timeoutMs` elapses, throwing a + * descriptive error on timeout. Polls the runner's status counters so + * we can synchronize on observable lifecycle state without hanging the + * whole test on vitest's global timeout. + */ +async function waitFor( + predicate: () => boolean, + description: string, + timeoutMs = 1000, +): Promise { + const deadline = Date.now() + timeoutMs; + while (!predicate()) { + if (Date.now() > deadline) { + throw new Error(`waitFor timed out after ${timeoutMs}ms: ${description}`); + } + await new Promise((resolve) => setImmediate(resolve)); + } +} + +describe('createTriageBatchRunner', () => { + it('resolves to an empty list and does not call the provisioner when no issues are eligible', async () => { + const fakes = makeFakeProvisioner(); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const result = await runner.run([], { runId: RUN_ID, parallelism: 2 }); + + expect(result).toEqual([]); + expect(fakes.provision).not.toHaveBeenCalled(); + expect(fakes.deleteWorkspace).not.toHaveBeenCalled(); + }); + + it('returns success and closes the agent once when an issue completes cleanly', async () => { + const fakes = makeFakeProvisioner(); + const errorLogger = vi.fn(); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger, + }); + + const result = await runner.run([makeIssue(42)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([{ issueNumber: 42, status: 'success' }]); + expect(fakes.provision).toHaveBeenCalledTimes(1); + const ctx = fakes.provision.mock.calls[0]?.[0] as ProvisionContext; + expect(ctx).toMatchObject({ + runId: RUN_ID, + workspaceName: 'agent-tty-triage-42', + branchName: 'afk-triage/42-20260430T141500Z', + }); + const agent = fakes.agents.get(42); + expect(agent?.run).toHaveBeenCalledTimes(1); + expect(agent?.close).toHaveBeenCalledTimes(1); + expect(errorLogger).not.toHaveBeenCalled(); + expect(runner.status).toEqual({ active: 0, pending: 0 }); + }); + + it('classifies workspace-name conflicts as locked', async () => { + const fakes = makeFakeProvisioner({ + provisionError: new Error('workspace agent-tty-triage-7 already exists'), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const result = await runner.run([makeIssue(7)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([ + { + issueNumber: 7, + status: 'locked', + message: 'workspace agent-tty-triage-7 already exists', + }, + ]); + }); + + it('returns failed when agent.run rejects, with concise error message', async () => { + const fakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => + Promise.reject(new Error('agent crashed\nstack trace ignored')), + ), + close: vi.fn(() => Promise.resolve()), + }), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const result = await runner.run([makeIssue(11)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([ + { issueNumber: 11, status: 'failed', message: 'agent crashed' }, + ]); + expect(fakes.agents.get(11)?.close).toHaveBeenCalledTimes(1); + }); + + it('overwrites a successful result when agent.close throws after a successful run', async () => { + const fakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.reject(new Error('coder delete refused'))), + }), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const result = await runner.run([makeIssue(3)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([ + { + issueNumber: 3, + status: 'failed', + message: 'original status: success; close failed: coder delete refused', + }, + ]); + }); + + it('skips queued issues when shutdown is requested before runOne dispatches', async () => { + // Exercises the entry-time `if (shutdownRequested)` check in runOne. + // pLimit schedules runOne via Promise.resolve().then(...). When + // requestShutdown() runs synchronously before that microtask fires, + // shutdownRequested is true by the time runOne starts and the entry + // check returns 'skipped' immediately, never calling provision(). + const fakes = makeFakeProvisioner(); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const runPromise = runner.run([makeIssue(4)], { + runId: RUN_ID, + parallelism: 1, + }); + await runner.requestShutdown(); + const result = await runPromise; + + expect(result).toEqual([ + { issueNumber: 4, status: 'skipped', message: 'shutdown requested' }, + ]); + expect(fakes.provision).not.toHaveBeenCalled(); + }); + + it('skips without double-closing when shutdown fires while provision() is in flight', async () => { + // Exercises the post-provision shutdownRequested re-check in runOne. + // The fake provision() fires requestShutdown() synchronously and then + // resolves with a real agent. Without the post-provision re-check, + // the agent would be registered in activeAgents (already cleared by + // requestShutdown), `agent.run()` would race against the in-flight + // workspace delete, and the result would be 'failed' instead of + // 'skipped'. The post-provision branch must NOT call agent.close(): the + // workspace is already in requestShutdown's deleteWorkspace queue, and + // a parallel close would race with that dispatch and produce a + // spurious "workspace may be stranded" log line. + const runnerRef: { + current: ReturnType | undefined; + } = { current: undefined }; + const baseFakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => + Promise.reject( + new Error('agent.run must not fire after post-provision shutdown'), + ), + ), + close: vi.fn(() => Promise.resolve()), + }), + }); + const provisioner: CoderProvisioner = { + provision: async (ctx) => { + if (runnerRef.current !== undefined) { + // Synchronously flip shutdownRequested = true. We do NOT await + // requestShutdown() here so that provision() resolves to its + // agent first; the runOne flow then re-checks the flag. + void runnerRef.current.requestShutdown(); + } + return baseFakes.provisioner.provision(ctx); + }, + deleteWorkspace: (name) => baseFakes.provisioner.deleteWorkspace(name), + }; + const runner = createTriageBatchRunner({ + coderProvisioner: provisioner, + errorLogger: vi.fn(), + }); + runnerRef.current = runner; + + const result = await runner.run([makeIssue(4)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([ + { issueNumber: 4, status: 'skipped', message: 'shutdown requested' }, + ]); + // The post-provision skip path must not call run() OR close(); the + // workspace is reaped by requestShutdown's deleteWorkspace queue. + expect(baseFakes.agents.get(4)?.run).not.toHaveBeenCalled(); + expect(baseFakes.agents.get(4)?.close).not.toHaveBeenCalled(); + expect(baseFakes.deleteWorkspace).toHaveBeenCalledWith( + 'agent-tty-triage-4', + ); + }); + + it('composes run-fail and close-fail messages in the close-error path', async () => { + // Exercises the run-fail + close-fail double-failure path: agent.run() + // rejects (result becomes failed) and the subsequent agent.close() in + // runOne's finally also throws, producing the composed message + // `; close failed: `. + const fakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => Promise.reject(new Error('agent crashed'))), + close: vi.fn(() => Promise.reject(new Error('coder delete refused'))), + }), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const result = await runner.run([makeIssue(13)], { + runId: RUN_ID, + parallelism: 1, + }); + + expect(result).toEqual([ + { + issueNumber: 13, + status: 'failed', + message: 'agent crashed; close failed: coder delete refused', + }, + ]); + }); + + it('closes active agents via requestShutdown while agent.run() is mid-flight', async () => { + // Exercises the single-owner close protocol: an agent registered in + // activeAgents is closed by requestShutdown(), and runOne's finally + // sees activeAgents.has(agent) === false and skips the second close. + const runHang: { resolve: () => void; promise: Promise } = { + resolve: () => undefined, + promise: Promise.resolve(), + }; + runHang.promise = new Promise((resolve) => { + runHang.resolve = resolve; + }); + const fakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => runHang.promise), + close: vi.fn(() => Promise.resolve()), + }), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const runPromise = runner.run([makeIssue(8)], { + runId: RUN_ID, + parallelism: 1, + }); + + // Wait until the agent is registered and agent.run() is in flight. + await waitFor( + () => runner.status.active > 0, + 'runner.status.active to advance after provision()', + ); + expect(runner.status.active).toBe(1); + expect(runner.status.pending).toBe(0); + + await runner.requestShutdown(); + + // Single-owner protocol: requestShutdown closed the agent exactly once. + expect(fakes.agents.get(8)?.close).toHaveBeenCalledTimes(1); + + // Let agent.run() resolve; runOne's finally must not double-close. + runHang.resolve(); + const summaries = await runPromise; + expect(summaries[0]?.status).toBe('success'); + expect(fakes.agents.get(8)?.close).toHaveBeenCalledTimes(1); + }); + + it('reaps in-flight workspaces via provisioner.deleteWorkspace during shutdown', async () => { + const fakes = makeFakeProvisioner({ hangProvision: true }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + const runPromise = runner.run([makeIssue(5)], { + runId: RUN_ID, + parallelism: 1, + }); + + // Wait until the runner has registered the in-flight workspace. + await waitFor( + () => runner.status.pending > 0, + 'runner.status.pending to advance after pendingWorkspaceNames.set', + ); + expect(runner.status.pending).toBe(1); + + await runner.requestShutdown(); + + expect(fakes.deleteWorkspace).toHaveBeenCalledWith('agent-tty-triage-5'); + + fakes.rejectProvision(new Error('cancelled')); + const summaries = await runPromise; + expect(summaries[0]?.status).toBe('failed'); + }); + + it('logs "workspace may be stranded" when active-agent close throws during shutdown', async () => { + // Exercises the active-agent close-fail branch in requestShutdown: + // an agent registered in activeAgents whose close() throws should + // surface the per-issue "sandbox close failed during shutdown — + // workspace may be stranded" log line so operators know to clean up + // the workspace manually. A regression that silences this log or + // drops the error reason would defeat that contract. + const errorLogger = vi.fn(); + const runHang: { resolve: () => void; promise: Promise } = { + resolve: () => undefined, + promise: Promise.resolve(), + }; + runHang.promise = new Promise((resolve) => { + runHang.resolve = resolve; + }); + const closeError = new Error('coder delete refused during shutdown'); + const fakes = makeFakeProvisioner({ + agentFactory: (issueNumber) => ({ + id: issueNumber, + run: vi.fn(() => runHang.promise), + close: vi.fn(() => Promise.reject(closeError)), + }), + }); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger, + }); + + const runPromise = runner.run([makeIssue(91)], { + runId: RUN_ID, + parallelism: 1, + }); + await waitFor( + () => runner.status.active > 0, + 'agent registered in activeAgents', + ); + + await runner.requestShutdown(); + + expect(errorLogger).toHaveBeenCalledWith( + '[issue 91] sandbox close failed during shutdown — workspace may be stranded:', + closeError, + ); + + runHang.resolve(); + await runPromise; + }); + + it('logs "workspace may be stranded" when in-flight deleteWorkspace exits non-zero during shutdown', async () => { + // Exercises the pending-delete-fail branch in requestShutdown: when + // deleteWorkspace returns { outcome: 'failed', status, stderr } for + // an in-flight workspace, the runner emits a per-issue "in-flight + // workspace ... delete failed during shutdown (status N) — workspace + // may be stranded: " log line. Operators rely on this for + // manual cleanup of stranded Coder workspaces. + const errorLogger = vi.fn(); + const baseFakes = makeFakeProvisioner({ hangProvision: true }); + const failedDelete = vi.fn( + (_name: string): Promise => + Promise.resolve({ + outcome: 'failed', + status: 7, + stderr: 'workspace not found \n', + }), + ); + const provisioner: CoderProvisioner = { + provision: (ctx) => baseFakes.provisioner.provision(ctx), + deleteWorkspace: (name) => failedDelete(name), + }; + const runner = createTriageBatchRunner({ + coderProvisioner: provisioner, + errorLogger, + }); + + const runPromise = runner.run([makeIssue(92)], { + runId: RUN_ID, + parallelism: 1, + }); + await waitFor( + () => runner.status.pending > 0, + 'workspace registered in pendingWorkspaceNames', + ); + + await runner.requestShutdown(); + + expect(errorLogger).toHaveBeenCalledWith( + '[issue 92] in-flight workspace agent-tty-triage-92 delete failed during shutdown (status 7) — workspace may be stranded: workspace not found', + ); + + baseFakes.rejectProvision(new Error('cancelled')); + await runPromise; + }); + + it('exposes status counts that drop back to zero after run completes', async () => { + const fakes = makeFakeProvisioner(); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + expect(runner.status).toEqual({ active: 0, pending: 0 }); + + await runner.run([makeIssue(1), makeIssue(2)], { + runId: RUN_ID, + parallelism: 2, + }); + + expect(runner.status).toEqual({ active: 0, pending: 0 }); + }); + + it('requestShutdown is idempotent and does not call deleteWorkspace when nothing is pending', async () => { + const fakes = makeFakeProvisioner(); + const runner = createTriageBatchRunner({ + coderProvisioner: fakes.provisioner, + errorLogger: vi.fn(), + }); + + await runner.run([makeIssue(9)], { runId: RUN_ID, parallelism: 1 }); + + await runner.requestShutdown(); + await runner.requestShutdown(); + + // No active agents or pending workspaces remained after the run, so + // the cleanup snapshots were both empty. + expect(fakes.deleteWorkspace).not.toHaveBeenCalled(); + }); +}); diff --git a/test/unit/sandcastle/branchName.test.ts b/test/unit/sandcastle/branchName.test.ts deleted file mode 100644 index c8df6ba..0000000 --- a/test/unit/sandcastle/branchName.test.ts +++ /dev/null @@ -1,23 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { branchNameForIssue } from '../../../.sandcastle/lib/branchName.js'; - -describe('branchNameForIssue', () => { - it('builds the triage branch name for an issue and run ID', () => { - expect(branchNameForIssue(123, '20260430T141500Z')).toBe( - 'afk-triage/123-20260430T141500Z', - ); - }); - - it('rejects an invalid run ID', () => { - expect(() => branchNameForIssue(123, '2026-04-30T14:15:00Z')).toThrow( - /compact UTC/u, - ); - }); - - it('rejects an invalid issue number', () => { - expect(() => branchNameForIssue(0, '20260430T141500Z')).toThrow( - /positive integer/u, - ); - }); -}); diff --git a/test/unit/sandcastle/coderProvisioner.test.ts b/test/unit/sandcastle/coderProvisioner.test.ts new file mode 100644 index 0000000..d0bd3e3 --- /dev/null +++ b/test/unit/sandcastle/coderProvisioner.test.ts @@ -0,0 +1,225 @@ +import { describe, expect, it, vi } from 'vitest'; + +import { + createCoderProvisioner, + type CoderFactory, + type ProvisionContext, + type SandcastleImports, +} from '../../../.sandcastle/lib/coderProvisioner.js'; +import type { CommandResult } from '../../../.sandcastle/lib/gh.js'; + +const RUN_ID = '20260430T141500Z'; + +interface FakeSandbox { + readonly run: ReturnType; + readonly close: ReturnType; +} + +function makeFakeSandcastle(sandbox: FakeSandbox): { + readonly imports: SandcastleImports; + readonly createSandbox: ReturnType; + readonly claudeCode: ReturnType; + readonly importFn: () => Promise; +} { + const createSandbox = vi.fn(() => Promise.resolve(sandbox)); + const claudeCode = vi.fn((model: string) => ({ kind: 'fake-agent', model })); + const imports = { + createSandbox, + claudeCode, + } as unknown as SandcastleImports; + return { + imports, + createSandbox, + claudeCode, + importFn: () => Promise.resolve(imports), + }; +} + +function makeProvisionContext(issueNumber: number): ProvisionContext { + return { + issue: { number: issueNumber, labels: [], comments: [] }, + runId: RUN_ID, + workspaceName: `agent-tty-triage-${issueNumber}`, + branchName: `afk-triage/${issueNumber}-${RUN_ID}`, + }; +} + +// The vendored `coder()` factory hides template, preset, onClose, and +// workspaceName inside its closure — they never appear on the returned +// provider. To pin those production knobs, tests inject a fake factory +// and assert on the args passed to it. +const FAKE_PROVIDER = { + tag: 'isolated' as const, + name: 'coder', + env: {} as Record, + create: () => Promise.resolve({} as never), +}; + +function fakeCoderFactory(): { + readonly factory: CoderFactory; + readonly mock: ReturnType; +} { + const mock = vi.fn(() => FAKE_PROVIDER); + return { + factory: mock as unknown as CoderFactory, + mock, + }; +} + +const noopRunCoderAsync = vi.fn(() => + Promise.resolve({ stdout: '', stderr: '', status: 0 }), +); + +describe('createCoderProvisioner.provision', () => { + it('passes production template, preset, onClose, and workspaceName to the coder() factory', async () => { + const sandbox: FakeSandbox = { + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.resolve()), + }; + const fakes = makeFakeSandcastle(sandbox); + const coderFactory = fakeCoderFactory(); + const provisioner = createCoderProvisioner({ + importSandcastle: fakes.importFn, + runCoderAsync: noopRunCoderAsync, + coderFactory: coderFactory.factory, + }); + + await provisioner.provision(makeProvisionContext(42)); + + expect(coderFactory.mock).toHaveBeenCalledTimes(1); + expect(coderFactory.mock).toHaveBeenCalledWith({ + template: 'coder', + preset: 'Falkenstein', + workspaceName: 'agent-tty-triage-42', + onClose: 'delete', + }); + }); + + it('forwards the provider produced by coder() and the AFK ready-hooks to createSandbox', async () => { + const sandbox: FakeSandbox = { + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.resolve()), + }; + const fakes = makeFakeSandcastle(sandbox); + const coderFactory = fakeCoderFactory(); + const provisioner = createCoderProvisioner({ + importSandcastle: fakes.importFn, + runCoderAsync: noopRunCoderAsync, + coderFactory: coderFactory.factory, + }); + + await provisioner.provision(makeProvisionContext(42)); + + expect(fakes.createSandbox).toHaveBeenCalledTimes(1); + const call = fakes.createSandbox.mock.calls[0]?.[0] as { + readonly branch: string; + readonly sandbox: typeof FAKE_PROVIDER; + readonly hooks: { + readonly sandbox: { readonly onSandboxReady: readonly unknown[] }; + }; + }; + expect(call.branch).toBe('afk-triage/42-20260430T141500Z'); + expect(call.sandbox).toBe(FAKE_PROVIDER); + expect(call.hooks.sandbox.onSandboxReady).toEqual([ + { command: 'gh auth status' }, + { command: 'npm ci' }, + { command: 'npm install -g @anthropic-ai/claude-code' }, + ]); + }); + + it('runs the Triage Agent with the pinned model, prompt file, and idle timeout', async () => { + const sandbox: FakeSandbox = { + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.resolve()), + }; + const fakes = makeFakeSandcastle(sandbox); + const coderFactory = fakeCoderFactory(); + const provisioner = createCoderProvisioner({ + importSandcastle: fakes.importFn, + runCoderAsync: noopRunCoderAsync, + coderFactory: coderFactory.factory, + }); + + const agent = await provisioner.provision(makeProvisionContext(123)); + await agent.run(); + + expect(fakes.claudeCode).toHaveBeenCalledWith('claude-opus-4-6'); + expect(sandbox.run).toHaveBeenCalledTimes(1); + const runCall = sandbox.run.mock.calls[0]?.[0] as { + readonly promptFile: string; + readonly promptArgs: Record; + readonly idleTimeoutSeconds: number; + }; + expect(runCall.promptFile).toBe('.sandcastle/triage-prompt.md'); + expect(runCall.promptArgs).toEqual({ ISSUE_NUMBER: '123' }); + expect(runCall.idleTimeoutSeconds).toBe(1800); + }); + + it('close delegates to sandbox.close', async () => { + const sandbox: FakeSandbox = { + run: vi.fn(() => Promise.resolve()), + close: vi.fn(() => Promise.resolve()), + }; + const fakes = makeFakeSandcastle(sandbox); + const coderFactory = fakeCoderFactory(); + const provisioner = createCoderProvisioner({ + importSandcastle: fakes.importFn, + runCoderAsync: noopRunCoderAsync, + coderFactory: coderFactory.factory, + }); + + const agent = await provisioner.provision(makeProvisionContext(7)); + await agent.close(); + + expect(sandbox.close).toHaveBeenCalledTimes(1); + }); +}); + +describe('createCoderProvisioner.deleteWorkspace', () => { + it('runs `coder delete --yes` and returns deleted on status 0', async () => { + const runCoderAsync = vi.fn(() => + Promise.resolve({ + stdout: '', + stderr: '', + status: 0, + }), + ); + const provisioner = createCoderProvisioner({ + importSandcastle: () => + Promise.resolve({} as unknown as SandcastleImports), + runCoderAsync, + }); + + const result = await provisioner.deleteWorkspace('agent-tty-triage-9'); + + expect(runCoderAsync).toHaveBeenCalledWith([ + 'delete', + 'agent-tty-triage-9', + '--yes', + ]); + expect(result).toEqual({ outcome: 'deleted' }); + }); + + it('returns failed with status and stderr when coder delete exits non-zero', async () => { + const runCoderAsync = vi.fn(() => + Promise.resolve({ + stdout: '', + stderr: 'workspace not found\n', + status: 1, + }), + ); + const provisioner = createCoderProvisioner({ + importSandcastle: () => + Promise.resolve({} as unknown as SandcastleImports), + runCoderAsync, + }); + + const result = await provisioner.deleteWorkspace('agent-tty-triage-9'); + + expect(result).toEqual({ + outcome: 'failed', + status: 1, + stderr: 'workspace not found\n', + }); + }); +}); diff --git a/test/unit/sandcastle/commentActivity.test.ts b/test/unit/sandcastle/commentActivity.test.ts new file mode 100644 index 0000000..b96dc48 --- /dev/null +++ b/test/unit/sandcastle/commentActivity.test.ts @@ -0,0 +1,180 @@ +import { describe, expect, it } from 'vitest'; + +import { + latestAfkMarkerCreatedAt, + latestHumanActivityAt, + loadTrustedMarkerAuthors, +} from '../../../.sandcastle/lib/commentActivity.js'; +import type { TriageComment } from '../../../.sandcastle/lib/eligibility.js'; + +const ISSUE = 123; +const OTHER_ISSUE = 456; + +const markerForIssue = (issueNumber: number, runId: string): string => + ``; + +describe('loadTrustedMarkerAuthors', () => { + it('returns undefined when the env var is unset', () => { + expect(loadTrustedMarkerAuthors({})).toBeUndefined(); + }); + + it('returns undefined when the env var is empty', () => { + expect( + loadTrustedMarkerAuthors({ AFK_TRIAGE_TRUSTED_MARKER_AUTHORS: '' }), + ).toBeUndefined(); + }); + + it('returns undefined when the env var is whitespace-only', () => { + expect( + loadTrustedMarkerAuthors({ + AFK_TRIAGE_TRUSTED_MARKER_AUTHORS: ' ', + }), + ).toBeUndefined(); + }); + + it('returns a single-entry list', () => { + expect( + loadTrustedMarkerAuthors({ + AFK_TRIAGE_TRUSTED_MARKER_AUTHORS: 'triage-bot', + }), + ).toEqual(['triage-bot']); + }); + + it('parses a comma list with whitespace and drops empties', () => { + expect( + loadTrustedMarkerAuthors({ + AFK_TRIAGE_TRUSTED_MARKER_AUTHORS: ' triage-bot , , ops-bot ', + }), + ).toEqual(['triage-bot', 'ops-bot']); + }); + + it('returns the empty allow-list when only commas remain after trimming', () => { + // A non-empty env value that yields no usable entries is "trust no + // author" — the secure default for an opt-in allow-list. Returning + // `undefined` here would silently mean "trust everyone" and invert + // the security posture for a malformed env value. + expect( + loadTrustedMarkerAuthors({ AFK_TRIAGE_TRUSTED_MARKER_AUTHORS: ' , , ' }), + ).toEqual([]); + }); +}); + +describe('latestAfkMarkerCreatedAt', () => { + it('returns undefined when there are no comments', () => { + expect(latestAfkMarkerCreatedAt(ISSUE, [], {})).toBeUndefined(); + }); + + it('returns the latest of multiple AFK markers for the same issue', () => { + const earlier: TriageComment = { + body: markerForIssue(ISSUE, '20260430T141500Z'), + createdAt: '2026-04-30T14:15:00Z', + }; + const later: TriageComment = { + body: markerForIssue(ISSUE, '20260430T150000Z'), + createdAt: '2026-04-30T15:00:00Z', + }; + + expect(latestAfkMarkerCreatedAt(ISSUE, [earlier, later], {})).toBe( + Date.parse('2026-04-30T15:00:00Z'), + ); + }); + + it('ignores AFK markers whose embedded issue number does not match', () => { + const otherIssueMarker: TriageComment = { + body: markerForIssue(OTHER_ISSUE, '20260430T150000Z'), + createdAt: '2026-04-30T15:00:00Z', + }; + + expect( + latestAfkMarkerCreatedAt(ISSUE, [otherIssueMarker], {}), + ).toBeUndefined(); + }); + + it('respects the trusted-author allow-list', () => { + const trustedMarker: TriageComment = { + body: markerForIssue(ISSUE, '20260430T150000Z'), + createdAt: '2026-04-30T15:00:00Z', + author: { login: 'triage-bot' }, + }; + const spoofedMarker: TriageComment = { + body: markerForIssue(ISSUE, '20260430T160000Z'), + createdAt: '2026-04-30T16:00:00Z', + author: { login: 'attacker' }, + }; + + expect( + latestAfkMarkerCreatedAt(ISSUE, [trustedMarker, spoofedMarker], { + trustedMarkerAuthors: ['triage-bot'], + }), + ).toBe(Date.parse('2026-04-30T15:00:00Z')); + }); + + it('ignores invalid createdAt timestamps', () => { + const invalid: TriageComment = { + body: markerForIssue(ISSUE, '20260430T150000Z'), + createdAt: 'not a real date', + }; + + expect(latestAfkMarkerCreatedAt(ISSUE, [invalid], {})).toBeUndefined(); + }); +}); + +describe('latestHumanActivityAt', () => { + it('ignores comments authored by GitHub Apps (login ending in [bot])', () => { + const botComment: TriageComment = { + body: 'Bumps `vite`.', + createdAt: '2026-04-30T15:00:00Z', + author: { login: 'dependabot[bot]' }, + }; + + expect(latestHumanActivityAt(ISSUE, [botComment])).toBeUndefined(); + }); + + it('ignores AFK markers for the same issue', () => { + const localMarker: TriageComment = { + body: markerForIssue(ISSUE, '20260430T150000Z'), + createdAt: '2026-04-30T15:00:00Z', + }; + + expect(latestHumanActivityAt(ISSUE, [localMarker])).toBeUndefined(); + }); + + it('counts AFK markers for a different issue as activity', () => { + const otherIssueMarker: TriageComment = { + body: markerForIssue(OTHER_ISSUE, '20260430T150000Z'), + createdAt: '2026-04-30T15:00:00Z', + }; + + expect(latestHumanActivityAt(ISSUE, [otherIssueMarker])).toBe( + Date.parse('2026-04-30T15:00:00Z'), + ); + }); + + it('returns the latest qualifying timestamp across mixed comments', () => { + const comments: readonly TriageComment[] = [ + { + body: 'Reporter follow-up.', + createdAt: '2026-04-30T14:00:00Z', + author: { login: 'reporter' }, + }, + { + body: markerForIssue(ISSUE, '20260430T141500Z'), + createdAt: '2026-04-30T14:15:00Z', + }, + { + body: 'Bumps `vite`.', + createdAt: '2026-04-30T15:00:00Z', + author: { login: 'dependabot[bot]' }, + }, + { + body: 'Another reporter note.', + createdAt: '2026-04-30T15:30:00Z', + author: { login: 'reporter' }, + }, + ]; + + expect(latestHumanActivityAt(ISSUE, comments)).toBe( + Date.parse('2026-04-30T15:30:00Z'), + ); + }); +}); diff --git a/test/unit/sandcastle/issueNumber.test.ts b/test/unit/sandcastle/issueNumber.test.ts deleted file mode 100644 index e861451..0000000 --- a/test/unit/sandcastle/issueNumber.test.ts +++ /dev/null @@ -1,22 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { assertIssueNumber } from '../../../.sandcastle/lib/issueNumber.js'; - -describe('assertIssueNumber', () => { - it('returns a valid issue number', () => { - expect(assertIssueNumber(42)).toBe(42); - }); - - it('rejects zero and negative numbers', () => { - expect(() => assertIssueNumber(0)).toThrow(/positive integer/u); - expect(() => assertIssueNumber(-1)).toThrow(/positive integer/u); - }); - - it('rejects fractional and non-finite numbers', () => { - expect(() => assertIssueNumber(1.5)).toThrow(/positive integer/u); - expect(() => assertIssueNumber(Number.NaN)).toThrow(/positive integer/u); - expect(() => assertIssueNumber(Number.POSITIVE_INFINITY)).toThrow( - /positive integer/u, - ); - }); -}); diff --git a/test/unit/sandcastle/issueSource.test.ts b/test/unit/sandcastle/issueSource.test.ts new file mode 100644 index 0000000..61c4763 --- /dev/null +++ b/test/unit/sandcastle/issueSource.test.ts @@ -0,0 +1,197 @@ +import { describe, expect, it } from 'vitest'; + +import type { CommandResult } from '../../../.sandcastle/lib/gh.js'; +import { + ghCommentSchema, + ghIssueSchema, + listCandidateIssues, +} from '../../../.sandcastle/lib/issueSource.js'; + +describe('ghCommentSchema null-author handling', () => { + it('accepts comments with null author (ghost/deleted accounts)', () => { + expect(() => + ghCommentSchema.parse({ + body: 'comment from a ghost account', + createdAt: '2026-04-30T14:15:00Z', + author: null, + }), + ).not.toThrow(); + }); + + it('accepts comments with undefined author', () => { + expect(() => + ghCommentSchema.parse({ + body: 'comment with no author key', + createdAt: '2026-04-30T14:15:00Z', + }), + ).not.toThrow(); + }); + + it('accepts comments with a normal author object', () => { + expect(() => + ghCommentSchema.parse({ + body: 'normal comment', + createdAt: '2026-04-30T14:15:00Z', + author: { login: 'alice' }, + }), + ).not.toThrow(); + }); +}); + +describe('ghIssueSchema null-author handling', () => { + it('accepts issues with null author and a comment whose author is null', () => { + expect(() => + ghIssueSchema.parse({ + number: 42, + labels: [{ name: 'needs-triage' }], + author: null, + createdAt: '2026-04-30T12:00:00Z', + comments: [ + { + body: 'orphaned comment', + createdAt: '2026-04-30T13:00:00Z', + author: null, + }, + ], + }), + ).not.toThrow(); + }); +}); + +interface RecordedCall { + readonly args: readonly string[]; +} + +function recordingRunner(responses: Record): { + calls: RecordedCall[]; + runner: (args: readonly string[]) => CommandResult; +} { + const calls: RecordedCall[] = []; + + const runner = (args: readonly string[]): CommandResult => { + calls.push({ args }); + const labelIndex = args.indexOf('--label'); + const label = labelIndex >= 0 ? args[labelIndex + 1] : undefined; + if (label === undefined || !(label in responses)) { + throw new Error(`unexpected label in test runner: ${String(label)}`); + } + return { + stdout: JSON.stringify(responses[label]), + stderr: '', + status: 0, + }; + }; + + return { calls, runner }; +} + +describe('listCandidateIssues', () => { + it('queries only needs-triage when includeNeedsInfo is false', () => { + const { calls, runner } = recordingRunner({ + 'needs-triage': [ + { + number: 1, + labels: [{ name: 'needs-triage' }], + comments: [], + }, + ], + }); + + const issues = listCandidateIssues(false, runner); + + expect(issues).toEqual([ + { number: 1, labels: ['needs-triage'], comments: [] }, + ]); + expect(calls).toHaveLength(1); + expect(calls[0]?.args).toEqual([ + 'issue', + 'list', + '--repo', + 'coder/agent-tty', + '--label', + 'needs-triage', + '--state', + 'open', + '--limit', + '500', + '--json', + 'number,labels,comments,author,createdAt', + ]); + }); + + it('queries both labels when includeNeedsInfo is true', () => { + const { calls, runner } = recordingRunner({ + 'needs-triage': [], + 'needs-info': [], + }); + + listCandidateIssues(true, runner); + + expect( + calls.map((call) => call.args[call.args.indexOf('--label') + 1]), + ).toEqual(['needs-triage', 'needs-info']); + }); + + it('normalizes labels and comments into the orchestrator-facing shape', () => { + const { runner } = recordingRunner({ + 'needs-triage': [ + { + number: 7, + labels: [{ name: 'bug' }, { name: 'needs-triage' }], + comments: [ + { + body: 'first', + createdAt: '2026-04-30T12:00:00Z', + author: { login: 'alice' }, + }, + { + body: 'orphan', + createdAt: '2026-04-30T13:00:00Z', + author: null, + }, + ], + }, + ], + }); + + const issues = listCandidateIssues(false, runner); + + expect(issues).toEqual([ + { + number: 7, + labels: ['bug', 'needs-triage'], + comments: [ + { + body: 'first', + createdAt: '2026-04-30T12:00:00Z', + author: { login: 'alice' }, + }, + { + body: 'orphan', + createdAt: '2026-04-30T13:00:00Z', + }, + ], + }, + ]); + }); + + it('deduplicates issues that appear under both labels and preserves first-seen order', () => { + const { runner } = recordingRunner({ + 'needs-triage': [ + { number: 10, labels: [{ name: 'needs-triage' }], comments: [] }, + { number: 11, labels: [{ name: 'needs-triage' }], comments: [] }, + ], + 'needs-info': [ + // Duplicate of #10 with stale label snapshot — must be dropped. + { number: 10, labels: [{ name: 'needs-info' }], comments: [] }, + { number: 12, labels: [{ name: 'needs-info' }], comments: [] }, + ], + }); + + const issues = listCandidateIssues(true, runner); + + expect(issues.map((issue) => issue.number)).toEqual([10, 11, 12]); + // The first-seen issue (from needs-triage) wins. + expect(issues[0]?.labels).toEqual(['needs-triage']); + }); +}); diff --git a/test/unit/sandcastle/labelInvariants.test.ts b/test/unit/sandcastle/labelInvariants.test.ts deleted file mode 100644 index c843b8e..0000000 --- a/test/unit/sandcastle/labelInvariants.test.ts +++ /dev/null @@ -1,60 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { validateTriageLabelTransition } from '../../../.sandcastle/lib/labelInvariants.js'; - -describe('validateTriageLabelTransition', () => { - it('computes a state transition that leaves one category and one state', () => { - expect( - validateTriageLabelTransition({ - currentLabels: ['bug', 'needs-triage'], - nextCategory: 'bug', - nextState: 'ready-for-agent', - }), - ).toEqual({ - ok: true, - addLabels: ['ready-for-agent'], - removeLabels: ['needs-triage'], - }); - }); - - it('computes a category change', () => { - expect( - validateTriageLabelTransition({ - currentLabels: ['bug', 'needs-triage'], - nextCategory: 'enhancement', - nextState: 'ready-for-human', - }), - ).toEqual({ - ok: true, - addLabels: ['enhancement', 'ready-for-human'], - removeLabels: ['bug', 'needs-triage'], - }); - }); - - it('rejects nextCategory null without one current category', () => { - expect( - validateTriageLabelTransition({ - currentLabels: ['needs-triage'], - nextCategory: null, - nextState: 'needs-info', - }), - ).toEqual({ - ok: false, - reason: - 'exactly one current category label is required when nextCategory is null', - }); - }); - - it('rejects conflicting current state labels', () => { - const result = validateTriageLabelTransition({ - currentLabels: ['bug', 'needs-triage', 'needs-info'], - nextCategory: 'bug', - nextState: 'ready-for-agent', - }); - - expect(result.ok).toBe(false); - expect(result).toMatchObject({ - reason: 'conflicting state labels: needs-info, needs-triage', - }); - }); -}); diff --git a/test/unit/sandcastle/main.test.ts b/test/unit/sandcastle/main.test.ts index fff4625..4004f8f 100644 --- a/test/unit/sandcastle/main.test.ts +++ b/test/unit/sandcastle/main.test.ts @@ -1,11 +1,9 @@ import { CommanderError } from 'commander'; import { describe, expect, it } from 'vitest'; +import { pLimit } from '../../../.sandcastle/lib/pLimit.js'; import { buildTriageBatchSummary, - ghCommentSchema, - ghIssueSchema, - pLimit, parseRunnerArgs, } from '../../../.sandcastle/main.js'; @@ -70,57 +68,6 @@ describe('buildTriageBatchSummary', () => { }); }); -describe('ghCommentSchema null-author handling', () => { - it('accepts comments with null author (ghost/deleted accounts)', () => { - expect(() => - ghCommentSchema.parse({ - body: 'comment from a ghost account', - createdAt: '2026-04-30T14:15:00Z', - author: null, - }), - ).not.toThrow(); - }); - - it('accepts comments with undefined author', () => { - expect(() => - ghCommentSchema.parse({ - body: 'comment with no author key', - createdAt: '2026-04-30T14:15:00Z', - }), - ).not.toThrow(); - }); - - it('accepts comments with a normal author object', () => { - expect(() => - ghCommentSchema.parse({ - body: 'normal comment', - createdAt: '2026-04-30T14:15:00Z', - author: { login: 'alice' }, - }), - ).not.toThrow(); - }); -}); - -describe('ghIssueSchema null-author handling', () => { - it('accepts issues with null author and a comment whose author is null', () => { - expect(() => - ghIssueSchema.parse({ - number: 42, - labels: [{ name: 'needs-triage' }], - author: null, - createdAt: '2026-04-30T12:00:00Z', - comments: [ - { - body: 'orphaned comment', - createdAt: '2026-04-30T13:00:00Z', - author: null, - }, - ], - }), - ).not.toThrow(); - }); -}); - describe('parseRunnerArgs', () => { it('returns defaults when no flags or env are set', () => { expect(parseRunnerArgs([], {})).toEqual({ diff --git a/test/unit/sandcastle/workspaceName.test.ts b/test/unit/sandcastle/workspaceName.test.ts deleted file mode 100644 index 81b61ba..0000000 --- a/test/unit/sandcastle/workspaceName.test.ts +++ /dev/null @@ -1,13 +0,0 @@ -import { describe, expect, it } from 'vitest'; - -import { workspaceNameForIssue } from '../../../.sandcastle/lib/workspaceName.js'; - -describe('workspaceNameForIssue', () => { - it('builds the Coder workspace name for an issue', () => { - expect(workspaceNameForIssue(123)).toBe('agent-tty-triage-123'); - }); - - it('rejects an invalid issue number', () => { - expect(() => workspaceNameForIssue(0)).toThrow(/positive integer/u); - }); -});