🤖 feat: AI-generated sidebar status via small-model loop #3238

Merged
ammario merged 33 commits into main from sidebar-t7ry
May 9, 2026

Conversation


@ammar-agent commented May 6, 2026

Summary

Adds an AgentStatusService that produces the sidebar agent-status text using the same "small model" path as workspace title generation (NAME_GEN_PREFERRED_MODELS — Claude Haiku → GPT-mini), instead of inferring it from the workspace's TODO list. The status is regenerated on a focus-aware cadence, skips when the trailing transcript hash is unchanged, and is fed at most ~8k trailing transcript tokens.

Background

Today the sidebar's per-workspace status string is derived from deriveTodoStatus, which picks the first in-progress / pending / last-completed TODO. That's a noisy proxy for "what is the agent doing right now":

  • It only updates when the agent invokes todo_write, so a long bash run or a multi-step assistant turn shows a stale TODO.
  • It can be empty for entire conversations that don't use TODO-style planning.
  • It can't summarize free-form work like "investigating crash" or "running tests".

The user asked for the status to come from a small LLM call instead, with sensible cost controls (token cap, cadence, dedup, focus gating).

Implementation

Constants (src/constants/agentStatus.ts)

  • Focused cadence 30s, unfocused cadence 120s, scheduler tick 10s, startup delay 30s.
  • Trailing-window budget: ≤ 80 messages, ≤ 8000 tokens, with a per-message char cap of 4000.
  • Concurrency cap of 1 global generation in flight.
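
The values above could be sketched as a constants module. The identifier names are assumptions (only AGENT_STATUS_MAX_CONCURRENT and AGENT_STATUS_STARTUP_DELAY_MS are quoted elsewhere in this thread); the numbers come straight from the bullets:

```typescript
// Hypothetical sketch of src/constants/agentStatus.ts; names are illustrative.
const AGENT_STATUS_FOCUSED_INTERVAL_MS = 30_000;    // regenerate every 30s when focused
const AGENT_STATUS_UNFOCUSED_INTERVAL_MS = 120_000; // every 120s when unfocused
const AGENT_STATUS_TICK_INTERVAL_MS = 10_000;       // scheduler tick
const AGENT_STATUS_STARTUP_DELAY_MS = 30_000;       // delay before the first generation
const AGENT_STATUS_MAX_MESSAGES = 80;               // trailing-window message cap
const AGENT_STATUS_MAX_TOKENS = 8_000;              // trailing-window token budget
const AGENT_STATUS_MAX_CHARS_PER_MESSAGE = 4_000;   // per-message char cap
const AGENT_STATUS_MAX_CONCURRENT = 1;              // global in-flight generations
```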

Small-model path (src/node/services/workspaceStatusGenerator.ts)

  • Mirrors workspaceTitleGenerator.ts. streamText + a new propose_status tool (added to toolDefinitions.ts); reuses mapNameGenerationError / mapModelCreationError so all the existing 401/402/403/429/5xx classifications apply.
  • Prompt explicitly delimits the transcript with <transcript> markers and instructs the model to focus on current activity rather than overall task scope.
  • propose_status is excluded from default tools.ts registration with the same comment as propose_name.

Scheduler (src/node/services/agentStatusService.ts)

  • Single setInterval ticking every 10s. Per-workspace lastRanAt + a current focused/unfocused interval gate the actual generation.
  • For each eligible workspace: read getLastMessages(80) + readPartial → format → trim oldest entries until under the token budget (using TokenizerService with a fallback model constant), then SHA-256 the transcript.
  • If the hash matches the last successful generation, skip — exactly the "frozen chat → no new status" behavior the user asked for. The dedup is in-memory; on app relaunch every workspace generates once and then dedups normally.
  • Eligible workspaces are sorted by lastRanAt ascending so MAX_CONCURRENT=1 doesn't starve workspaces deeper in the list (round-robin without an explicit queue).
  • runTick awaits all dispatched workspaces (via tracked inFlightPromises) instead of fire-and-forget, so production schedulers and tests can observe completion deterministically.
  • Two if (this.stopped) return guards (post-generation, post-persist) so a stop() mid-flight never leaks writes past the service's lifecycle.
  • Persists before updating the in-memory dedup hash so a transient I/O failure leaves the next tick free to retry the same transcript.

Focus tracking (src/node/services/windowService.ts)

  • WindowService.isFocused() delegates to BrowserWindow.isFocused() directly. Falls back to true in non-desktop contexts (CLI server, tests) so backend services don't throttle themselves to "unfocused" cadence when there is no window.
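
The fallback can be sketched in a few lines; `FocusableWindow` stands in for Electron's BrowserWindow here so the snippet is self-contained:

```typescript
// Minimal sketch of WindowService.isFocused(); the interface is a stand-in
// for Electron's BrowserWindow, which exposes isFocused().
interface FocusableWindow {
  isFocused(): boolean;
}

function isFocused(win: FocusableWindow | undefined): boolean {
  // No window (CLI server, tests): default to "focused" so backend services
  // don't throttle themselves to the unfocused cadence.
  return win?.isFocused() ?? true;
}
```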

Schema + persistence

  • New aiStatus field on ExtensionMetadata + WorkspaceActivitySnapshotSchema.
  • AgentStatusService calls extensionMetadata.setAiStatus directly (bypassing the historical emitWorkspaceActivityUpdate wrapper that would have swallowed disk errors) and emits the activity snapshot via workspaceService.emitWorkspaceActivity only after the disk write resolves successfully. ExtensionMetadataService.save() re-throws on failure to support this; initialize() keeps its startup-safety semantics by wrapping its mutation calls in a logged try/catch per AGENTS.md.

Frontend (src/browser/stores/WorkspaceStore.ts)

  • New precedence: transientStatus ?? aiStatus ?? todoStatus. transientStatus collapses displayStatus (inactive workspaces) and aggregator.getAgentStatus() (active workspaces) so explicit system-set messages always beat aiStatus regardless of which branch produces them.
  • applyWorkspaceActivitySnapshot diffs aiStatus so a fresh small-model status triggers re-render.
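
The precedence chain can be sketched as a pure function; field names follow the PR text, the surrounding store shape is illustrative:

```typescript
// Sketch of the sidebar status precedence described above.
interface StatusInputs {
  displayStatus?: string;    // explicit system message, inactive workspaces
  aggregatorStatus?: string; // aggregator.getAgentStatus(), active workspaces
  aiStatus?: string;         // small-model-generated status
  todoStatus?: string;       // derived from the TODO list
}

function sidebarStatus(s: StatusInputs): string | undefined {
  // transientStatus collapses both system-set branches, so an explicit
  // system message always beats the AI summary; todoStatus is the fallback.
  const transientStatus = s.displayStatus ?? s.aggregatorStatus;
  return transientStatus ?? s.aiStatus ?? s.todoStatus;
}
```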

Validation

  • New tests
    • workspaceStatusGenerator.test.ts: prompt structure (transcript delimiters, empty-transcript sentinel) + empty-candidates short-circuit.
    • agentStatusService.test.ts: generates on first tick, dedups when transcript is unchanged, regenerates after history changes, includes the partial assistant message mid-stream, skips empty workspaces, focused-vs-unfocused cadence boundaries, round-robin fairness across multiple workspaces, archived workspaces are skipped, stops cleanly mid-generation, persist failures don't advance the dedup hash.
    • WorkspaceStore.test.ts: aiStatus beats todoStatus; displayStatus still beats aiStatus.
  • make static-check clean (eslint, typecheck, prettier, doc-link check).
  • make test shows only pre-existing flaky-timing failures present on main (CLI auth + GitStatusStore subset). My new files pass cleanly.

Risks

  • The new service makes provider calls that didn't exist before. Mitigations: 30s startup delay, 30s/120s focus-gated cadence, transcript-hash dedup, ≤1 concurrent generation, mapNameGenerationError retry budget capped at 3 candidates.
  • Sidebar precedence change is the only user-visible behavior change. If aiStatus generation fails or hasn't completed yet, behavior is identical to today (todoStatus fallback). Explicit system-set transient status (heartbeat / idle-compaction / background-turn) still beats aiStatus.
Codex review history (resolved)
  1. Partial stream output: AI status was hashing only committed chat.jsonl. Fixed by appending readPartial so the hash refreshes mid-stream.
  2. Fair scheduling: with MAX_CONCURRENT=1 and fixed iteration order, deep workspaces would starve. Fixed with lastRanAt ascending sort.
  3. Hash-before-persist: updateAiStatus went through a wrapper that silently swallowed disk errors. Fixed by calling setAiStatus directly and updating the hash only on success.
  4. save() swallowing errors: ExtensionMetadataService.save() itself swallowed errors. Fixed by re-throwing; initialize() wraps its calls in try/catch to preserve startup-safety.
  5. Active-workspace transient precedence: aggregator.getAgentStatus() was being buried under aiStatus for active workspaces. Fixed with the transientStatus tier.
  6. Lifecycle stop guard: runForWorkspace could persist/emit after stop(). Fixed with two this.stopped checks at await boundaries.
Simplification refactor

After the original implementation landed and Codex approved, the diff was simplified for minimal LoC / maintenance burden (8 files, +232 / -670 lines):

  • Dropped cross-restart hash dedup. After the original Codex approval the disk-stored aiStatusInputHash round-trip turned out to add ~50 LoC of complexity (hydration sweep, getter method, conditional-spread coercion) for ~one extra LLM call per workspace per app launch. Removed the field from the type, the coercion, the setter signature, and getAiStatusInputHash entirely; in-session dedup still works.
  • WindowService: dropped EventEmitter inheritance, setFocused, focused field, focus/blur listeners, and the test-stub defensive check. isFocused() delegates to BrowserWindow.isFocused() directly.
  • AgentStatusService: dropped __test__ exports, inFlightCount (use inFlightPromises.size), markInFlight helper, drainInFlight helper, hashesHydrated + hydratePersistedHashes, resolveTokenizerModel. Merged tick() + runTick(). Used FALLBACK_TOKENIZER_MODEL constant in buildTrailingTranscript so we don't double-call getWorkspaceTitleModelCandidates per workspace.
  • Trimmed verbose multi-paragraph comments to one-line summaries throughout.

Generated with mux • Model: anthropic:claude-opus-4-7 • Thinking: max • Cost: $58.06


mintlify Bot commented May 6, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project: Mux · Status: 🟢 Ready · Updated (UTC): May 6, 2026, 12:30 AM


@ammar-agent (Collaborator, Author)

@codex review

Please review this PR. Key areas:

  • The new AgentStatusService (cadence/dedup correctness, focus gating)
  • workspaceStatusGenerator (prompt structure, error mapping reuse)
  • WorkspaceStore precedence change (displayStatus > aiStatus > todoStatus > fallback)
  • Schema additions to WorkspaceActivitySnapshot + ExtensionMetadata


@chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f1c982f6c7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Addressed the partial.json feedback: buildTrailingTranscript now appends historyService.readPartial(workspaceId) after the committed tail, so the input hash changes and the small-model status refreshes mid-stream. Added a unit test that stages a partial assistant message and asserts the new transcript + new hash.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: b52db3d1d4

Comment thread src/node/services/agentStatusService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Addressed the fairness feedback: processEligibleWorkspaces now collects every eligible workspace, sorts by lastRanAt ascending, and dispatches up to AGENT_STATUS_MAX_CONCURRENT per tick. Workspaces that have never run sort with lastRanAt=0 so new workspaces preempt previously-run ones. Added a 3-workspace round-robin unit test that asserts every workspace gets a turn before any repeats.
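
The fairness fix described here can be sketched as a sort-then-slice; names are illustrative:

```typescript
// Sketch of the round-robin dispatch: eligible workspaces sort by lastRanAt
// ascending, and never-run workspaces carry lastRanAt = 0 so they go first.
interface Eligible {
  workspaceId: string;
  lastRanAt: number; // epoch ms; 0 when the workspace has never run
}

function pickNext(eligible: Eligible[], maxConcurrent: number): Eligible[] {
  return [...eligible]
    .sort((a, b) => a.lastRanAt - b.lastRanAt)
    .slice(0, maxConcurrent);
}
```

With maxConcurrent = 1, each tick dispatches the least-recently-run workspace, so deep workspaces cannot starve.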

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 265f907bf5

Comment thread src/node/services/agentStatusService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Fixed the persist-vs-hash ordering. agentStatusService now calls extensionMetadata.setAiStatus directly (which throws on failure) and only updates state.lastInputHash after the write resolves, then emits the activity snapshot via the new public workspaceService.emitWorkspaceActivity. The legacy updateAiStatus wrapper was removed since this was its only caller. Added a regression test that injects a disk-write failure and asserts the next tick retries against the same transcript.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 3d2ac73d1c

Comment thread src/node/services/ExtensionMetadataService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Made ExtensionMetadataService.save() rethrow after logging. All historical callers wrap their setX / updateX calls in emitWorkspaceActivityUpdate, which still downgrades the throw to a logged warning, so their semantics are unchanged. AgentStatusService invokes setAiStatus directly and now sees real disk-write failures, skipping the lastInputHash advance so the next tick retries.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 3b118b76b2

Comment thread src/node/services/ExtensionMetadataService.ts
Comment thread src/browser/stores/WorkspaceStore.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Two follow-ups:

  1. ExtensionMetadataService.initialize() now wraps both the dir-create and clearStaleStreaming() in try/catch + log, so a transient disk failure at startup doesn't crash the app. save() keeps its strict throw semantics for AgentStatusService.

  2. Reworked sidebar precedence: displayStatus and fallbackAgentStatus are collapsed into a single transientStatus tier (displayStatus ?? fallbackAgentStatus). The final precedence is transientStatus ?? aiStatus ?? todoStatus, so explicit system-set messages — whether they arrived via the activity snapshot for inactive workspaces or via aggregator.getAgentStatus() for active/caught-up workspaces — always outrank an AI summary.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 2264113ab8

Comment thread src/node/services/agentStatusService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Addressed the lifecycle hazard: runForWorkspace now checks this.stopped both after generateWorkspaceStatus returns and after the disk write resolves, returning early in both cases so a generation that's still in flight when stop() is called doesn't persist or emit. Added a regression test that pauses generation, calls stop(), then releases generation and asserts no setAiStatus / activity emit fires.

@chatgpt-codex-connector Bot

Codex Review: Didn't find any major issues. Breezy!

@ammar-agent (Collaborator, Author)

@codex review

Refactor pass for simplicity / minimal LoC / maintenance burden. Net delta: -438 lines.

  • Dropped cross-restart hash dedup (the disk-stored aiStatusInputHash round-trip). In-session dedup is unchanged; on app relaunch every workspace generates once and then dedups normally.
  • Dropped EventEmitter inheritance from WindowService; isFocused() now delegates to BrowserWindow.isFocused() directly.
  • Removed several internal helpers in AgentStatusService (__test__, inFlightCount, markInFlight, drainInFlight, hashesHydrated, hydratePersistedHashes, resolveTokenizerModel); merged tick() + runTick(); eliminated the duplicate getWorkspaceTitleModelCandidates call inside buildTrailingTranscript by using a fallback tokenizer model constant.
  • Trimmed verbose comments / tests to focus on intent rather than narrating the obvious.

All previously-resolved review concerns (round-robin fairness, persist-then-hash, partial-stream inclusion, lifecycle stop guard, transient status precedence) are preserved and still tested.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 5a67d08ddf

Comment thread src/node/services/ExtensionMetadataService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Addressed P1: setAiStatus now preserves the existing entry's recency and seeds recency=0 for the rare brand-new entry case (workspace has chat but no metadata yet). Pinned the contract with a direct test against ExtensionMetadataService.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 5325acf7a2

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Addressed P2: added a this.stopped check immediately before generateWorkspaceStatus, so a stop() during the earlier awaits (transcript build, candidates fetch) skips the provider call entirely. New test pins the contract.

@chatgpt-codex-connector Bot

Codex Review: Didn't find any major issues. More of your lovely PRs please.

@ammar-agent (Collaborator, Author)

@codex review

Addressed three review threads:

  1. Hide bespoke tools from docs — added internal: true to propose_name and propose_status in TOOL_DEFINITIONS; scripts/gen_docs.ts now filters internal tools out of the hook env-var block. The auto-generated docs/hooks/tools.mdx and builtInSkillContent.generated.ts no longer expose env-var entries for these tools.
  2. Collapse aiStatus into todoStatus — the frontend never distinguished between them, so the AI-generated payload now writes to the same todoStatus slot. Dropped the schema field, the ExtensionMetadata field, and the frontend precedence tier. Renamed setAiStatus → setSidebarStatus to reflect the unified role.
  3. Drop AGENT_STATUS_STARTUP_DELAY_MS — with MAX_CONCURRENT=1 the scheduler already serializes generation per tick, so a separate startup delay is unnecessary. Dropped the constant, startupTimeout, the begin() helper, and the startupDelayMs option.

Net diff: +83 / -163. All previously-resolved Codex concerns (recency preservation, pre-generator stop guard, partial-stream inclusion, round-robin fairness, persist-then-hash, lifecycle stop guard) are preserved.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 9b9aafe40b

Comment thread src/browser/stores/WorkspaceStore.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Addressed P2: inverted the precedence in the active-workspace branch so activity?.todoStatus (which is now the canonical "last write wins" slot for both AI and todo-derivation) wins over deriveTodoStatus(aggregatorTodos). The live aggregator derivation is now a fallback only when the snapshot has no entry yet.

Added a regression test that gives the workspace BOTH live aggregator todos AND a persisted activity todoStatus, asserting the persisted one wins.

@chatgpt-codex-connector Bot

Codex Review: Didn't find any major issues. You're on a roll.

@ammar-agent (Collaborator, Author)

@codex review

Bug fix prompted by deployed behavior: most workspaces were showing the literal phrase "Awaiting next task" in the sidebar. Root cause was the small-model prompt itself, which both:

  1. Framed the task as "what is the agent doing right now" — pushing the model to bucket every paused chat as "idle/finished".
  2. Handed the model the literal example "Awaiting next task" for that idle/finished case — so it just copied the example verbatim.

Reframed the prompt around "most recent activity", dropped the special idle branch, and explicitly forbade generic placeholders so the model names the concrete activity from the transcript instead.

@chatgpt-codex-connector Bot

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: fc33fa5cbe

Comment thread src/node/services/workspaceStatusGenerator.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Fixed the empty-transcript starvation path: non-racy empty transcripts now consume observed recency so they do not keep sorting ahead of populated workspaces under max-concurrency=1, while the recent first-recency guard still protects the sendMessage recency/history race.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 40d38dd213

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Fixed the pre-provider failure starvation path: config/auth failures now consume the observed recency priority while leaving lastInputHash unset, so the same transcript remains retryable on cadence without monopolizing every scheduler tick.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: bb21f74090

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Addressed the pre-provider/history race by tracking the last transcript hash the scheduler examined separately from the settled dedup hash. Recent recency bumps now wait if history still shows the same transcript, including pre-provider retry states, while older/config-failure recency is still consumed to avoid starvation.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 37a9dc9a72

Comment thread src/node/services/agentStatusService.ts Outdated
@ammar-agent (Collaborator, Author)

@codex review

Fixed stale dedup recency priority: once the recent history-catchup guard has expired, unchanged transcript dedup skips now consume observed recency so the workspace stops bypassing cadence and cannot starve other workspaces.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: be1bf725a0

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Fixed startup behavior too: AgentStatusService.start() now kicks off one immediate runTick in addition to scheduling future interval ticks, matching the no-startup-delay comment while preserving max-concurrency=1.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 07e2ef0c88

Comment thread src/node/services/agentStatusService.ts
@ammar-agent (Collaborator, Author)

@codex review

Fixed in-flight stale writes: after generation returns, AgentStatusService now reloads the workspace snapshot and drops the result if recency advanced beyond the value observed at dispatch, so a slow provider response cannot persist/emit a pre-pivot status after a newer user turn.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c8f8d35d80

Comment thread src/node/services/agentStatusService.ts Outdated
@ammar-agent
Collaborator Author

@codex review

Fixed the atomicity issue: ExtensionMetadataService.setSidebarStatus now accepts a serialized-mutation guard (skipIfRecencyAdvancedSince) and performs the recency comparison immediately before the status write in the same mutation queue. AgentStatusService uses that guard instead of a separate pre-write getSnapshot check.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Swish!

@ammario ammario merged commit 64b36e7 into main May 9, 2026
24 checks passed
@ammario ammario deleted the sidebar-t7ry branch May 9, 2026 23:59
mux-bot Bot pushed a commit that referenced this pull request May 10, 2026
The post-provider-failure / placeholder-rejection / successful-persist
branches in runForWorkspace each ran the same two-line pair:
markRecencyObserved() to consume any observed recency bump, then
state.lastInputHash = inputHash to advance the dedup hash so the next
tick won't regenerate against the same transcript.

Hoist that pair into a settleOnTranscript() closure (next to the
existing markRecencyObserved closure) so the three settlement branches
read as a single named intent. The pre-provider-failure and
empty/dedup-hit branches keep using bare markRecencyObserved() because
they should still retry against the same transcript when conditions
change — captured in a comment on the new helper so a future reader
doesn't reintroduce the duplicated pair.

Pure refactor — emitted JS, all six callsite branches, and the existing
25-test agentStatusService suite (including the
post-provider/placeholder/success/pre-provider/pre-provider-retry/dedup
regression cases added in #3238) are all unchanged.

🤖 Generated with mux

Co-Authored-By: Mux <noreply@coder.com>
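The settlement-vs-retry split described above can be sketched as follows. This is a minimal illustration, not the real `AgentStatusService`: `TickState`, the outcome labels, and the dispatch shape are assumptions; only the `markRecencyObserved` / `settleOnTranscript` pairing mirrors the commit message.

```typescript
// Illustrative state shape: which transcript hash was last settled, and
// whether a recency bump is waiting to be consumed.
type TickState = { recencyObserved: boolean; lastInputHash: string | null };

type Outcome = "success" | "provider-failure" | "dedup-hit";

function runForWorkspace(state: TickState, inputHash: string, outcome: Outcome): void {
  const markRecencyObserved = () => {
    state.recencyObserved = false;
  };
  // Settlement branches advance the dedup hash so the next tick skips an
  // unchanged transcript; retry branches deliberately leave it alone so
  // the same transcript is retried when conditions change.
  const settleOnTranscript = () => {
    markRecencyObserved();
    state.lastInputHash = inputHash;
  };

  switch (outcome) {
    case "success":
    case "provider-failure":
      settleOnTranscript();
      break;
    case "dedup-hit":
      markRecencyObserved(); // retry this transcript on a later tick
      break;
  }
}
```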
@mux-bot mux-bot Bot mentioned this pull request May 10, 2026
mux-bot Bot pushed a commit that referenced this pull request May 10, 2026
mux-bot Bot pushed a commit that referenced this pull request May 11, 2026
mux-bot Bot pushed a commit that referenced this pull request May 12, 2026
mux-bot Bot pushed a commit that referenced this pull request May 12, 2026
mux-bot Bot pushed a commit that referenced this pull request May 12, 2026
Virtual0ps pushed a commit to Virtual0ps/mux that referenced this pull request May 15, 2026
## Summary

Long-lived auto-cleanup PR that accumulates low-risk,
behavior-preserving refactors picked from recent `main` commits.

## Changes

### Collapse `getCommandGhostHint` experiment gates into a table

The slash-command goal feature added in coder#3235 introduced a third
experiment-gating block to `getCommandGhostHint`
(`src/browser/utils/slashCommands/registry.ts`), structurally identical
to the existing `heartbeat` block: each matched a specific `commandKey`,
wrapped `isExperimentEnabled(<flag>)` in a `try/catch`, and returned
`null` on either disabled-experiment or thrown-check. The two blocks
already drifted — the goal branch's `catch` had no comment, the
heartbeat branch's had one explaining that hiding on a thrown check is
the safer default — and a fourth gated command would have copy-pasted
yet another nine-line `if (commandKey === "...") { try { ... } catch {
... } }` block.

Hoist the (commandKey → ExperimentId) mapping into a
`COMMAND_GHOST_HINT_EXPERIMENT_GATES` table and collapse both
`if`-blocks into a single lookup + `try/catch`. Non-gated commands
(`/compact`, `/model`, etc.) still skip the gate check identically
because the lookup returns `undefined`, so the surrounding
`gateExperimentId != null` guard fails exactly the way the absent `if
(commandKey === ...)` branches did. Adding a fourth gated command
becomes a one-line table entry, with the safer "hide on thrown check"
comment captured once on the shared `catch` instead of one branch's
`catch` (and the other's accidentally missing it).

Pure refactor — emitted JS, the gate semantics for `/goal` and
`/heartbeat` (disabled experiment hides the ghost-hint; thrown
experiment check also hides it, per the heartbeat branch's pre-existing
comment), the fall-through
`SLASH_COMMAND_DEFINITION_MAP.get(commandKey)?.inputHint ?? null` lookup
for non-gated commands, and the existing 10-test `ghostHint` suite (plus
the 136-test full `slashCommands` suite) are all unchanged.
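The table-driven gate can be sketched like this. It is a hedged illustration, not the real registry: the `ExperimentId` values, command keys, hint source, and `isExperimentEnabled` signature are all stand-ins.

```typescript
// Illustrative experiment IDs and gate table; the real registry's
// identifiers differ.
type ExperimentId = "heartbeat" | "goal";

const COMMAND_GHOST_HINT_EXPERIMENT_GATES: Record<string, ExperimentId | undefined> = {
  "/heartbeat": "heartbeat",
  "/goal": "goal",
};

function getCommandGhostHint(
  commandKey: string,
  isExperimentEnabled: (id: ExperimentId) => boolean,
  hints: Map<string, string>
): string | null {
  const gateExperimentId = COMMAND_GHOST_HINT_EXPERIMENT_GATES[commandKey];
  if (gateExperimentId != null) {
    try {
      if (!isExperimentEnabled(gateExperimentId)) return null;
    } catch {
      // Hiding the hint when the experiment check throws is the safer default.
      return null;
    }
  }
  // Non-gated commands fall straight through to the hint lookup.
  return hints.get(commandKey) ?? null;
}
```

A fourth gated command becomes a one-line table entry; the shared `catch` carries the rationale once.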

<details>
<summary>Previous cleanups</summary>

### Drop dead `?? []` fallback for `sanitizeSchema.tagNames`

The JSX-tag preservation plugin added in coder#3256 built an
`ALLOWED_RAW_HTML_TAG_NAMES` Set by lowercasing
`sanitizeSchema.tagNames`, guarded with a `(sanitizeSchema.tagNames ??
[]).map(...)` fallback. `sanitizeSchema` is constructed locally in the
same module via `{ ...defaultSchema, tagNames: [...] }` with a concrete
array literal — after the spread/override, `tagNames` is unconditionally
a non-empty `string[]`, so the outer `?? []` fallback can never fire and
only serves to obscure that invariant from future readers.

Drop the outer fallback and add a comment capturing the rationale (the
inner `...(defaultSchema.tagNames ?? [])` spread still guards the
library type, which is genuinely optional). A future refactor that
touches the schema construction now has to spell out any change that
could legitimately make `tagNames` undefined, instead of inheriting the
dead guard by copy-paste.

Pure refactor — emitted JS, the `rehypePreserveUnknownRawHtml` plugin's
behavior (parsed via the same lowercased Set), the unknown-tag-as-text
preservation semantics added in coder#3256, and the existing
`MarkdownRenderer.raw-html` test suite are all unchanged.

### Extract `readInlineHeightPx` helper in `useAutoResizeTextarea`

The chat input auto-resize stabilization in coder#3263 added a second
`Number.parseFloat(el.style.height)` + `Number.isFinite(...)` pair to
verify that the cached pixel height still reflects the textarea's inline
style, sitting a handful of lines from the existing pair in the
`canOnlyGrow` first-render fallback. Both spell out the same intent —
read the textarea's inline `height` style as a finite px number, or
treat it as missing — but as two independent expressions they were one
stray edit away from drifting on what counts as a usable inline height
(e.g. how to handle an empty string, a non-px unit, or the `NaN` that
`parseFloat` returns for either).

Extract a private `readInlineHeightPx(el): number | null` helper that
captures the parseFloat + isFinite pair in one place, with the rationale
captured on the helper itself so a future caller (e.g. another
resize-coordination path that also needs to read the inline height)
doesn't reintroduce the duplicated check. The two call sites collapse to
a single named call, and the surviving branches replace
`Number.isFinite(appliedHeight)` /
`Number.isFinite(currentInlineHeight)` with `!== null` checks
(functionally identical because the helper already filters non-finite
values, plus TypeScript narrows `number | null` cleanly without a
separate type guard).

Pure refactor — emitted writes, the `canOnlyGrow` grow-only semantics,
the post-resize "restore the px height after `auto` writes or external
clears" verification added in coder#3263, and the existing 5-test
`useAutoResizeTextarea` suite (including the regression cases for capped
large-text deletion and external inline-height clears added in coder#3263)
are all unchanged.
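The extracted helper is small enough to sketch in full; here the element is reduced to the one field the helper reads, so this is an illustration of the parseFloat + isFinite pairing rather than the hook's actual code.

```typescript
// Read a textarea's inline `height` style as a finite px number, or null
// if it is missing/unusable. parseFloat returns NaN for "" and "auto",
// and isFinite filters NaN out, so callers can use a plain `!== null`
// check instead of repeating Number.isFinite at every site.
function readInlineHeightPx(el: { style: { height: string } }): number | null {
  const parsed = Number.parseFloat(el.style.height);
  return Number.isFinite(parsed) ? parsed : null;
}
```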

### Extract `settleOnTranscript` helper in `agentStatusService`

The new `AgentStatusService.runForWorkspace` (added by the AI-generated
sidebar status loop in coder#3238) had the same two-line pair —
`markRecencyObserved()` to consume any observed recency bump, then
`state.lastInputHash = inputHash` to advance the dedup hash — repeated
across the three branches that produce a definitive outcome for the
current transcript: post-provider failure (model refused tool / rate
limit / persistent provider error), placeholder rejection
(defense-in-depth filter for generic "Awaiting next task" output), and
successful persist. The other three branches (empty workspace,
idle/frozen dedup hit, pre-provider auth/config failure) intentionally
call only `markRecencyObserved()` because they should still retry
against the same transcript when conditions change.

Hoist the duplicated pair into a `settleOnTranscript()` closure declared
next to the existing `markRecencyObserved` closure inside
`runForWorkspace`, with a comment capturing why the three settlement
branches advance the dedup hash and the three retry branches don't, so a
future reader doesn't reintroduce the duplicated pair (or accidentally
swap one for the other). The three settlement call sites collapse to a
single named call, and the three retry call sites keep their bare
`markRecencyObserved()` shape.

Pure refactor — emitted JS, the six branch behaviors (settle vs retry),
the dedup-hash semantics documented on the `lastInputHash` field, and
the existing 25-test `agentStatusService` suite (including the
post-provider / placeholder / success / pre-provider /
pre-provider-retry / dedup regression cases from coder#3238) are all
unchanged.

### Collapse `ReviewPanel` selection-validity branches

`ReviewPanel.tsx`'s post-filter selection-validity effect (touched by
the immersive sidebar fix in coder#3249) split into two structurally
identical arms: an `isImmersive` branch that validated `selectedHunkId`
against `hunks` (the unfiltered diff) and a non-immersive branch that
validated against `filteredHunks`. Both arms then ran the same
`setSelectedHunkId(filteredHunks[0].id)` reset on miss, with the
immersive arm needing its own early `return` to avoid double-running the
reset. The duplication invited drift between the two modes — e.g. a
future hide-read-style filter that adjusts the immersive reset would
have to keep both arms in lockstep by hand.

Pick the validity list up front (`const validityList = isImmersive ?
hunks : filteredHunks;`) and run a single `selectedHunkId &&
validityList.some(...)` check, with a short comment capturing the
rationale so a future reader doesn't reintroduce the split. The
early-return on `filteredHunks.length === 0`, the dependency list
(`[filteredHunks, hunks, isImmersive, selectedHunkId,
setSelectedHunkId]`), and the immersive-aware "only reset when the hunk
has truly disappeared from the diff" semantics added in coder#3249 are all
preserved.

Pure refactor — emitted JS, the selection-reset trigger conditions, and
the existing 5-test `ImmersiveReviewView` suite (including the two
immersive regression tests added in coder#3249) are all unchanged.

### Extract `renameAliasField` helper for bash tool preprocess

The bash tool's `z.preprocess` shim (in
`src/common/utils/tools/toolDefinitions.ts`) normalizes quirky model
emissions to canonical field names. After the DeepSeek v4 fix in coder#3247
added a second `description` → `display_name` rename block (mirroring
the existing `command` → `script` rename), the two blocks were
structurally identical: skip when canonical is already a string, drop
the alias via destructure, re-spread with the canonical name. Each alias
still required its own `as Record<string, unknown> & { <alias>: string
}` cast plus the same three-line destructure/spread.

Extract a private `renameAliasField(obj, alias, canonical)` helper that
captures the rename pattern in one place, with the rationale for why
aliases exist (and why they stay undocumented) noted on the helper. The
call site collapses to two named lines that read as the intent (`rename
command to script`, `rename description to display_name`) instead of
fifteen lines of mostly-identical destructure/spread. A future alias
(e.g. another quirky-model field) becomes a one-line addition.

Pure refactor — emitted JS, the canonical-field-wins precedence, the
"no-op when neither alias nor canonical is a string" branches, and the
existing 36-test `toolDefinitions` suite (including the four
`command`/`description` alias precedence cases added in coder#3247) are all
unchanged.
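The rename pattern is compact enough to show directly; the helper body below is illustrative, with the canonical-field-wins precedence taken from the description above.

```typescript
// Normalize a quirky-model alias field to its canonical name. The
// canonical field wins: rename only when the alias is a string and the
// canonical slot isn't already populated with one.
function renameAliasField(
  obj: Record<string, unknown>,
  alias: string,
  canonical: string
): Record<string, unknown> {
  if (typeof obj[canonical] === "string" || typeof obj[alias] !== "string") return obj;
  const { [alias]: aliasValue, ...rest } = obj;
  return { ...rest, [canonical]: aliasValue };
}
```

A call site then reads as intent: `renameAliasField(input, "command", "script")`.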

### Derive `TokenRecord` from `BrowserBridgeTokenPayload`

`BrowserBridgeTokenManager` (in
`src/node/services/browser/BrowserBridgeTokenManager.ts`, touched by the
new other-workspace session picker in coder#3243) declared two structurally
identical interfaces side by side: a private `TokenRecord` (the stored
token state) and an exported `BrowserBridgeTokenPayload` (the
`validate()` return shape), with `TokenRecord` differing only by the
extra `expiresAtMs: number` deadline. `validate()` then rebuilt the
payload by listing each of the four shared fields a third time. The
coder#3243 commit made this drift surface especially visible — adding
`allowOtherWorkspaceSession` required parallel edits in `TokenRecord`,
`BrowserBridgeTokenPayload`, the `mint()` insert, and the `validate()`
rebuild.

Collapse `TokenRecord` to `extends BrowserBridgeTokenPayload` (with the
rationale captured in a short comment so a future reader doesn't
reintroduce the duplicated field list) and rewrite the `validate()`
rebuild as a rest-spread destructure (`const { expiresAtMs, ...payload }
= record; return payload;`). The eslint config already enables
`ignoreRestSiblings: true`, so the unused `expiresAtMs` binding doesn't
trigger `no-unused-vars`. A future payload field (e.g. another scoping
flag) now lives in exactly one place — the exported
`BrowserBridgeTokenPayload` interface — and automatically flows through
both the stored record shape and the validate-time rebuild.

Pure refactor — emitted JS, the TTL semantics, the `null`-on-expired
path, and the existing 5-test `BrowserBridgeTokenManager` suite
(including the new "preserves explicit other-workspace session scope"
case) are all unchanged.
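The derive-don't-duplicate shape can be sketched as follows. Field names beyond the ones spelled out above (`workspaceId`, the stored-record wiring) are assumptions for illustration.

```typescript
// Exported validate() return shape — the single source of truth for
// payload fields.
interface BrowserBridgeTokenPayload {
  workspaceId: string;
  allowOtherWorkspaceSession: boolean;
}

// Stored record adds only the storage-side deadline; every payload field
// flows through the extends automatically.
interface TokenRecord extends BrowserBridgeTokenPayload {
  expiresAtMs: number;
}

function validate(record: TokenRecord, nowMs: number): BrowserBridgeTokenPayload | null {
  if (nowMs >= record.expiresAtMs) return null;
  // Rest-spread strips the deadline and returns exactly the payload shape.
  // (With eslint's ignoreRestSiblings, the unused binding is allowed.)
  const { expiresAtMs, ...payload } = record;
  return payload;
}
```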

### Extract `detachLanguageModelCleanup` helper in
`languageModelCleanup`

`moveLanguageModelCleanup` and `runLanguageModelCleanup` (in the new
`src/node/services/languageModelCleanup.ts` from the OpenAI WebSocket
transport opt-in PR coder#3241) both implemented the same "look up the
attached cleanup, delete the symbol slot, return it" sequence inline.
The duplication is structurally identical — same
`LanguageModelWithCleanup` cast, same symbol read, same `delete` — so
both call sites were one stray edit away from drifting on whether the
slot gets cleared before or after the cleanup runs (which matters
because `attachLanguageModelCleanup` asserts the slot is empty).

Extract a private `detachLanguageModelCleanup` helper that does the
single-shot pop in one place and returns the cleanup (or `undefined`).
Both surviving public functions reduce to their intent: `move`
re-attaches the popped cleanup to the target, and `run` invokes it
inside the existing `try/catch` that swallows + logs failures. A short
comment captures the rationale on the helper itself so the next caller
(e.g. a future "cancel without invoking" path) can't accidentally leak a
slot.

Pure refactor — emitted JS, the symbol's single-shot semantics, and the
existing 6-test `languageModelCleanup` suite are all unchanged.

### Extract `isAgentTaskActiveStatus` predicate in `task_await`

Three call sites in `src/node/services/tools/task_await.ts` gated on the
active agent-task subset (`"queued" | "running" | "awaiting_report"`) by
inlining the three-arm equality check. The new task_await elapsed-timing
commit (coder#3234) extended both the `timeoutMs === 0` branch and the `timed
out` catch-branch symmetrically with
`...getAgentTaskElapsedField(taskId)`, making the two structurally
identical `{ status, taskId, ...elapsed }` returns sit a handful of
lines from a third copy in the `ForegroundWaitBackgroundedError` branch
that picks between the same three values to coalesce against a
`"running"` fallback.

Add a local `AgentTaskActiveStatus` subset alias plus an
`isAgentTaskActiveStatus` type predicate near the existing
`coerceTimeoutMs` / `parseTimestampMs` helpers, and collapse all three
inline checks to a single call. The predicate narrows the nullable
`AgentTaskStatus | null` return of `getAgentTaskStatus` to the active
subset, so the surviving `{ status, taskId, ... }` returns keep their
narrowed `status` field with no `as const` gymnastics. The new comment
captures the rationale so a future field added to one of the
active-status branches won't silently drift away from the others.

Pure refactor — emitted return shapes (including `elapsed_ms`), narrowed
`status` literals, and the existing 17-test `task_await` suite are all
unchanged.
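The narrowing predicate can be shown directly. The active-status literals match the description above; the full `AgentTaskStatus` union around them is an assumption.

```typescript
// Full status union is illustrative; only the active subset is named in
// the commit message.
type AgentTaskStatus = "queued" | "running" | "awaiting_report" | "done" | "failed";
type AgentTaskActiveStatus = Extract<AgentTaskStatus, "queued" | "running" | "awaiting_report">;

// Type predicate: narrows the nullable lookup result to the active
// subset, so callers keep a narrowed `status` with no casts.
function isAgentTaskActiveStatus(
  status: AgentTaskStatus | null
): status is AgentTaskActiveStatus {
  return status === "queued" || status === "running" || status === "awaiting_report";
}
```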

### Drop dead length guard in `parseBedrockModelName` `secondPart`

The early `dotParts.length < 2` return at the top of
`parseBedrockModelName` (in `src/common/utils/ai/modelDisplay.ts`)
already guarantees `dotParts.length >= 2` by the time `secondPart` is
computed, which makes the `dotParts.length > 1 ?
dotParts[1].toLowerCase() : ""` ternary's empty-string branch
unreachable. The DeepSeek V4 commit (coder#3237) extended the surrounding
formatter without touching this site, but the new DeepSeek branch made
the asymmetry more obvious — the line directly above (`firstPart`)
already accesses `dotParts[0].toLowerCase()` without a guard, so the
ternary on `secondPart` was the odd one out.

Inline to a direct `dotParts[1].toLowerCase()` access (matching
`firstPart`'s shape) and capture the rationale in a one-line comment so
a future reader doesn't reintroduce the guard. Pure dead-code cleanup —
emitted JS, runtime behavior, and the existing 18-test `modelDisplay`
suite (including the new DeepSeek + existing Bedrock cases) are all
unchanged.

### Drop unused `workspaceName` param from `parseRuntimeString`

The second argument to `parseRuntimeString` (in
`src/browser/utils/chatCommands.ts`) was named `_workspaceName`
(underscore-prefixed = unused) when it was introduced in the original
SSH runtime PR, and has never been referenced by the function body —
error messages are all generic and don't include any workspace-specific
context.

The `/new`-mirrors-`/fork` simplification (coder#3230) made the noise
especially visible: the new `createNewWorkspace` call site had to pass
`options.workspaceName ?? "(auto-generated)"` purely to satisfy the
signature, and added a comment claiming `parseRuntimeString only uses
the name for error reporting context` — but the function never reads it.
Mobile already had the cleaner shape
(`parseRuntimeStringForMobile(runtime?: string)`).

Drop the parameter from the signature, drop the placeholder + misleading
comment at the only non-test caller (`createNewWorkspace`), and drop the
`workspaceName = "test-workspace"` constant + 23 second-arguments in
`chatCommands.test.ts`. Pure dead-parameter cleanup — emitted JS, error
messages, and runtime behavior are all identical, and the desktop
signature now matches mobile's.

### Drop redundant `isPlanHandoffAgent` boolean in
`streamContextBuilder`

The `isPlanHandoffAgent` boolean in `buildPlanInstructions` was
extracted when the gate was `effectiveAgentId === "exec" ||
effectiveAgentId === "orchestrator"`. After coder#3224 ripped out the
Orchestrator agent, the boolean collapsed to a single equality check
(`effectiveAgentId === "exec"`), and the trailing `else if
(isPlanHandoffAgent && chatHasStartHerePlanSummary)` redundantly
re-evaluated the same gate just to log a debug line.

Replace the flat `if/else if` with a nested `if (effectiveAgentId ===
"exec") { … }` that tests the Start Here summary inside, removing the
duplicate gate re-check and the now-meaningless boolean. A short comment
captures the rationale so a future reader doesn't reintroduce the alias.

Pure refactor — emitted control flow, the debug log, and
`planContentForTransition` assignments are identical.

### Extract `seedScrollDirectionBaseline` helper in `useAutoScroll`

`useAutoScroll` (touched by coder#3226) seeds `lastScrollTopRef` from two
call sites — `jumpToBottom` and `disableAutoScroll` — to keep
`handleScroll`'s released-branch direction check (`currentScrollTop >
previousScrollTop`) honest after a programmatic ownership transfer that
may not produce a follow-up scroll event. Both sites repeated the same
write (`lastScrollTopRef.current = contentRef.current?.scrollTop ?? 0`)
and a multi-line block comment re-explaining the same shared rationale.

Extract a `seedScrollDirectionBaseline` `useCallback` with the shared
rationale captured once on the helper itself. Each call site reduces to
a single call plus a one-line comment naming the site-specific reason it
doesn't get a free scroll-event refresh (`stickToBottom` skips the write
at max; `disableAutoScroll` never fires a scroll event itself). The
dependency arrays for `jumpToBottom` and `disableAutoScroll` add the new
helper, which has empty deps (`useCallback(..., [])`) so its identity is
stable across renders — no extra re-creations of the surrounding
callbacks.

This shrinks the surface area for future drift: a third path that needs
to flip `autoScrollRef`/`programmaticDisableRef` without a guaranteed
scroll-event tail (e.g. the deferred workspace-switch hydration race
called out in coder#3226's "Pains") can call the helper instead of
duplicating the rationale a third time. Pure refactor — emitted writes,
write order, and ref values are identical, and the existing 25-test
`useAutoScroll` suite (including the 6 new regression tests added in
coder#3226) passes unchanged.

### Extract `pushStreamErrorRow` helper in `StreamingMessageAggregator`

`buildDisplayedMessagesForMessage` now has two branches that synthesize
a `stream-error` `DisplayedMessage`: the existing
`message.metadata?.error` path and the `finishReason === "length"` path
added in coder#3223. Both pushes were structurally identical, differing only
in `id` suffix, `error` string, and `errorType` — the
parent-message-derived fields (`historyId`, `historySequence`, `model`,
`routedThroughGateway`, `timestamp`) were duplicated across both call
sites.

Extract a local `pushStreamErrorRow` closure that captures the shared
fields once. Each branch reduces to a single call passing the three
differing values. The `model` access switches from
`message.metadata.model` (which relied on the outer `if
(message.metadata?.error)` narrowing) to `message.metadata?.model`,
which is functionally identical when `metadata` is defined and falls
through to `undefined` otherwise — same emitted value either way.

This shrinks the surface area for future drift: the next branch added
(e.g. a different finish-reason synthesis) can't accidentally pass a
stale `model` or forget `routedThroughGateway`. Pure refactor — emitted
`DisplayedMessage` objects are identical, and the existing 77-test SMA
suite (including the 6 new max-tokens tests) passes unchanged.
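The shared-fields closure can be sketched like this. The `DisplayedMessage` shape is cut down to a few representative fields, and the error strings are placeholders, not the aggregator's actual output.

```typescript
// Reduced row shape for illustration.
interface StreamErrorRow {
  id: string;
  historyId: string;
  model: string | undefined;
  error: string;
  errorType: string;
}

function buildStreamErrorRows(message: {
  historyId: string;
  metadata?: { model?: string; error?: string };
  finishReason?: string;
}): StreamErrorRow[] {
  const rows: StreamErrorRow[] = [];
  // Parent-derived fields are captured once; each branch passes only the
  // three values that actually differ.
  const pushStreamErrorRow = (idSuffix: string, error: string, errorType: string) => {
    rows.push({
      id: `${message.historyId}-${idSuffix}`,
      historyId: message.historyId,
      model: message.metadata?.model,
      error,
      errorType,
    });
  };
  if (message.metadata?.error) pushStreamErrorRow("error", message.metadata.error, "stream");
  if (message.finishReason === "length") pushStreamErrorRow("max-tokens", "Maximum output tokens reached", "max-tokens");
  return rows;
}
```

A future finish-reason branch becomes one more `pushStreamErrorRow` call and cannot forget a parent-derived field.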

### Drop redundant `GuardAnchors` type alias in `file_edit_insert`

In `src/node/services/tools/file_edit_insert.ts`, `GuardAnchors` was
defined as `Pick<InsertContentOptions, "insert_before" |
"insert_after">`, but `InsertContentOptions` itself is already
`Pick<FileEditInsertToolArgs, "insert_before" | "insert_after">` after
the `.nullish()` strict-mode refactor in coder#2250 stripped the
`InsertContentOptions` interface down to those same two fields. The two
aliases became structurally identical, so `GuardAnchors` is dead.

Drop the alias and use `InsertContentOptions` for the two callers
(`insertWithGuards`, `resolveGuardAnchor`). Both names were file-local;
no exports change. The function names already convey "guard" context, so
the parameter type doesn't need to repeat it. This noise was especially
visible to the new guardless-empty-file path added in coder#3220 since it
sits next to `insertContent` which uses `InsertContentOptions`.

</details>

## Validation

- `bun test src/browser/utils/slashCommands/ghostHint.test.ts` — 10/10
pass.
- `bun test src/browser/utils/slashCommands/` — 136/136 pass (full
slashCommands suite, covers `parser`, `suggestions`, `goal`,
`heartbeat`, `idle`, `new`, `fork`, `compact`, `parser_multiline`, and
`ghostHint`).
- `make static-check` — eslint, tsgo (×2), prettier, shfmt, ruff,
code-to-docs link check all pass.

## Risks

Minimal — purely a local table-extraction inside `getCommandGhostHint`.
The table-driven lookup short-circuits exactly the way the two if-blocks
did: for non-gated commands
`COMMAND_GHOST_HINT_EXPERIMENT_GATES[commandKey]` returns `undefined`,
the `gateExperimentId != null` guard fails, and execution falls through
to the existing `SLASH_COMMAND_DEFINITION_MAP.get(...)` lookup. For
`/goal` and `/heartbeat` the only behaviour change is that the goal
branch now inherits the heartbeat branch's "hide on thrown experiment
check" comment (the heartbeat branch's `catch` already had the comment;
the goal branch's `catch` did not). No imports, types, or experiment IDs
are introduced beyond the existing `ExperimentId` type-only import from
`@/common/constants/experiments`.

---

Auto-cleanup checkpoint: 23440dd

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking:
`xhigh`_

<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=xhigh -->

---------

Co-authored-by: mux-bot[bot] <264182336+mux-bot[bot]@users.noreply.github.com>
Co-authored-by: mux-bot <mux-bot@coder.com>
Co-authored-by: mux <bot@mux.coder.com>
Co-authored-by: ammar-agent <ammar+ai@ammar.io>
Co-authored-by: mux-auto-cleanup <noreply@anthropic.com>
Co-authored-by: Mux <noreply@coder.com>