Multi-architect feature is underbaked — gaps in lifecycle, persistence, and UX

## Summary

The multi-architect feature (sibling architects added via `afx workspace add-architect`) shipped its primitive in v3.0.5 (#755), dashboard tabs in v3.0.6 (#761), and a critical routing fix in v3.0.8 (#774). But the feature is not yet a coherent, well-thought-through product. There are gaps in lifecycle management, persistence semantics, and UX that an end user discovers immediately when they try to drive it.

This issue is the umbrella for a SPIR-protocol pass to design and ship the missing pieces as a cohesive unit. Goal: by the end, a user can add, manage, evict, and recover sibling architects with the same fluency they have with builders.

## Confirmed gaps

### 1. No way to remove a sibling architect
- `afx workspace add-architect` exists; `afx workspace remove-architect <name>` does not.
- Manual workarounds today: kill the terminal from the dashboard sidebar (left pane only; see gap 2 below), or restart Tower entirely, which tears down all workspaces.
- A first-class `remove-architect` CLI + dashboard affordance is needed.
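
As a starting point for the spec discussion, the removal semantics could be sketched as a pure function over an in-memory registry. Everything here is hypothetical: `ArchitectEntry`, `removeArchitect`, and the result shape are illustrative names, not existing Codev APIs. The sketch assumes `main` is never removable, since it is workspace-defining.

```typescript
// Hypothetical registry entry; the real Tower data structure may differ.
type ArchitectEntry = { name: string; terminalId: string };

type RemoveResult =
  | { ok: true; removed: ArchitectEntry }
  | { ok: false; reason: string };

// Assumption: `main` is workspace-defining and must not be removable.
const MAIN_ARCHITECT = "main";

function removeArchitect(
  registry: Map<string, ArchitectEntry>,
  name: string,
): RemoveResult {
  if (name === MAIN_ARCHITECT) {
    return { ok: false, reason: "main is workspace-defining and cannot be removed" };
  }
  const entry = registry.get(name);
  if (entry === undefined) {
    return { ok: false, reason: `no sibling architect named "${name}"` };
  }
  // Drop the in-memory entry; terminal teardown and any state.db cleanup
  // would happen here in a real implementation.
  registry.delete(name);
  return { ok: true, removed: entry };
}
```

Returning a result object instead of throwing lets the CLI and the dashboard share one code path and render their own error messages.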

### 2. Right-pane terminals (builders, shells) have no close button; sibling architects similarly cannot be closed from where they live
- The dashboard right pane (where builders and shells render) has no X / close affordance on tabs. Only the left pane does.
- Sibling architect tabs (in the multi-architect tab strip introduced in v3.0.6 #761) likewise have no close UI. The architect *is* a closable entity (unlike `main`, which is workspace-defining), so it should have one.
- This is a broader UX gap that affects more than architects, but architects make it salient.

### 3. Sibling architects are not persisted in `state.db`
- Confirmed empirically: shannon's `.agent-farm/state.db` `architect` table is empty even though Tower's in-memory `getWorkspaceTerminals()` map correctly has both `main` and `ob-refine`.
- Implication: sibling architects exist only in Tower's process memory. A Tower restart wipes them. The user has to re-run `add-architect` every time Tower goes down.
- This is the opposite design from how builders work (builders persist across Tower restarts via `state.db` + shellper auto-rebind).
- There IS an `architect` table in the schema (with `id, pid, port, cmd, started_at, terminal_id`) — the row for `main` gets written by `workspace start`, but sibling architects added via `add-architect` never write rows. So the table exists but the code path skips it for siblings.
- Question to settle in spec: should siblings persist? If yes, the auto-rebind story needs to mirror builders (Tower restart → re-spawn the architect terminal from the recorded `cmd`, re-register against the rebound shellper). If no, siblings are explicitly ephemeral and the docs should say so loudly.
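
If the answer is "yes, persist", the restart path might look roughly like the sketch below. The column names come from the `architect` table described above; the column types, the `RebindAction` shape, and `planRebind` itself are assumptions made for illustration, not the current implementation.

```typescript
// Columns mirror the `architect` table named in this issue; types are assumed.
interface ArchitectRow {
  id: string;
  pid: number;
  port: number;
  cmd: string;
  started_at: string;
  terminal_id: string;
}

// Hypothetical plan produced on Tower startup.
type RebindAction =
  | { kind: "reattach"; id: string; terminalId: string }
  | { kind: "respawn"; id: string; cmd: string };

// For each persisted architect, reattach if its terminal survived the
// restart, otherwise re-spawn it from the recorded command.
function planRebind(
  rows: ArchitectRow[],
  liveTerminalIds: ReadonlySet<string>,
): RebindAction[] {
  return rows.map((row): RebindAction =>
    liveTerminalIds.has(row.terminal_id)
      ? { kind: "reattach", id: row.id, terminalId: row.terminal_id }
      : { kind: "respawn", id: row.id, cmd: row.cmd },
  );
}
```

Keeping the plan separate from its execution makes the restart behaviour testable without a running Tower or shellper.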

### 4. Routing was broken end-to-end from v3.0.5 → v3.0.7 (fixed in #774, pending v3.0.9 publish)
- Not a new gap — but a symptom of the underlying problem: the headline value prop of the feature ("builder→architect message routes to the spawning sibling") was never exercised end-to-end before shipping.
- The spec for the umbrella SPIR must include verify-phase steps that literally run `afx send architect` from a builder spawned by a sibling, and assert the message lands on the sibling's terminal.
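
The routing rule that the verify phase needs to assert can be stated as a small pure function. This is a sketch under assumptions: `BuilderInfo`, `spawnedBy`, and `resolveArchitectTarget` are hypothetical names, and the fallback to `main` reflects the behaviour this issue attributes to the existing fallback chain, not a reading of the actual code.

```typescript
// Hypothetical builder record; `spawnedBy` is the name of the spawning architect.
interface BuilderInfo {
  id: string;
  spawnedBy?: string;
}

// Resolve where a builder's `architect` message should land:
// the spawning sibling if it is still live, otherwise `main`.
function resolveArchitectTarget(
  builder: BuilderInfo,
  liveArchitects: ReadonlySet<string>,
): string {
  const spawner = builder.spawnedBy;
  if (spawner !== undefined && liveArchitects.has(spawner)) {
    return spawner;
  }
  return "main";
}
```

An automated check of this rule complements the manual end-to-end run, but does not replace it: the original bug was in the wiring, not in the rule.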

## Probable additional gaps (for the spec phase to confirm)

- **Recovery from a crashed sibling architect.** If `ob-refine`'s Claude process exits, what state are its in-flight builders in? Do they detect the gone-architect and surface to `main`? The current fallback chain in `tower-messages.ts:332-341` routes to `main` when the spawning architect is gone — but the in-memory map entry might be stale (terminal_id pointing at a dead PID). Spec should pin behaviour.
- **Naming constraints.** A name like `ob-refine` works; what about `main` (reserved?), empty string, names with spaces, names with `:` (collides with the `architect:<name>` address grammar)? Validation needs to be documented.
- **Cross-architect message addressing.** v3.0.5 introduced `architect:<name>` as an in-workspace address. What about messaging from architect-to-architect? Does `main` send to `ob-refine` via `architect:ob-refine`? Verify this works and document it.
- **VSCode extension surface.** The VSCode sidebar shows one Architect tab. With siblings, what does it show? Is there parity with the dashboard's tab strip?
- **`afx status` output.** When there are siblings, does the CLI list them with their PIDs / terminal IDs the way it does for `main`? Currently `afx status` mostly elides them.
- **Dashboard tab labelling.** The first architect is `id: 'architect'` (bare) and siblings are `architect:<name>` (per v3.0.6 spec). When `main` is the bare one and there are siblings, is the labelling consistent / discoverable?
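
To make the naming question concrete, here is one possible validation policy. Every rule in it is a proposal for the spec to accept or reject, not current behaviour: reserving `main`, rejecting `:` and whitespace, and the lowercase-alphanumeric-plus-hyphen pattern are all assumptions, and `validateArchitectName` is a hypothetical helper.

```typescript
type NameCheck = { valid: true } | { valid: false; reason: string };

// Assumed policy: lowercase letters, digits, and hyphens, starting with
// a letter or digit. The real constraints are for the spec to decide.
const NAME_PATTERN = /^[a-z0-9][a-z0-9-]*$/;

function validateArchitectName(name: string): NameCheck {
  if (name.length === 0) {
    return { valid: false, reason: "name is empty" };
  }
  if (name === "main") {
    return { valid: false, reason: "`main` is reserved" };
  }
  if (name.includes(":")) {
    // `:` would collide with the `architect:<name>` address grammar.
    return { valid: false, reason: "name must not contain `:`" };
  }
  if (/\s/.test(name)) {
    return { valid: false, reason: "name must not contain whitespace" };
  }
  if (!NAME_PATTERN.test(name)) {
    return { valid: false, reason: "name must match [a-z0-9][a-z0-9-]*" };
  }
  return { valid: true };
}
```

Checking the specific failure cases before the general pattern gives users a targeted error message for the most likely mistakes.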

## Out of scope for this SPIR

- Multi-architect-driven workflows beyond the single workspace (cross-workspace routing was deferred earlier; that stays deferred).
- Renaming architects after add. Suggest filing as a separate small ticket if needed.

## Suggested approach

Run as **SPIR (strict mode)** — feature is large enough to warrant spec-approval + plan-approval gates, and the design choices (persistence model, lifecycle semantics) deserve careful review. The architect (human) should drive the design conversation at the spec gate.

Verify-phase exit criteria must include the manual round-trip test that v3.0.5 lacked: add a sibling architect, spawn a builder from it, send a message to `architect` from that builder, and observe that it lands on the sibling. Repeat for the remove-architect, crash-recovery, and persistence-across-Tower-restart paths.

## Severity / priority

Medium-high. The feature was promoted as a v3.0.6 headline, an external adopter (Shannon) is using it in production, and it's missing the basic lifecycle / UX hygiene that the rest of Codev provides. Not a blocker for shipping v3.0.9 (which fixes the most painful bug, #774), but the next coherent release should land this.
