consult: 3-way CMAP silently degrades when gemini/codex fail

## Bug: 3-way CMAP infrastructure degrades silently when gemini/codex fail

### Summary

The 3-way CMAP integration review prescribed by the architect-role doc (`consult -m gemini/codex/claude --type integration --issue N`, run in parallel) silently degrades to a 1-way review when the gemini and/or codex consults fail. Nothing alerts the operator when a model's success rate is 0%; the review pipeline continues as if all three had succeeded and produces a claude-only review. It is left to the architect to notice the gap while synthesizing.

### Reproduction

```bash
consult stats --days 30 --json
```

In our project (`apple-bluetooth`, private), this returns:

```json
{
  "byModel": [
    {"model": "claude", "count": 134, "successRate": 99.25, "successCount": 133},
    {"model": "gemini", "count": 84,  "successRate": 0,     "successCount": 0},
    {"model": "codex",  "count": 81,  "successRate": 0,     "successCount": 0}
  ]
}
```

All 165 gemini and codex invocations over 30 days (84 + 81) failed, and no operator was aware of it until the architect inspected `consult stats` directly.
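The same output can be filtered so the dead models stand out. A minimal sketch, assuming `jq` is installed and that the JSON keeps the `byModel` shape shown above; `zero_success_models` is a hypothetical helper name, not part of consult:

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `consult stats --json` output on stdin and prints
# every model that was invoked at least once but never succeeded.
# Assumes the `byModel` shape shown above and that jq is available.
zero_success_models() {
  jq -r '.byModel[] | select(.count > 0 and .successRate == 0) | .model'
}

# Usage (only where consult is on PATH):
#   consult stats --days 30 --json | zero_success_models
```

Against the data above this would print `gemini` and `codex`, one per line.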

Specific failure modes observed:

- **gemini**: `"You have exhausted your capacity on this model"` — quota exhausted; consult retries indefinitely (loop until killed)
- **codex**: `Codex Exec exited with signal SIGKILL` — process aborts; no retry; immediate failure

### Impact

The architect-role doc prescribes:

```bash
consult -m gemini --type integration --issue N --output /tmp/cmap-gemini-N.md &
consult -m codex  --type integration --issue N --output /tmp/cmap-codex-N.md  &
consult -m claude --type integration --issue N --output /tmp/cmap-claude-N.md &
wait
```

If 2/3 silently fail and the architect doesn't notice, "3-way CMAP" becomes "1-way claude" — defeating the multi-model-review-redundancy design rationale.
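One way to make the failures visible at the call site is to wait on each background job separately and check its exit status. A minimal sketch, assuming `consult` exits non-zero when a model fails (not verified here, and a consult stuck in the gemini retry loop would still block until it exits); `run_cmap` and the `CONSULT_BIN` override are hypothetical names:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the prescribed 3-way invocation.
# Assumption: consult exits non-zero when the model call fails.
CONSULT_BIN="${CONSULT_BIN:-consult}"   # overridable for dry runs

run_cmap() {
  local issue="$1" model i failed=0
  local -a models=(gemini codex claude)
  local -a pids=()

  # Launch all three reviews in parallel, remembering each PID.
  for model in "${models[@]}"; do
    "$CONSULT_BIN" -m "$model" --type integration --issue "$issue" \
      --output "/tmp/cmap-${model}-${issue}.md" &
    pids+=("$!")
  done

  # A bare `wait` discards exit codes; waiting per PID keeps them.
  for i in "${!models[@]}"; do
    if ! wait "${pids[$i]}"; then
      echo "WARNING: ${models[$i]} consult failed for issue ${issue}" >&2
      failed=$((failed + 1))
    fi
  done

  echo "$(( ${#models[@]} - failed ))/${#models[@]} reviews succeeded" >&2
  return "$failed"
}
```

`run_cmap N` returns the number of failed models, so a caller can refuse to label the result "3-way" unless the return status is 0.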

### Recommended fixes

1. **Pre-flight reachability probe**: `consult --probe -m <model>` does a no-op invocation; cache result for 5 minutes. Fail fast on probe failure before launching the expensive review.

2. **Degraded-mode warning**: when `consult -m <model>` fails, or its success rate over the last N invocations is 0, log a prominent warning that the architect-role doc would tell the architect to read as "this model is unavailable; proceed with caution OR pin to working models in `.codev/config.json`".

3. **Reduce gemini retry loop**: use 3 retries with exponential backoff capped at 30s, instead of the current 6+ retries with 30+s waits. Currently a failed gemini consult ties up a process for ~5 minutes and produces nothing.

4. **Codex SIGKILL diagnostic**: surface the underlying error rather than just "exited with signal SIGKILL". As it stands, the failure is hard to diagnose without strace.

5. **Architect-role doc note**: update the architect-role doc to recommend running `consult stats --days 30` periodically and pinning `.codev/config.json` `porch.consultation.models` to known-working models when degradation is observed.
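The retry policy proposed in fix 3 could look roughly like the following. This is a sketch of the suggested behaviour, not consult's current implementation; `retry_capped`, its argument order, and the 5-second base delay in the usage comment are assumptions for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to MAX_ATTEMPTS times, sleeping
# between attempts with exponential backoff capped at CAP seconds.
# usage: retry_capped MAX_ATTEMPTS BASE_DELAY CAP command [args...]
retry_capped() {
  local max="$1" delay="$2" cap="$3" attempt=1
  shift 3

  while true; do
    # Success on any attempt ends the loop immediately.
    "$@" && return 0

    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi

    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"

    # Double the delay for the next round, but never exceed the cap.
    delay=$(( delay * 2 ))
    if [ "$delay" -gt "$cap" ]; then
      delay="$cap"
    fi
    attempt=$(( attempt + 1 ))
  done
}

# Usage (one initial attempt plus 3 retries, waiting 5s, 10s, 20s):
#   retry_capped 4 5 30 consult -m gemini --type integration --issue N
```

With those numbers a permanently failing model spends at most 35 seconds waiting between attempts before the failure is reported.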

### Workaround we used

For our project, we pinned `.codev/config.json`:

```json
{
  "porch": {
    "consultation": {
      "models": "claude"
    }
  }
}
```

This switches porch's per-phase consultation invocations to claude-only (which has 99.25% success). For the architect-side 3-way CMAP integration review, we attempted the full 3-way, noted the gemini/codex failures in the synthesized PR comment, and relied on architect-direct review for the missing 2/3 of reviewer redundancy.

### Discovered

2026-05-24 during the integration review of PR #4 in our SPIR-protocol security-research project. We document the workaround in [our project's CLAUDE.md](https://github.com/swiftraccoon/apple-bluetooth/blob/main/CLAUDE.md#codev-workflow-gotchas).

