Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions super-legal-mcp-refactored/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,66 @@ All notable changes to the Super Legal MCP Server are documented in this file.

## [Unreleased]

### Operations — `STRUCTURED_OUTPUT_ENFORCEMENT` pre-flip readiness: alerts + playbook + cohort plan + feature-flag truth-doc registration + dashboard panels + skill doc updates (PR #140)

Closes the operational surface for the bridge observability v2 work (PRs #135-#139). Zero source-code changes — documentation, YAML config, and Grafana JSON only. Unblocks the `STRUCTURED_OUTPUT_ENFORCEMENT=true` production flip by closing 6 distinct operator-readiness gaps under one coherent PR.

**Six coordinated changes**:

1. **Feature-flag truth doc registration** (`docs/feature-flags.md`): Documents that `STRUCTURED_OUTPUT_ENFORCEMENT` (PR #135) and `XLSX_RENDERER` (PR #100 era) were both added to production code but NEVER registered in the truth doc — accumulated documentation debt closed here. Bumps doc to v4.2 (41 flags total). Flag #42 entry includes an explicit 8-item pre-flip prerequisite checklist that MUST be ✅ before flipping to `true` in production.

2. **`flags.env` prerequisite comment block** (`flags.env:18-30`): Warns operators reading the env file directly that flipping `STRUCTURED_OUTPUT_ENFORCEMENT=true` without the checklist is forbidden; cross-links to the truth doc + runbooks.

3. **Six Prometheus alerts** (`prometheus/alerts.yml`):
- 4 alerts on `claude_code_bridge_envelope_outcome_total` (PR #138 counter): `CodeBridgeEnvelopeStdoutFallbackHigh`, `CodeBridgeEnvelopeNoneCritical`, `CodeBridgeTurn1SuccessLow`, `CodeBridgeUnknownCallerCategoryEmitting`
- 2 alerts on `claude_xlsx_render_turn1_envelope_success_total` (PR #136 counter): `XlsxRenderTurn1RegressionByTemplate`, `XlsxRenderTurn1RegressionByPhase`
- Each alert includes `runbook_url` annotation linking to the playbook's per-alert anchor
- Labels: `severity`, `team`, `component`, `flag` for routing + filtering

4. **Envelope-decision on-call playbook** (`docs/runbooks/envelope-decision-debug-playbook.md`, NEW): 6-section drill-down: triage decision tree, per-alert response procedures with kubectl + PromQL + SQL + Cloud Logging queries, cross-surface quick reference, rollback procedure (~5min target), escalation contacts (placeholder for team), seed-empty known-false-positives table.

5. **Cohort rollout plan** (`docs/runbooks/structured-output-enforcement-rollout.md`, NEW): Pre-flip 8-item checklist + rollback drill (rehearse in staging) + 3 stages (single-client canary 24h → 25% cohort 48h → 100% fleet 7-day) with per-stage pass/fail criteria + rollback triggers + per-stage rollback procedures + entry/exit log template + aborted-rollout recovery procedure + roles & responsibilities.

6. **Grafana dashboard panels** (`grafana/claude-sdk-dashboard.json`): 5 new panels visualizing both counters. 3 panels for PR #138 counter (caller-category mix, envelope-source distribution, turn-1 success gauge with threshold band at 0.70). 2 panels for PR #136 counter (per-template turn-1 success, per-phase turn-1 success heatmap). Dashboard goes 6 → 11 panels.

**Plus 4 minor skill doc enhancements** identified by post-PR-138 audit (bundled here since they're operator-facing documentation):
- `.claude/skills/post-deploy-verify/scripts/verify-tier2.sh` — new V7 check probing `/metrics` for the new counter registration
- `.claude/skills/infrastructure-health/references/postgresql.md` — note `code_executions.envelope_source` + `access_log.event_data` JSONB columns + indexes
- `.claude/skills/client-audit-export/SKILL.md` — note regulator handoff bundles now include both new columns via `SELECT *`
- `.claude/skills/client-offboarding/SKILL.md` — note Phase 2 archive captures `event_data` JSONB inline as quoted JSON in CSV

**Verification**:
- L1 (doc consistency): `STRUCTURED_OUTPUT_ENFORCEMENT` + `XLSX_RENDERER` now appear in `docs/feature-flags.md` Quick Reference table + detailed entry sections
- L2 (PromQL syntax): all 6 alerts pass `promtool check rules prometheus/alerts.yml` syntax validation
- L3 (Grafana JSON): dashboard JSON parses cleanly via `jq` — verified 11 panels enumerable, all 5 new panels include datasource + targets + fieldConfig
- L4 (staging soak ≥ 3 days at flag=true) — deferred to operations team execution; cannot be performed in this PR

**Files modified (11 total across PR #140 + companion PR #141)**:

In PR #140 commit (7 files in `super-legal-mcp-refactored/` worktree):
- `docs/feature-flags.md` — flag #41 (XLSX_RENDERER) + flag #42 (STRUCTURED_OUTPUT_ENFORCEMENT) entries; header bump v4.1 → v4.2; Dependency Tree section updated to include both new flags
- `flags.env` — prerequisite comment block above line 22
- `prometheus/alerts.yml` — 6 new alert rules
- `docs/runbooks/envelope-decision-debug-playbook.md` (NEW)
- `docs/runbooks/structured-output-enforcement-rollout.md` (NEW)
- `grafana/claude-sdk-dashboard.json` — 5 new panels
- `CHANGELOG.md` (this entry)

In companion PR #141 commit (4 files in `.claude/skills/`, outside this worktree boundary):
- `.claude/skills/post-deploy-verify/scripts/verify-tier2.sh` — V7 check
- `.claude/skills/infrastructure-health/references/postgresql.md` — 2 table rows enriched
- `.claude/skills/client-audit-export/SKILL.md` — 2 table rows enriched
- `.claude/skills/client-offboarding/SKILL.md` — Step 6.5 paragraph enriched

PR #141 exists as a separate commit because the 4 skill docs live at the project root (`/Users/ej/Super-Legal/.claude/skills/`) which is outside the xlsx-renderer worktree's boundary; same logical scope, separated only by git-worktree mechanics.

**Honest limits acknowledged**:
- L4 staging soak (≥ 3 days at flag=true) cannot be performed in this PR — requires operations execution per the runbook
- §5 of the debug playbook leaves escalation contacts as placeholder TBD (team-specific info)
- §6 of the debug playbook seeds the known-false-positives table empty (populated by on-call as real incidents occur)

**Plan**: `/Users/ej/.claude/plans/twinkling-glittering-comet.md`

### Added — Bridge observability v2: envelope_source DB persistence + generic Prometheus counter + access_log JSONB enrichment + structured envelope-decision logging (PR #138)

Closes 4 verified observability/auditability gaps identified by background Explore agents on 2026-05-16 after PR #137 merged. Pre-PR-138, `envelope_source` (set on every bridge return at `selectEnvelopeWithFallback`) was visible only via the multi-turn xlsx orchestrator's Prometheus counter — single-turn xlsx, MCP gateway, and Agent SDK subagent callers produced envelope outcomes that never reached any dashboard or queryable schema. This PR ships a single coherent change (~220 LOC + 4 migration files) that closes all four under one operational umbrella so `STRUCTURED_OUTPUT_ENFORCEMENT=true` can be flipped for full-fleet production rollout with confidence.
Expand Down
Loading