Number531 · Number531 · May 7, 2026 · May 7, 2026 · May 7, 2026
diff --git a/.claude/skills/post-deploy-verify/SKILL.md b/.claude/skills/post-deploy-verify/SKILL.md
@@ -0,0 +1,140 @@
+---
+name: post-deploy-verify
+description: >
+  Run a tiered post-deployment verification checklist after every deploy completes.
+  Tier 1 (Critical, ~2 min): /health 200, DB connectivity, COMMIT_SHA wired, flags
+  propagated. Tier 2 (Integration, ~5 min): § 8.4.X V1-V4 protocol — FMP tools,
+  M46-M58 code-execution models, Cloud Trace SubagentStart, citation verifier,
+  container env audit, bridge_metadata.git_sha. Tier 3 (Deep, ~10 min): metrics
+  baseline, reconciliation status, circuit breakers, Prometheus alerts, OTel
+  sampler. Read-only — never auto-rolls back. Triggers: "verify deploy",
+  "post-deploy check", "verify staging", "§ 8.4.X verification", "deploy validation",
+  "/post-deploy-verify". Supports: --tier 1|2|3|all, --url <base_url>.
+---
+
+# Post-Deploy Verify
+
+## Workflow
+
+Execute `scripts/verify.sh` from the skill directory (or invoke `/post-deploy-verify`):
+
+```bash
+bash /Users/ej/Super-Legal/.claude/skills/post-deploy-verify/scripts/verify.sh
+bash /Users/ej/Super-Legal/.claude/skills/post-deploy-verify/scripts/verify.sh --tier 1
+bash /Users/ej/Super-Legal/.claude/skills/post-deploy-verify/scripts/verify.sh --tier 2
+bash /Users/ej/Super-Legal/.claude/skills/post-deploy-verify/scripts/verify.sh --tier 3 --url http://34.26.70.60:3001
+```
+
+The script:
+1. Resolves base URL (--url arg → `super-legal-mcp-refactored/scripts/.staging-ip` → `http://localhost:3001`)
+2. Pre-flight: validates `which curl`, `which python3`, `which gcloud`, `which jq`
+3. Dispatches to selected tier(s); each tier outputs JSON
+4. `format-report.py` aggregates → markdown report
+5. Exit code: 0 PASSED, 1 WARNING, 2 FAILED
+
+## Tier 1 — Critical (~2 min)
+
+| Check | Source | Pass criteria |
+|---|---|---|
+| `GET /health` returns 200 | curl | HTTP 200 in ≤10s |
+| `dependencies.database.status` = `ok` | `/health` JSON | DB connectivity |
+| `dependencies.database.latency_ms` < 50 | `/health` JSON | Pool healthy |
+| `dependencies.circuit_breaker.state` = `CLOSED` | `/health` JSON | Anthropic API healthy |
+| Container env has `COMMIT_SHA` set | `gcloud compute ssh + docker exec env` | Reproducibility wired |
+| `COMMIT_SHA` matches `git rev-parse HEAD` | shell | No drift |
+| `feature_flags` block matches `flags.env` | `/health.feature_flags` vs `flags.env` parse | Flags propagated |
+
+Fails immediately on any CRITICAL — operator should investigate before proceeding.
+
+## Tier 2 — § 8.4.X V1-V4 + Container Env Audit (~5 min)
+
+Embeds the verification protocol from `super-legal-mcp-refactored/docs/pending-updates/equity-analyst-update.md` § 8.4.X.
+
+| Check | Pass criteria |
+|---|---|
+| **V1**: FMP tool invocations in last 5 min | When `FMP_ENABLED=true` + memo run: ≥1 row in `hook_audit_log` with `tool_name LIKE 'mcp__equities__%'`. Otherwise: WARNING "no recent invocations" |
+| **V2**: Code-execution models M46–M58 | When FMP active + memo run: ≥1 row in `code_executions` with `model_id IN ('M46',...,'M58')`. Otherwise: WARNING |
+| **V3**: Cloud Trace SubagentStart | When FMP active + memo run + sampler ≥0.1: ≥1 span in last hour. (Manual check via Cloud Console; skill prints filter URL) |
+| **V4**: Citation verifier accepts FMP | Latest memo's `qa-outputs/citation-verification-certificate.md` shows FMP URLs as `CONFIRMED`. (Manual file inspection; skill prints query path) |
+| **`bridge_metadata.git_sha` not 'unknown'** | All recent code-execution audit rows have real git SHA |
+| **Container env audit** | `OTEL_TRACES_SAMPLER`, `OTEL_TRACES_SAMPLER_ARG`, `FMP_ENABLED`, `COMMIT_SHA`, `BCRYPT_ROUNDS` all present in container env |
+
+## Tier 3 — Metrics + Reconciliation + Trace (~10 min)
+
+| Check | Source | Pass criteria |
+|---|---|---|
+| Reconciliation enabled, 0 backlog | `/health.reconciliation` | `enabled=true`, `pending_kg=0`, `pending_artifacts=0`, `stuck_kg=0` |
+| No new `claude_hook_persistence_failures_total` since deploy | `/metrics` | Counter delta = 0 since deploy |
+| All circuit breakers CLOSED | `/metrics` `claude_hook_circuit_breaker_state` | All values = 0 |
+| Memory baseline | `/health.performance` or `.memory` | RSS within ±20% of `references/baselines.json` |
+| Prometheus alerts not firing | `/metrics` evaluation | All thresholds within bounds |
+
+## Output Format
+
+```
+## Post-Deploy Verification Report
+Timestamp: <ISO> | Target: <url> | Tier: <level>
+Container: <name> | COMMIT_SHA: <short sha>
+
+### Overall: PASSED ✓ | WARNING ⚠ | FAILED ✗
+
+### Tier 1: Critical (<elapsed>)
+✓ /health 200 OK in <ms>ms
+✓ Database OK, latency <ms>ms
+✓ Anthropic circuit_breaker: CLOSED (<n>/<threshold> failures)
+✓ COMMIT_SHA in container = <sha> (matches deployed)
+✓ flags.env propagated (<m>/<n> flags match)
+
+### Tier 2: § 8.4.X V1-V4 + Container Env (<elapsed>)
+✓ V1 FMP tools: <n> invocations in last 5min
+✓ V2 Code models M46-M58: M46(<n>), M48(<n>), M50(<n>)
+⚠ V3 Cloud Trace: skill prints filter URL — manual check required
+⚠ V4 Citation verifier: skill prints qa-outputs path — manual check required
+✓ Container env: all required vars present
+✓ bridge_metadata.git_sha: all recent rows = <sha>
+
+### Tier 3: Metrics + Reconciliation (<elapsed>)
+✓ Reconciliation: enabled, 0 backlog
+✓ Hook persistence: 0 failures since <timestamp>
+✓ All circuit breakers CLOSED
+⚠ Memory: RSS <mb>MB (baseline <mb>MB; ±X.X% — watch trend)
+✓ Prometheus alerts: 0 firing
+
+### Issues detected
+NONE | <list>
+
+### Raw Signals
+[Full /health JSON + query results in collapsible blocks]
+```
+
+## Pre-flight Checks
+
+```bash
+which curl   # required
+which python3
+which gcloud  # for Tier 1 container env probe + Tier 2 SSH
+which jq      # for /health JSON parsing
+```
+
+## Read-Only Guarantee
+
+Never auto-rolls back, never invokes admin endpoints, never executes DML.
+All remediation suggestions are printed as commands the operator can run manually.
+
+## Troubleshooting
+
+| Failure | Fix |
+|---|---|
+| Tier 1 `/health` returns 502/503 | Container starting or unhealthy. Wait 60s and retry. If persistent, check `/deploy` Step 8.5 (post-IP container restart) |
+| Tier 1 `COMMIT_SHA = 'unknown'` | `--build-arg COMMIT_SHA` missed during last build. See `deploy/SKILL.md`. Re-deploy or `docker exec` to set env post-hoc (not persisted) |
+| Tier 2 V1 returns 0 rows but `FMP_ENABLED=true` | Either no memo run since deploy (run a test memo) OR `FMP_API_KEY` invalid/rate-limited (check `claude_api_client_results_total{fetch_source}` distribution — `exa_fallback` dominating = key issue) |
+| Tier 2 container env missing vars | Re-run `/deploy` and ensure deploy.sh's CONTAINER_ENV plumbing is current (lines 80-95) |
+| Tier 3 reconciliation stuck | See `session-diagnostics` Pattern 14 for forensic SQL |
+
+See `references/failure-patterns.md` for 15+ documented failure modes.
+
+## Known Constraints
+
+- **V3 Cloud Trace check is manual** — gcloud trace logs read API requires interactive auth and project scoping; skill prints the filter URL operator can paste into Cloud Console
+- **V4 Citation verifier check is manual** — requires reading the latest memo's `qa-outputs/` directory; skill prints the path
+- **Baselines drift over time** — `baselines.json` is regenerated post-deploy. If you bumped flags or memory limits, regenerate baselines before declaring memory drift
diff --git a/.claude/skills/post-deploy-verify/references/baselines.json b/.claude/skills/post-deploy-verify/references/baselines.json
@@ -0,0 +1,32 @@
+{
+  "_meta": {
+    "version": "v7.0.1",
+    "captured_at": "2026-05-07",
+    "captured_from": "super-legal-staging-bzx4 (34.26.70.60:3001)",
+    "notes": "Baseline captured at deploy time. Regenerate after major releases or when memory limits change. ±20% drift is acceptable; >20% warrants investigation."
+  },
+  "memory": {
+    "rss_mb": 145,
+    "heap_used_mb": 60,
+    "heap_total_mb": 63
+  },
+  "performance": {
+    "uptime_seconds_at_capture": 33,
+    "active_streams": 0,
+    "background_tasks": 0
+  },
+  "database": {
+    "expected_latency_ms_max": 50,
+    "pool_max_connections": 15
+  },
+  "circuit_breaker": {
+    "expected_state": "CLOSED",
+    "expected_failures": 0,
+    "threshold": 3
+  },
+  "reconciliation": {
+    "expected_enabled": true,
+    "expected_pending_kg": 0,
+    "expected_pending_artifacts": 0
+  }
+}
diff --git a/.claude/skills/post-deploy-verify/references/failure-patterns.md b/.claude/skills/post-deploy-verify/references/failure-patterns.md
@@ -0,0 +1,127 @@
+# Post-Deploy Failure Patterns
+
+Known failure modes seen in v6.x and v7.0.x deploys, with detection signal and remediation. Operator-only — read-only skill never auto-applies fixes.
+
+## Tier 1 — Critical
+
+### P1: `/health` returns 502/503
+
+**Signal**: Tier 1 fast-fail with `fatal:true`.
+**Cause**: Container starting, OOMed, or instance terminated.
+**Remediation**:
+```bash
+gcloud compute instances list --filter="name~super-legal-staging" --format="table(name,zone,status,networkInterfaces[0].accessConfigs[0].natIP)"
+gcloud compute ssh <instance> --zone=us-east1-c --command="docker ps -a && docker logs --tail 200 \$(docker ps -q | head -1)"
+```
+
+### P2: `COMMIT_SHA = 'unknown'` in container
+
+**Signal**: Tier 1 FAILED on COMMIT_SHA.
+**Cause**: `deploy.sh` missed `--build-arg COMMIT_SHA=$(git rev-parse HEAD)` plumbing.
+**Remediation**: Re-run deploy after confirming `deploy.sh:54-62` carries the build arg.
+
+### P3: DB latency > 100ms
+
+**Signal**: Tier 1 FAILED on DB latency.
+**Cause**: Pool saturation, Cloud SQL hot spot, or network partition.
+**Remediation**: Query the Postgres system catalog `pg_stat_activity` grouping by `state`. If `idle in transaction` > 5, investigate stuck client. If pool exhausted, restart container.
+
+### P4: Circuit breaker OPEN
+
+**Signal**: Tier 1 FAILED on circuit breaker.
+**Cause**: Anthropic API failures exceeded threshold (3 consecutive).
+**Remediation**: Wait 60s for half-open probe; check `https://status.anthropic.com`. If sustained, container restart resets state.
+
+### P5: Required env var missing
+
+**Signal**: Tier 1 FAILED on missing env (e.g., `OTEL_TRACES_SAMPLER`).
+**Cause**: `deploy.sh` `CONTAINER_ENV` array doesn't include the var.
+**Remediation**: Update deploy.sh, redeploy. Do NOT manually `docker exec ... export` — overwritten on restart.
+
+## Tier 2 — Integration / § 8.4.X
+
+### P6: V1 returns 0 FMP invocations
+
+**Signal**: Tier 2 WARNING on V1.
+**Cause**: Either FMP_API_KEY broken, or no equity-research memo run in last 5 min.
+**Remediation**: Run a test memo via the frontend. If still 0 after memo completes, check `claude_api_client_results_total{client="fmp"}` distribution.
+
+### P7: V2 missing M46-M58 invocations
+
+**Signal**: Tier 2 WARNING on V2.
+**Cause**: `CODE_EXECUTION_BRIDGE=false` or model catalog deploy missed.
+**Remediation**: Verify `code_executions` table has rows for any model post-deploy. If empty, code-execution bridge is dead — check `claude-sdk-server` logs for `[CODE-EXEC]` errors.
+
+### P8: V3 Cloud Trace empty
+
+**Signal**: Tier 2 WARNING on V3 (no SubagentStart spans).
+**Cause**: `OTEL_TRACES_SAMPLER_ARG` too low (e.g., 0.01) or `OTEL_ENABLED=false`.
+**Remediation**: Bump sampler to 1.0 for verification window, redeploy. After window, return to 0.1.
+
+### P9: V4 citation verifier rejected FMP
+
+**Signal**: Latest memo's `citation-verification-certificate.md` shows FMP URLs as REJECTED.
+**Cause**: Citation verifier regex doesn't match `financialmodelingprep.com` URL pattern.
+**Remediation**: Check `src/utils/citationWebsearchVerifier.js` regex; FMP entries should match `^https?://(www\.)?financialmodelingprep\.com/`.
+
+### P10: `bridge_metadata.git_sha = 'unknown'`
+
+**Signal**: Tier 2 FAILED on bridge_metadata git_sha probe.
+**Cause**: Same as P2 — `COMMIT_SHA` build arg never propagated to runtime.
+**Remediation**: Re-run deploy with build arg. Replay envelope is required for EU AI Act Art. 15 audit trail.
+
+## Tier 3 — Deep / Reconciliation
+
+### P11: Reconciliation `pending_kg > 0`
+
+**Signal**: Tier 3 FAILED on reconciliation backlog.
+**Cause**: KG worker stuck or DB write contention.
+**Remediation**:
+```sql
+SELECT session_key, kg_build_attempts, kg_build_last_error, last_kg_build_attempt_at
+FROM sessions WHERE kg_status = 'pending' AND kg_build_attempts >= 3
+ORDER BY created_at DESC LIMIT 20;
+```
+If errors are real, fix root cause. If stuck due to deploy interruption, sessions self-recover after `SESSION_RECONCILIATION` worker tick.
+
+### P12: `claude_hook_persistence_failures_total` non-zero
+
+**Signal**: Tier 3 FAILED on hook persistence.
+**Cause**: DB unreachable during hook fire OR row-too-large.
+**Remediation**:
+```sql
+SELECT event_type, count(*) FROM hook_audit_log
+WHERE persisted = false AND created_at > NOW() - INTERVAL '1 hour'
+GROUP BY event_type;
+```
+
+### P13: Memory RSS drift > 50%
+
+**Signal**: Tier 3 FAILED on memory baseline.
+**Cause**: Leak, pool re-sizing, or legitimate workload increase.
+**Remediation**: Capture heap dump via `kill -USR2 <pid>`, regenerate `baselines.json` if drift is intentional.
+
+### P14: Transcript event count < 100 for completed session (Pattern 10)
+
+**Signal**: Tier 3 FAILED on `t3-transcript-event-rate.sql`.
+**Cause**: Transcript flush broken — `TRANSCRIPT_DB_PERSISTENCE=false` OR queue saturated.
+**Remediation**: Confirm flag is true via `/health.feature_flags.TRANSCRIPT_DB_PERSISTENCE`. Check `claude_hook_persistence_failures_total{event_type="transcript"}`. Sessions completed during disabled flag are NOT recoverable.
+
+### P15: OTel sampler not propagated
+
+**Signal**: Tier 3 FAILED on `OTEL_ENABLED` or sampler arg mismatch.
+**Cause**: Container env missing or `flags.env` drifted from `--container-env` array.
+**Remediation**: Check `gcloud compute ssh ... -- docker exec <container> env | grep OTEL`. Redeploy if missing.
+
+## Cross-tier patterns
+
+### Static IP race (deploy.sh step 7)
+
+Not a verification failure but precedes Tier 1 P1. If verify-tier1 sees DB unreachable from container despite `/health` 200 from frontend — container is on ephemeral IP not whitelisted by Cloud SQL.
+
+**Detection**:
+```bash
+gcloud compute instances describe <instance> --zone=us-east1-c \
+  --format="value(networkInterfaces[0].accessConfigs[0].natIP)"
+```
+If not `34.26.70.60`, see `deploy/SKILL.md` § "Static IP assignment race".
diff --git a/.claude/skills/post-deploy-verify/references/tier1-checks.md b/.claude/skills/post-deploy-verify/references/tier1-checks.md
@@ -0,0 +1,42 @@
+# Tier 1 — Critical Checks (~2 min)
+
+Tier 1 fails fast on any CRITICAL — operator should investigate before Tier 2/3.
+
+| Check | Pass criteria | Severity on fail |
+|---|---|---|
+| `GET /health` HTTP code | 200 | FAILED (script exits early with `fatal:true`) |
+| `dependencies.database.status` | `ok` | FAILED |
+| `dependencies.database.latency_ms` | `< 50` | WARNING if 50-100, FAILED if >100 |
+| `dependencies.circuit_breaker.state` | `CLOSED` | FAILED if OPEN; WARNING if missing |
+| Container env `COMMIT_SHA` | set, not 'unknown' | FAILED if 'unknown' or missing |
+| `COMMIT_SHA` matches `git rev-parse HEAD` | exact match | WARNING if drift |
+| Required env vars present | all 7 v7.0.x flags (OTEL_ENABLED, OTEL_TRACES_SAMPLER, OTEL_TRACES_SAMPLER_ARG, TRANSCRIPT_DB_PERSISTENCE, SESSION_RECONCILIATION, HOOK_DB_PERSISTENCE, FMP_ENABLED) | FAILED if any missing |
+
+## Fast-fail behavior
+
+If `/health` returns non-200, script exits immediately with `fatal:true` in the JSON output. No further Tier 1 checks run, and Tier 2/3 are not invoked.
+
+This avoids cascading false-negatives when the container is unreachable.
+
+## Assumptions
+
+- `gcloud` CLI installed + authenticated (Tier 1 container env audit). If not, those checks emit WARNING and skip.
+- `jq` installed for JSON parsing (pre-flight).
+- `curl` installed (pre-flight).
+
+## Why these checks
+
+- **`/health` 200**: container is responsive
+- **DB connectivity + latency**: Cloud SQL whitelist intact, pool not saturated
+- **Circuit breaker state**: Anthropic API healthy enough to make tool calls
+- **`COMMIT_SHA != 'unknown'`**: deploy.sh's `--build-arg` propagation worked (compliance/audit requirement)
+- **Required env vars**: deploy.sh's `--container-env` plumbing carried v7.0.x flags through
+
+## Manual recovery hints
+
+| Failure | Likely cause | Fix |
+|---|---|---|
+| `/health` 502/503 | Container unhealthy or starting | Wait 60s; check `gcloud compute ssh` + `docker logs` |
+| `COMMIT_SHA = 'unknown'` | deploy.sh missed `--build-arg COMMIT_SHA=$(git rev-parse HEAD)` | Re-deploy with the fix in deploy.sh:54-62 |
+| Missing env var | `--container-env` plumbing didn't include it | Update deploy.sh CONTAINER_ENV array; redeploy |
+| DB latency >100ms | Pool exhaustion or Cloud SQL hot spot | Check `pg_pool` metrics; consider `pg_stat_activity` |
diff --git a/.claude/skills/post-deploy-verify/references/tier2-checks.md b/.claude/skills/post-deploy-verify/references/tier2-checks.md
@@ -0,0 +1,28 @@
+# Tier 2 — § 8.4.X V1-V4 + Container Env Audit (~5 min)
+
+Embeds the verification protocol from `super-legal-mcp-refactored/docs/pending-updates/equity-analyst-update.md` § 8.4.X.
+
+## V1: FMP tool invocations
+SQL: `scripts/queries/v1-fmp-tool-invocations.sql` — counts `mcp__equities__%` tool calls in last 5 min.
+
+## V2: Code-execution models M46-M58
+SQL: `scripts/queries/v2-code-execution-models.sql` — per-model invocation/success/duration distribution.
+
+## V3: Cloud Trace SubagentStart
+Manual: filter Cloud Trace for `attribute.agent_type='equity-analyst' AND attributes.stage='research_support'`. Skill prints filter URL.
+
+## V4: Citation verifier accepts FMP
+Manual: grep latest memo's `qa-outputs/citation-verification-certificate.md` for `financialmodelingprep.com` entries. Skill prints path.
+
+## Container env audit
+Required: `OTEL_TRACES_SAMPLER`, `OTEL_TRACES_SAMPLER_ARG`, `FMP_ENABLED`, `COMMIT_SHA`. Optional: `FMP_API_KEY`, `BCRYPT_ROUNDS`.
+
+## bridge_metadata.git_sha probe
+SQL: `scripts/queries/t3-bridge-metadata-git-sha.sql`. Pass: single row with real SHA. Fail: 'unknown' = COMMIT_SHA build arg missed.
+
+## Why these checks
+
+- **V1/V2**: confirms FMP routing is live (when FMP_ENABLED=true). Distinguishes "no recent memo" from "FMP_API_KEY broken" via `claude_api_client_results_total{fetch_source}` distribution.
+- **V3**: confirms OTel sampler is sampling enough that equity-analyst spans land in Cloud Trace
+- **V4**: confirms citation_websearch_verifier accepts FMP URL patterns (regex match)
+- **bridge_metadata.git_sha**: confirms regulator-replay envelope is intact (EU AI Act Art. 15)