Merged
11 changes: 6 additions & 5 deletions .claude/skills/api-integration/SKILL.md
@@ -7,12 +7,13 @@ description: Build and integrate a new API client into the Super Legal MCP platf

## Overview

This skill integrates a new API data source into the Super Legal MCP platform following the canonical pattern used by all 36 existing clients. The process produces a fully operational hybrid client with native-first routing, Exa two-phase fallback (search + /contents enrichment), circuit breaker protection, caching, observability, and frontend catalog display.
This skill integrates a new API data source into the Super Legal MCP platform following the canonical pattern used by all 38 existing clients. The process produces a fully operational hybrid client with native-first routing, Exa two-phase fallback (search + /contents enrichment), circuit breaker protection, caching, observability, and frontend catalog display.

**Current platform state** (update these counts after each integration):
- API clients: 36
- Base MCP domains: 33 (+conditional: code-execution, direct-fetch, exa-search)
- Tool schemas: 149+
**Current platform state** (v7.0.1; update these counts after each integration):
- API clients: 38 (+FMP equity-research gated by FMP_ENABLED, +DirectFetch)
- Base MCP domains: 34 (+conditional: code-execution, direct-fetch, exa-search, equities)
- Tool schemas: 197 (161 base + 36 FMP equity tools when FMP_ENABLED=true)
- For high-precision integrations (live financial data, regulatory APIs), follow the FMP-derived empirical-first methodology — see Phase 1.5 (Empirical Capture & Probing) and Phase 1.6 (Endpoint Classification) below.
- Production entry point: `Dockerfile:59` → `bootstrap.js` → `claude-sdk-server.js` → `clientRegistry.js`
- `EnhancedLegalMcpServer.js` is legacy/local-dev only — do NOT wire new clients there

17 changes: 15 additions & 2 deletions .claude/skills/client-backup-restore/SKILL.md
@@ -187,10 +187,23 @@ gcloud sql backups restore {backup_id} \

| Type | What's included | Size estimate | Duration |
|---|---|---|---|
| `full` | Database + reports directory | ~150-300 MB | 3-5 min |
| `database-only` | Cloud SQL export (all tables + data) | ~30-80 MB | 1-2 min |
| `full` | Database + reports directory | ~200-400 MB (v7.0.0+: includes transcript_events ~700KB-1MB/session) | 3-6 min |
| `database-only` | Cloud SQL export (all tables + data) | ~30-100 MB (larger on v7.0.0+ deployments due to transcript_events) | 1-3 min |
| `reports-only` | Reports directory (sessions, raw sources) | ~100-250 MB | 2-4 min |

**v7.0.0+ database-only scope** — Cloud SQL export captures all tables, including the new compliance/observability tables introduced in v7.0.0:
- `transcript_events` — full SSE event history per session (~700KB-1MB per session × N sessions). **Largest growth vector** at 10K+ sessions.
- `code_executions` (now with 13+ reproducibility columns: model_id, llm_name, anthropic_request_id, anthropic_message_id, system_prompt_hash, python_code, container_id, tool_use_id, stop_reason, turn_count, pause_count, refusal_detected, etc.) — required for byte-replay envelope per EU AI Act Art. 15
- `code_execution_inputs` — data lineage junction (small table, 1-5 rows per execution)
- `citation_source_links` — citation→source bridge with confidence scores (1 row per matched citation)
- `hook_audit_log` — now includes `bridge_metadata` JSONB column with `git_sha + sdk_version + container_id + system_prompt_hash` (regulator-replay envelope)
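
A back-of-envelope sizing sketch for `transcript_events`, assuming only the per-session range quoted above (700 KB to 1 MB, with 1 MB taken as 1024 KB). The helper name `estimate_transcript_mb` is hypothetical and not part of the backup scripts.

```bash
# Rough sizing for transcript_events from the ~700KB-1MB per-session range.
# estimate_transcript_mb is a hypothetical helper for illustration only.
estimate_transcript_mb() {
  local sessions=$1
  local low_mb=$(( sessions * 700 / 1024 ))    # lower bound: 700 KB per session
  local high_mb=$(( sessions * 1024 / 1024 ))  # upper bound: 1 MB per session
  echo "${low_mb}-${high_mb} MB"
}

estimate_transcript_mb 10000
```

For 10,000 sessions this prints `6835-10000 MB`, roughly 7 to 10 GB, which suggests the size estimates in the table above only hold for deployments with a few hundred sessions.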

Restore verification (Phase 4) should confirm these row counts post-restore for v7.0.0+ deployments:
- `SELECT COUNT(*) FROM transcript_events` matches pre-backup count
- `SELECT COUNT(*) FROM code_executions WHERE model_id IS NOT NULL` matches pre-backup count (NULL model_id = pre-v6.8.4 row, allowed)
- `SELECT COUNT(*) FROM citation_source_links` matches pre-backup count
- `SELECT event_data->'bridge_metadata' IS NOT NULL FROM hook_audit_log WHERE tool_name='run_python_analysis'` — bridge_metadata preserved on restore
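
One way to automate the comparison is to save the counts before backup and again after restore, then diff the two snapshots. The sketch below assumes each snapshot is a file of `label|count` lines (the shape `psql -At` produces for a two-column query); `compare_counts` and the file names are hypothetical.

```bash
# Example capture (run once before backup, once after restore):
#   psql "$PG_CONNECTION_STRING" -At \
#     -c "SELECT 'transcript_events', COUNT(*) FROM transcript_events" >> pre.txt

# compare_counts PRE_FILE POST_FILE
# Prints one line per discrepancy and returns non-zero if any count differs.
# Assumes PRE_FILE is non-empty (the NR == FNR idiom needs a first file with rows).
compare_counts() {
  awk -F'|' '
    NR == FNR { pre[$1] = $2; next }   # first file: remember pre-backup counts
    {
      seen[$1] = 1                      # second file: compare against the snapshot
      if (!($1 in pre) || pre[$1] != $2) {
        printf "MISMATCH %s pre=%s post=%s\n", $1, pre[$1], $2
        bad = 1
      }
    }
    END {
      for (k in pre) if (!(k in seen)) { printf "MISSING %s\n", k; bad = 1 }
      exit bad
    }' "$1" "$2"
}
```

A label that appears only in the pre-backup snapshot is reported as `MISSING`, which covers a table that failed to restore at all.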

## Storage Locations

All backups stored in the client's WORM bucket:
7 changes: 6 additions & 1 deletion .claude/skills/client-offboarding/SKILL.md
@@ -55,7 +55,12 @@ bash /Users/ej/Super-Legal/.claude/skills/client-offboarding/scripts/offboard-cl

**Step 7**: Verify archives — checks that archive files exist in GCS and have non-zero size. Reports checksums.

**Step 6.5**: Archive Wave 3 audit tables as dedicated CSV artifacts — `access_log` (EU AI Act Article 12 read-side evidence) and `human_interventions` (EU AI Act Article 14 operator governance evidence) exported via `psql COPY TO STDOUT` + gzip to `gs://super-legal-worm-{client_id}/archive/{table}-{date}.csv.gz`. Cloud SQL's native `gcloud sql export csv` doesn't support table-level `--query` filtering, so the script uses psql directly against the connection string resolved from Secret Manager. Runs AFTER archive verification (Step 7) and BEFORE any destructive deletion (Phase 3) — these tables must survive the DB drop as standalone legal records. Legacy clients predating Wave 3 (tables don't exist) gracefully skip via `2>/dev/null || warn`. v6.5.1+ instances also have `hook_audit_log WHERE event_type = 'KGBuild'` entries — include in archive for KG build audit trail. Requires v6.6.0+ for complete telemetry (background task tracking, pool survival).
**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
- `transcript_events` — full SSE event history per session (~700KB-1MB per session × N sessions; may be largest archive file). Required for byte-faithful session-reload audit if regulator queries any session history.
- `citation_source_links` — citation→raw-source bridge with confidence scores. Required for hallucination audit (any citation with `confidence < 0.85` flagged for QA review at session-time should be reproducible from this archive).
- `code_execution_inputs` — data lineage junction linking code executions to upstream subagent reports/embeddings/KG nodes. Required for EU AI Act Art. 15 reproducibility chain ("which subagent's output drove this DCF result?").

All exported via `psql COPY TO STDOUT` + gzip to `gs://super-legal-worm-{client_id}/archive/{table}-{date}.csv.gz`. Cloud SQL's native `gcloud sql export csv` doesn't support table-level `--query` filtering, so the script uses psql directly against the connection string resolved from Secret Manager. Runs AFTER archive verification (Step 7) and BEFORE any destructive deletion (Phase 3) — these tables must survive the DB drop as standalone legal records. Legacy clients predating each table gracefully skip via `2>/dev/null || warn`. v6.5.1+ instances also have `hook_audit_log WHERE event_type = 'KGBuild'` entries — include in archive for KG build audit trail. **v7.0.0+ instances** have `hook_audit_log` rows with `bridge_metadata` JSONB (`git_sha + sdk_version + container_id + system_prompt_hash`) — these are the regulator-replay envelope and MUST be preserved in the audit log archive. Requires v6.6.0+ for complete telemetry (background task tracking, pool survival).
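
The export described above could look roughly like the following. This is a sketch under stated assumptions, not the actual offboarding script: `archive_uri` and `export_table` are hypothetical names, the `YYYY-MM-DD` date format is a guess at `{date}`, and the connection string is assumed to have been resolved from Secret Manager already.

```bash
# archive_uri CLIENT_ID TABLE DATE
# Builds the destination object name from the pattern quoted above.
archive_uri() {
  printf 'gs://super-legal-worm-%s/archive/%s-%s.csv.gz\n' "$1" "$2" "$3"
}

# export_table CONN CLIENT_ID TABLE
# Dumps one table to a temp file first so a failed COPY never uploads an
# empty archive, then gzips and streams it to GCS. Tables that do not exist
# (legacy clients) produce a warning instead of aborting the run.
export_table() {
  local conn=$1 client_id=$2 table=$3 tmp dest
  tmp=$(mktemp)
  dest=$(archive_uri "$client_id" "$table" "$(date +%Y-%m-%d)")
  if psql "$conn" -c "COPY ${table} TO STDOUT WITH (FORMAT csv, HEADER)" > "$tmp" 2>/dev/null; then
    gzip -c "$tmp" | gsutil cp - "$dest"
  else
    echo "WARN: skipped ${table} (table missing or export failed)" >&2
  fi
  rm -f "$tmp"
}

archive_uri acme transcript_events 2026-05-06
```

Writing to a temp file before uploading is a deliberate choice here: in a plain `psql | gzip | gsutil` pipeline the exit status comes from the last command, so a missing table would still upload a valid but empty `.csv.gz`.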

### Phase 3: Resource Deletion (DESTRUCTIVE — requires --confirm)

4 changes: 2 additions & 2 deletions .claude/skills/code-execution-models/SKILL.md
@@ -1,6 +1,6 @@
---
name: code-execution-models
description: Add new financial models to the code execution sandbox catalog. Use when the user asks to "add a model", "create a financial model", "add [model name] to code execution", "new analysis model", or wants to expand the PE/IB/M&A quantitative analysis toolkit. The sandbox runs Claude-generated Python (pandas, numpy, scipy, sklearn, matplotlib, seaborn) via the Anthropic code_execution_20260120 tool to produce structured JSON results, charts (PNG), and formatted tables. Currently 45 models across 13 categories. Also use when the user says "/code-execution-models".
description: Add new financial models to the code execution sandbox catalog. Use when the user asks to "add a model", "create a financial model", "add [model name] to code execution", "new analysis model", or wants to expand the PE/IB/M&A quantitative analysis toolkit. The sandbox runs Claude-generated Python (pandas, numpy, scipy, sklearn, matplotlib, seaborn) via the Anthropic code_execution_20260120 tool to produce structured JSON results, charts (PNG), and formatted tables. Currently 56 models across 13 categories (M46–M55, M58 added in v7.0.0 for FMP equity research, gated by FMP_ENABLED). Also use when the user says "/code-execution-models".
---

# Code Execution Models — Add Financial Analysis Model
@@ -159,7 +159,7 @@ Append to the `CODE_EXECUTION_MODELS` array after the last entry:
}
```

**Format guidelines** (match existing 45 models):
**Format guidelines** (match existing 56 models):
- `description`: 100-300 words, business-context-rich, mentions charts/tables produced
- `methodology`: cites specific standards (ASC, IRC, academic papers) with thresholds
- `outputFormat`: explicitly states chart types and table formats to generate
36 changes: 36 additions & 0 deletions .claude/skills/deploy/SKILL.md
@@ -137,6 +137,42 @@ gcloud compute instances add-access-config $INSTANCE --zone=us-east1-c --access-
gcloud compute ssh $INSTANCE --zone=us-east1-c --command='docker restart $(docker ps -q | head -1)'
```

### Variant: MIG instance replacement mid-retries

**Observed on**: 2026-05-06 v7.0.1 deploy.

**Symptom**: Step 7 retries 5x with `Could not fetch resource: super-legal-staging-XXXX` even though the script logs `IP is RESERVED` and proceeds to `Attempt 1/5: Assigning ...`. The log line keeps showing the SAME instance name across all 5 attempts. Meanwhile, `gcloud compute instances list` reveals a DIFFERENT instance name is actually running.

**Root cause**: The MIG terminated the instance the script was targeting (e.g., `super-legal-staging-0239`) and rolled forward to a new one (e.g., `super-legal-staging-bzx4`) DURING step 7's retry budget. The script captured the original instance name in step 6 and did not re-resolve it on each retry. Every `add-access-config` call hits a deleted resource.

**Detection between retries**:
```bash
gcloud compute instances list --filter='name~super-legal-staging AND status=RUNNING' --format='value(name)'
```
If this returns a different instance name than what the script's log shows, the variant has triggered.

**Manual recovery on the new instance**:
```bash
NEW_INSTANCE=$(gcloud compute instances list --filter='name~super-legal-staging AND status=RUNNING' --format='value(name)' | head -1)
gcloud compute instances delete-access-config $NEW_INSTANCE --zone=us-east1-c --access-config-name=external-nat --quiet
sleep 10
gcloud compute instances add-access-config $NEW_INSTANCE --zone=us-east1-c --access-config-name=external-nat --address=34.26.70.60 --quiet
sed -i '' '/compute\./d' ~/.ssh/google_compute_known_hosts
gcloud compute ssh $NEW_INSTANCE --zone=us-east1-c --command='docker restart $(docker ps -q | head -1)'
```

Wait 60s, then verify via `curl http://34.26.70.60:3001/health`.

**Future deploy.sh hardening** (not yet implemented): Step 7's retry loop should re-resolve the instance name on each attempt:
```bash
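for attempt in 1 2 3 4 5; do
  # Re-resolve on every attempt: the MIG may have replaced the instance since the last try
  INSTANCE=$(gcloud compute instances list --filter='name~super-legal-staging AND status=RUNNING' --format='value(name)' | head -1)
  # Skip the assignment when no instance is RUNNING yet; stderr goes to the file ./err
  [ -n "$INSTANCE" ] && gcloud compute instances add-access-config "$INSTANCE" --zone=us-east1-c --access-config-name=external-nat --address=34.26.70.60 --quiet 2>err && break
  sleep 30
done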
```
Filed as v7.0.x follow-up code change.

### Docker push transient broken pipe

**Observed on**: 2026-04-27 v6.7.0 deploy.
4 changes: 2 additions & 2 deletions .claude/skills/infrastructure-health/SKILL.md
@@ -2,7 +2,7 @@
name: infrastructure-health
description: >
Tiered infrastructure health monitoring for Super Legal MCP platform. Monitors GCE instances,
PostgreSQL/pgvector, Anthropic API circuit breakers, 36 API clients, Gemini embedding
PostgreSQL/pgvector, Anthropic API circuit breakers, 38 API clients (incl. FMP equity-research, gated), Gemini embedding
service, memory trends, EPO OAuth tokens, Prometheus alerts, session hygiene, API key expiration,
Docker image drift, and dependency vulnerabilities. Triggers on: "infrastructure health",
"health check", "infra status", "system health", "check infrastructure", "run health checks",
@@ -135,7 +135,7 @@ Read these subskill references:
- [references/dependency-vulnerabilities.md](references/dependency-vulnerabilities.md) — npm audit

### Execution
1. Fetch `<base_url>/metrics` and check for circuit breaker trips, high error rates. Wave 4 metrics to verify: `claude_subagent_duration_ms`, `claude_api_client_results_total` (check for `outcome="zero_results"`), `claude_document_conversion_duration_ms`, `claude_document_conversion_errors_total`, `claude_embedding_duration_ms`, `claude_gate_check_results_total`, `claude_kg_build_total` (check for `status="error"` or `status="skipped_breaker"`), `claude_kg_build_duration_ms`
1. Fetch `<base_url>/metrics` and check for circuit breaker trips, high error rates. Wave 4 metrics to verify: `claude_subagent_duration_ms`, `claude_api_client_results_total` (check for `outcome="zero_results"` and `fetch_source` distribution — `exa_fallback` dominating for FMP tools indicates `FMP_API_KEY` issues), `claude_document_conversion_duration_ms`, `claude_document_conversion_errors_total`, `claude_embedding_duration_ms`, `claude_gate_check_results_total`, `claude_kg_build_total` (check for `status="error"` or `status="skipped_breaker"`), `claude_kg_build_duration_ms`. **v7.0.0 metrics to verify**: `claude_hook_persistence_failures_total` (any non-`unknown` reason = data loss vector), `claude_hook_circuit_breaker_state` (any value ≥2 = persistence skipping), `claude_code_execution_failures_total` by reason, `claude_hook_invocations_total` (success path counter — should grow during active sessions), `claude_tool_invocations_v2_total` (replaces deprecated v1; verify both still emitting during dual-emission window). **OTel sampler check**: container env `OTEL_TRACES_SAMPLER_ARG` — `1.0` indicates verification window, `0.1` is steady-state. See `references/prometheus-alerts.md` for full alert rule + remediation table.
2. Run `scripts/pg-health.sh` for session hygiene and table sizes
3. Calculate days until SAM_GOV_API_KEY expiry (set 2026-02-11, 90-day lifetime → ~2026-05-12)
4. Run `scripts/docker-drift.sh` (requires gcloud auth — skip gracefully if unavailable)
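
The two v7.0.0 thresholds named in step 1 (breaker state of 2 or higher, and persistence failures with a reason other than `unknown`) can be checked mechanically. The sketch below assumes the Prometheus text exposition format without timestamps, so the sample value is the last field on each line; the function name `check_hook_metrics` and the `hook` label in the test data are hypothetical.

```bash
# check_hook_metrics: reads Prometheus text-format metrics on stdin.
# Prints an ALERT line per offending series and returns non-zero if any is found.
# Typical use (BASE_URL is a placeholder): curl -s "$BASE_URL/metrics" | check_hook_metrics
check_hook_metrics() {
  awk '
    /^#/ { next }   # skip HELP/TYPE comment lines
    /^claude_hook_circuit_breaker_state/ && $NF + 0 >= 2 {
      print "ALERT breaker: " $0; bad = 1        # state >= 2 means persistence is being skipped
    }
    /^claude_hook_persistence_failures_total/ && $0 !~ /reason="unknown"/ && $NF + 0 > 0 {
      print "ALERT persistence: " $0; bad = 1    # any non-unknown reason is a data loss vector
    }
    END { exit bad }'
}
```

Because the function only reads stdin, it can be run against a saved scrape as well as a live endpoint, which makes it usable when the instance is not directly reachable.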
13 changes: 10 additions & 3 deletions .claude/skills/infrastructure-health/references/postgresql.md
@@ -1,7 +1,10 @@
# PostgreSQL Health — Subskill Reference

**Version**: v7.0.1 (2026-05-06)

## Connection
- Pool max: `PG_POOL_MAX` env (default: 10)
- Pool max: `PG_POOL_MAX` env (default: **15** — bumped from 10 in v7.0.0 for 33% burst margin during simultaneous live stream + 3-rebuild reconciliation + transcript flush)
- `statement_timeout`: 120,000 ms (preserved — extending was found unnecessary and risky during v6.8.0 audit)
- Connection string: `PG_CONNECTION_STRING` or `DATABASE_URL`
- Extension: pgvector (required when `EMBEDDING_PERSISTENCE=true`)

@@ -10,11 +13,15 @@
|-------|---------|----------------|
| sessions | Session tracking | 1 row per pipeline run |
| reports | Report versions | ~10-20 rows per session |
| hook_audit_log | Agent activity audit | 100-500 rows per session |
| hook_audit_log | Agent activity audit | 100-500 rows per session; v7.0.0 adds `bridge_metadata` JSONB + `tool_use_id` columns |
| code_executions | Per-`run_python_analysis` execution audit | 1 row per code execution; **v7.0.0 adds reproducibility columns** (model_id, llm_name, anthropic_request_id, anthropic_message_id, input/output/cache tokens, system_prompt_hash, python_code, container_id, tool_use_id, stop_reason, turn_count, pause_count, refusal_detected) |
| code_execution_inputs | **v7.0.0** — data lineage junction linking each code execution to upstream subagent reports/embeddings/KG nodes | 1-5 rows per code execution |
| transcript_events | **v7.0.0** — full-fidelity SSE event capture (`migrations/012_transcript-events.up.sql`); buffered batch insert | ~4,000-6,000 rows per 30-50 min session; **~700KB-1MB storage per session** |
| citation_source_links | **v7.0.0** — citation→source bridge with fuzzy matching (URL exact / URL fuzzy / title fuzzy / embedding cosine) + confidence score | 1 row per memo footnote matched |
| report_embeddings | pgvector embeddings | ~50-100 chunks per report |
| agent_states | Agent lifecycle | ~40 rows per session |
| source_writes | Wave 3 WAL — raw source persistence reconciliation | 1 row per raw source capture; hourly reconciler |
| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget) |
| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged |
| human_interventions | Wave 3 — EU AI Act Art. 14 operator governance audit | 0-5 rows per session (admin actions only) |
| pii_mappings | Wave 3 — GDPR Art. 17 pseudonymization backing store | 0-N per session when PII detected |
