Merged
2 changes: 2 additions & 0 deletions .claude/skills/client-backup-restore/SKILL.md
@@ -196,12 +196,14 @@ gcloud sql backups restore {backup_id} \
- `code_executions` (now with 13+ reproducibility columns: model_id, llm_name, anthropic_request_id, anthropic_message_id, system_prompt_hash, python_code, container_id, tool_use_id, stop_reason, turn_count, pause_count, refusal_detected, etc.) — required for byte-replay envelope per EU AI Act Art. 15
- `code_execution_inputs` — data lineage junction (small table, 1-5 rows per execution)
- `citation_source_links` — citation→source bridge with confidence scores (1 row per matched citation)
- `citation_verdicts` — per-footnote G5 verdicts (v6.8.6 T1, PR #122). 1 row per verified footnote; ~300-500 rows per memo session that ran citation-websearch-verifier. FK ON DELETE CASCADE on reports + sessions — backed up automatically via pg_dump; no manual handling needed.
- `hook_audit_log` — now includes `bridge_metadata` JSONB column with `git_sha + sdk_version + container_id + system_prompt_hash` (regulator-replay envelope)

Restore verification (Phase 4) should confirm these row counts post-restore for v7.0.0+ deployments:
- `SELECT COUNT(*) FROM transcript_events` matches pre-backup count
- `SELECT COUNT(*) FROM code_executions WHERE model_id IS NOT NULL` matches pre-backup count (NULL model_id = pre-v6.8.4 row, allowed)
- `SELECT COUNT(*) FROM citation_source_links` matches pre-backup count
- `SELECT COUNT(*) FROM citation_verdicts` matches pre-backup count (zero rows is acceptable for sessions that ran before v6.8.6 OR that never invoked citation-websearch-verifier)
- `SELECT event_data->'bridge_metadata' IS NOT NULL FROM hook_audit_log WHERE tool_name='run_python_analysis'` — bridge_metadata preserved on restore
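
The count checks above amount to comparing a snapshot taken before the backup with one taken after the restore. A minimal sketch of that comparison follows; `compareRowCounts` is a hypothetical helper (not part of the skill), and the counts are made-up illustration values that would in practice come from the `SELECT COUNT(*)` queries listed above.

```javascript
// Hypothetical helper: compares per-table row counts captured before the
// backup with counts measured after the restore.
function compareRowCounts(preBackup, postRestore) {
  const mismatches = [];
  for (const [table, expected] of Object.entries(preBackup)) {
    const actual = postRestore[table];
    if (actual !== expected) {
      // A table missing from the post-restore snapshot is reported as null.
      mismatches.push({ table, expected, actual: actual ?? null });
    }
  }
  return { ok: mismatches.length === 0, mismatches };
}

// Illustrative numbers only. Zero citation_verdicts rows is acceptable as
// long as the pre-backup count was also zero.
const pre = { transcript_events: 1200, citation_source_links: 340, citation_verdicts: 0 };
const post = { transcript_events: 1200, citation_source_links: 339, citation_verdicts: 0 };

const result = compareRowCounts(pre, post);
console.log(result.ok, result.mismatches);
```

Comparing against a stored pre-backup snapshot, rather than against a fixed expected number, keeps the check valid for sessions that legitimately have zero verdict rows.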

## Storage Locations
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added — G5 citation-verifier observability T1+T2 (v6.8.6 / v6.8.7, 2026-05-12, PRs [#122](https://github.com/Number531/Legal-API/pull/122) + [#124](https://github.com/Number531/Legal-API/pull/124))

Two-tier observability remediation closing the regulator gap (T1) and ops/SLO gap (T2) on the G5 citation-verifier subagent. Built on the production-fidelity A/B baseline established the same day (Exa 96.8% / Anthropic 96.1%, PRs [#118](https://github.com/Number531/Legal-API/pull/118) + [#119](https://github.com/Number531/Legal-API/pull/119)).

**v6.8.6 T1 — regulator persistence (PR [#122](https://github.com/Number531/Legal-API/pull/122))**:
- New `citation_verdicts` junction table (dual-path: `migrations/015_*.sql` + `CITATION_VERDICTS_DDL` in postgres.js + `ensureHookSchema()` call). FK ON DELETE CASCADE on reports + sessions; UNIQUE (report_id, footnote_id) for idempotent upsert; 3 indexes.
- Parser promoted: `test/sdk/_lib/certificateParser.mjs` → `src/utils/certificateParser.js`.
- Fire-and-forget batch INSERT in `persistReport` mirrors Wave 2 `citation_source_links` pattern (single round-trip for ~500 footnotes).
- `/api/session/:sessionKey/audit-report` v1.1 returns `citation_verification_certificate` (full markdown + verdict_summary) and `citation_verdicts` (per-footnote array). Access logged to `access_log`.
- `client-audit-export` skill ships `citation_verdicts__csv.gz` + `citation_verification_certificate__csv.gz` in the per-client regulator-handoff WORM bundle.
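
The idempotent batch upsert described in the T1 bullets can be sketched as a single parameterized statement. This is an illustration under stated assumptions, not the code in `persistReport`: `buildVerdictUpsert` is a hypothetical name, and the column list is inferred from the `citation_verdicts` entries in the audit-report response, so the real DDL in `migrations/015_*.sql` may include other columns (for example a session reference).

```javascript
// Assumed column list, inferred from the audit-report response shape.
const VERDICT_COLUMNS = [
  "report_id", "footnote_id", "footnote_row", "citation_text",
  "source_type", "verification_method", "verdict", "paywalled", "notes",
];

// Builds one multi-row INSERT ... ON CONFLICT statement so that all
// footnotes of a report are written in a single round-trip.
function buildVerdictUpsert(reportId, verdicts) {
  if (verdicts.length === 0) return null; // nothing to persist

  const values = [];
  const tuples = verdicts.map((v) => {
    const row = [
      reportId, v.footnote_id, v.footnote_row, v.citation_text,
      v.source_type, v.verification_method, v.verdict, v.paywalled, v.notes,
    ];
    const placeholders = row.map((value) => {
      values.push(value ?? null); // undefined becomes SQL NULL
      return `$${values.length}`;
    });
    return `(${placeholders.join(", ")})`;
  });

  // Every column outside the unique key is overwritten on conflict.
  const updates = VERDICT_COLUMNS
    .filter((c) => c !== "report_id" && c !== "footnote_id")
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(", ");

  const text =
    `INSERT INTO citation_verdicts (${VERDICT_COLUMNS.join(", ")}) ` +
    `VALUES ${tuples.join(", ")} ` +
    `ON CONFLICT (report_id, footnote_id) DO UPDATE SET ${updates}`;
  return { text, values };
}

const sampleVerdicts = [
  { footnote_id: "^1", footnote_row: 1, citation_text: "A v. B", source_type: "case law",
    verification_method: "Exa fetch_document", verdict: "CONFIRMED", paywalled: false, notes: "" },
  { footnote_id: "^2", footnote_row: 2, citation_text: "C v. D", source_type: "case law",
    verification_method: "Exa fetch_document", verdict: "CONFIRMED", paywalled: false },
];
const upsert = buildVerdictUpsert("report-1", sampleVerdicts);
console.log(upsert.text);
```

Because the conflict target matches the `UNIQUE (report_id, footnote_id)` constraint, re-running the same batch rewrites the existing rows instead of duplicating them. PostgreSQL rejects an `ON CONFLICT DO UPDATE` batch that contains the same key twice, so a caller would need to deduplicate footnotes before building the statement.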

**v6.8.7 T2 — telemetry + alerts (PR [#124](https://github.com/Number531/Legal-API/pull/124))**:
- 4 Prometheus series in sdkMetrics.js: `citation_verifier_confirmation_rate_pct` (Gauge, `mode` label), `citation_verifier_{confirmed,unconfirmed,errors}_total` (Counters). 13 series total; bounded enums prevent explosion.
- Recording in `hookDBBridge.persistState()` from `state_data.verification_results` (in-hand at SubagentStop; no race with T1's fire-and-forget INSERT).
- Structured log `event=citation_verifier_completed` with counts/mode/duration/turns/tool-call counts.
- 3 alert rules in `prometheus/alerts.yml`: confirmation rate <90% sustained for 1h (WARN), <80% sustained for 30m (CRIT), and an error spike of >50 errors in 15m (WARN). Thresholds are calibrated against the measured baseline with a 7pp WARN margin.
- Bundled fix: corrected pre-existing `access_log` SELECT in audit-report endpoint that queried non-existent columns (`actor`/`action`/`accessed_at`) — these had been silently failing via `.catch()` since Wave 3 shipped.
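
To make the alert thresholds concrete, the severity bands can be expressed as a small pure function. This is only a sketch: the real rules live in `prometheus/alerts.yml` as PromQL, the hold durations (1h and 30m) are enforced by Prometheus and are not modelled here, and `classifyConfirmationRate` is a hypothetical name.

```javascript
// Severity bands taken from the alert rules listed above.
// Duration windows are intentionally ignored in this sketch.
function classifyConfirmationRate(ratePct) {
  if (ratePct < 80) return "CRIT";
  if (ratePct < 90) return "WARN";
  return "OK";
}

// The two baseline figures quoted in this changelog entry.
for (const baseline of [96.8, 96.1]) {
  console.log(baseline, classifyConfirmationRate(baseline));
}
```

Both measured baselines sit roughly 6 to 7 percentage points above the WARN boundary, which is the margin the changelog entry refers to.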

**Operator skills aligned (PR [#125](https://github.com/Number531/Legal-API/pull/125))**:
- `post-deploy-verify` Tier 2 — new V6 check probes `/metrics` for 4 new series + companion `v6-citation-verdicts-presence.sql`.
- `infrastructure-health` Tier 3 — `citation_verifier_*` PromQL guidance + new `references/citation-verifier-telemetry.md` runbook (PromQL dashboards, Cloud Logging schema, T1↔T2 reconciliation SQL, alert response by severity).
- `session-diagnostics` — new Section 11 (G5 forensics) with 5 forensic SQL queries (verdict distribution, state-vs-table reconciliation, per-batch failure pattern, citation→source provenance).

**Risk**: 3/10 combined (2/10 T1 + 1/10 T2). Pattern mirrors Wave 2 + Exa A3 telemetry — battle-tested. Independent review on both PRs surfaced zero blockers.

**Pre-existing bug surfaced (not in scope)**: schema-doc-validator caught that `source_chunk_embeddings` does NOT have a `metadata` JSONB column despite Wave 2 code at `hookDBBridge.js:454` assuming it does. The `.catch()` masks the error; citation-source-link enrichment silently falls back to bare candidates. Flagged as separate investigation.

### Added — Citation-verifier A/B test harnesses (test-only, 2026-05-12, PRs [#118](https://github.com/Number531/Legal-API/pull/118) + [#119](https://github.com/Number531/Legal-API/pull/119))

Empirical validation of the production `EXA_WEB_TOOLS=true` config (live in `flags.env` since 2026-04-18 PR #76 but never directly measured). Test-only; no production code touched.
2 changes: 1 addition & 1 deletion super-legal-mcp-refactored/README.md
@@ -611,7 +611,7 @@ End-to-end machinery for EU AI Act Art. 12 (logging), Art. 13 (transparency), Ar

**Reproducibility metadata** (`hook_audit_log.event_data->'bridge_metadata'`): `sdk_version`, `git_sha`, `beta_headers`, `bridge_version`, `otel_trace_id` — enables byte-exact replay of historical executions.

-**Regulator-facing endpoint**: `GET /api/session/:sessionKey/audit-report` aggregates 5 audit tables (code_executions, bridge_metadata, citations, human_interventions, access_log) as JSON or CSV (`?format=csv`). Auth via existing `cookieAuth`. See [docs/runbooks/v6.8.5-audit-export.md](docs/runbooks/v6.8.5-audit-export.md) for operator workflow.
+**Regulator-facing endpoint**: `GET /api/session/:sessionKey/audit-report` (v1.1 since v6.8.6 T1, PR [#122](https://github.com/Number531/Legal-API/pull/122)) aggregates **8 audit surfaces** as JSON or CSV (`?format=csv`): session lifecycle, `code_executions`, `bridge_metadata`, `citations` (citation_source_links), `citation_verification_certificate` (full G5 markdown + verdict_summary), `citation_verdicts` (per-footnote array), `human_interventions`, and `access_log`. Auth via existing `cookieAuth`. See [docs/runbooks/v6.8.5-audit-export.md](docs/runbooks/v6.8.5-audit-export.md) and [docs/api-reference.md §3](docs/api-reference.md) for operator workflow + full response shape.

**PII redaction** (GDPR Art. 17): `redactSessionEventData(sessionId)` in `src/utils/retentionManager.js` walks `hook_audit_log.event_data` JSONB and scrubs sensitive paths (tool_input.{code,text,url,query,email,…}, error.context.*, python_code, stdout, raw_output) with `[REDACTED]` markers. Idempotent. Audit shape preserved (regulator sees "an event happened" — not the content). Invoke before `DELETE FROM sessions` for compliance erasure.
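
The redaction walk described above can be illustrated with a small pure function. This is a sketch, not the code in `src/utils/retentionManager.js`: `redactEventData` is a hypothetical name, only the keys spelled out in the paragraph above are covered (the README's own list is elided with "…"), and treating `python_code`, `stdout`, and `raw_output` as top-level keys of `event_data` is an assumption.

```javascript
const REDACTED = "[REDACTED]";

// Only the keys named explicitly in the README; the real list is longer.
const TOOL_INPUT_KEYS = ["code", "text", "url", "query", "email"];
const TOP_LEVEL_KEYS = ["python_code", "stdout", "raw_output"];

// Replaces the value of each listed key that is present on obj.
function scrub(obj, keys) {
  if (obj === null || typeof obj !== "object") return;
  for (const key of keys) {
    if (key in obj) obj[key] = REDACTED;
  }
}

// Returns a redacted deep copy; the input object is left untouched.
function redactEventData(eventData) {
  const out = JSON.parse(JSON.stringify(eventData)); // JSONB is plain JSON
  if (out === null || typeof out !== "object") return out;

  scrub(out.tool_input, TOOL_INPUT_KEYS);
  scrub(out, TOP_LEVEL_KEYS);
  if (out.error && out.error.context && typeof out.error.context === "object") {
    // error.context.* : every key under context is scrubbed.
    scrub(out.error.context, Object.keys(out.error.context));
  }
  return out;
}

const event = {
  tool_name: "run_python_analysis",
  tool_input: { code: "print(client_name)", timeout: 30 },
  stdout: "Acme Corp",
  error: { message: "boom", context: { path: "/tmp/client.csv" } },
};
const once = redactEventData(event);
const twice = redactEventData(once);
console.log(JSON.stringify(once));
```

Keys are overwritten rather than deleted, so the event keeps its shape, and running the function on already-redacted data changes nothing, which is what makes it idempotent.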

35 changes: 31 additions & 4 deletions super-legal-mcp-refactored/docs/api-reference.md
@@ -175,22 +175,49 @@ Two access-audit middlewares wrap subsets of read paths:
{ "tool_use_id": "...", "bridge_metadata": { "git_sha": "...", "sdk_version": "...", "container_id": "...", "system_prompt_hash": "..." }, "created_at": "..." }
],
"citations": [
-{ "report_id": "...", "citation_marker": "[12]", "source_hash": "...", "match_method": "url_exact", "confidence_score": 1.00, "matched_at": "..." }
+{ "report_id": "...", "citation_marker": "[12]", "source_hash": "...", "confidence": 1.00, "matched_via": "url", "report_type": "memo", "report_key": "..." }
],
"citation_verification_certificate": {
"report_id": "uuid",
"agent_type": "citation-websearch-verifier",
"certificate_text": "## CERTIFICATION STATUS: PASS\n\n**Confirmation Rate:** 96.8% (358 confirmed / 370 verifiable footnotes)\n\n...",
"created_at": "...",
"word_count": 12450,
"verdict_summary": {
"total": 370,
"confirmed": 358,
"unconfirmed": 12,
"errors": 0,
"skipped": 0,
"pass_with_note": 0,
"paywalled": 0,
"confirmation_rate": 0.9676
}
},
"citation_verdicts": [
{ "footnote_id": "^43", "footnote_row": 43, "citation_text": "...", "source_type": "case law", "verification_method": "Exa fetch_document", "verdict": "CONFIRMED", "paywalled": false, "notes": "", "created_at": "..." }
],
"human_interventions": [],
"access_log": [
-{ "user_id": 42, "endpoint": "/api/db/sessions/.../report/...", "method": "GET", "accessed_at": "..." }
-]
+{ "requester": "user@example.com", "resource_type": "session_data", "resource_key": "...", "purpose_code": "regulator_audit", "ip_address": "...", "created_at": "..." }
+],
"generated_at": "...",
"report_version": "1.1"
}
```

**Response (404)**: `{ "error": "Session not found" }`
**Response (400)**: `{ "error": "Invalid session key format" }`
**Response (503)**: `{ "error": "Database not configured" }`
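
A consumer of the v1.1 response may want to sanity-check `verdict_summary` before relying on it. The sketch below recomputes `confirmation_rate` from the counts in the example response; `checkVerdictSummary` is a hypothetical helper, and using `total` as the denominator with four-decimal rounding is an assumption that happens to match the example (358 / 370).

```javascript
// Recomputes confirmation_rate from the summary counts and compares it with
// the reported value. Denominator and rounding are assumptions.
function checkVerdictSummary(summary) {
  const expected = summary.total > 0
    ? Math.round((summary.confirmed / summary.total) * 1e4) / 1e4
    : 0;
  const matches = Math.abs(expected - summary.confirmation_rate) < 1e-9;
  return { expected, matches };
}

// Values copied from the example response above.
const verdictSummary = {
  total: 370,
  confirmed: 358,
  unconfirmed: 12,
  errors: 0,
  skipped: 0,
  pass_with_note: 0,
  paywalled: 0,
  confirmation_rate: 0.9676,
};

const check = checkVerdictSummary(verdictSummary);
console.log(check.expected, check.matches);
```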

### Version history

- **v1.0** (initial) — sessions, code_executions, bridge_metadata, citations, human_interventions, access_log
- **v1.1** (v6.8.6 T1, PR [#122](https://github.com/Number531/Legal-API/pull/122)) — added `citation_verification_certificate` (full G5 markdown + parsed summary) and `citation_verdicts` (per-footnote verdict array sourced from `citation_verdicts` table). Closes EU AI Act Art. 12/13 query-reconstruction for the verification layer. `access_log` columns corrected to match `ACCESS_LOG_DDL` (requester/resource_type/resource_key/purpose_code/created_at) in v6.8.7 T2, PR [#124](https://github.com/Number531/Legal-API/pull/124).

**Note on PII**: redaction fires at offboarding-time via `retentionManager.redactSessionEventData()` per GDPR Art. 17, NOT at admin read-time. Regulators require full audit data; redaction is the erasure boundary, not the access boundary.

-**Reference**: `docs/runbooks/v6.8.5-audit-export.md` for response interpretation and CSV format.
+**Reference**: `docs/runbooks/v6.8.5-audit-export.md` for response interpretation and CSV format. T1+T2 cross-reference: `docs/metrics-catalog.md` §9.2 for the parallel Prometheus telemetry path.

### `GET /api/db/sessions/:sessionKey/transcript`
