diff --git a/.claude/skills/client-audit-export/SKILL.md b/.claude/skills/client-audit-export/SKILL.md
index 9000f07d1..bc8540ac4 100644
--- a/.claude/skills/client-audit-export/SKILL.md
+++ b/.claude/skills/client-audit-export/SKILL.md
@@ -42,12 +42,12 @@ The skill reuses `_shared/gcp-fleet-discover.sh` for multi-client discovery when
 | Table | Purpose | Sensitive? |
 |-------|---------|-----------|
 | `sessions` | session lifecycle, retention class, legal hold | rows include `pseudonym_id` references but no raw PII |
-| `code_executions` | code-execution metadata + git_sha + model_id | safe |
+| `code_executions` | code-execution metadata + git_sha + model_id. **PR #138**: bundle now includes `envelope_source` column (extraction-path provenance: `parsed_output`\|`text`\|`stdout`\|`merged:*`\|`none`) via `SELECT *` — regulator can trace which API path produced each artifact. | safe |
 | `code_execution_inputs` | content-hash lineage of inputs (Wave 2) | safe (hashes only) |
 | `hook_audit_log` | per-tool audit (incl. `event_data` JSONB with bridge metadata + EU AI Act Art. 12 A3 query reconstruction via `event_data.exa_a3` — see PR #114) | safe — strip raw user input |
 | `citations` | citation chain (URL, source_type) | safe |
 | `human_interventions` | retention/erasure/legal-hold actions (Art. 14) | safe |
-| `access_log` | RBAC audit trail (actor_user_id, endpoint, status) | safe |
+| `access_log` | RBAC audit trail (actor_user_id, endpoint, status). **PR #138**: bundle now includes `event_data JSONB` column via `SELECT *` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` for code-execution artifact accesses). Enables regulator queries like "show all accesses of fallback-extracted artifacts". | safe |
 | `pii_mappings` | pseudonym_id ↔ encrypted_value mapping | **DANGEROUS** — export `pseudonym_id` + `created_at` ONLY, NEVER `encrypted_value` |
 | `source_writes` | upstream API source provenance (Wave 2) | safe |
 | `citation_verdicts` | per-footnote G5 verification verdicts (v6.8.6 T1) — CONFIRMED/UNCONFIRMED/ERROR/SKIP/PASS_WITH_NOTE + verification method + paywalled flag + notes | safe |
diff --git a/.claude/skills/client-offboarding/SKILL.md b/.claude/skills/client-offboarding/SKILL.md
index 6a30cd1d7..f248fb556 100644
--- a/.claude/skills/client-offboarding/SKILL.md
+++ b/.claude/skills/client-offboarding/SKILL.md
@@ -55,7 +55,7 @@ bash /Users/ej/Super-Legal/.claude/skills/client-offboarding/scripts/offboard-cl
 
 **Step 7**: Verify archives — checks that archive files exist in GCS and have non-zero size. Reports checksums.
 
-**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
+**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence — **PR #138**: now includes `event_data JSONB` column carrying operational provenance per row, e.g., `{"execution_id","envelope_source","success"}` for code-execution accesses; archived via `SELECT *` so JSONB serializes inline to CSV as quoted JSON string) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
 - `transcript_events` — full SSE event history per session (~700KB-1MB per session × N sessions; may be largest archive file). Required for byte-faithful session-reload audit if regulator queries any session history.
 - `citation_source_links` — citation→raw-source bridge with confidence scores. Required for hallucination audit (any citation with `confidence < 0.85` flagged for QA review at session-time should be reproducible from this archive).
 - `code_execution_inputs` — data lineage junction linking code executions to upstream subagent reports/embeddings/KG nodes. Required for EU AI Act Art. 15 reproducibility chain ("which subagent's output drove this DCF result?").
diff --git a/.claude/skills/infrastructure-health/references/postgresql.md b/.claude/skills/infrastructure-health/references/postgresql.md
index b069f853c..5b249806d 100644
--- a/.claude/skills/infrastructure-health/references/postgresql.md
+++ b/.claude/skills/infrastructure-health/references/postgresql.md
@@ -18,10 +18,11 @@
 | code_execution_inputs | **v7.0.0** — data lineage junction linking each code execution to upstream subagent reports/embeddings/KG nodes | 1-5 rows per code execution |
 | transcript_events | **v7.0.0** — full-fidelity SSE event capture (`migrations/012_transcript-events.up.sql`); buffered batch insert | ~4,000-6,000 rows per 30-50 min session; **~700KB-1MB storage per session** |
 | citation_source_links | **v7.0.0** — citation→source bridge with fuzzy matching (URL exact / URL fuzzy / title fuzzy / embedding cosine) + confidence score | 1 row per memo footnote matched |
+| code_executions | bridge execution audit. **PR #138**: `envelope_source VARCHAR(30)` column + partial index `idx_code_exec_envelope_source` (excludes NULLs) — persists extraction-path decision from `selectEnvelopeWithFallback`. Values: `parsed_output`\|`text`\|`stdout`\|`none`\|`merged:parsed_output+stdout`\|`merged:text+stdout`. Pre-PR-138 rows = NULL (forward-only persistence). | 1 row per `runPythonAnalysis()` call |
 | report_embeddings | pgvector embeddings | ~50-100 chunks per report |
 | agent_states | Agent lifecycle | ~40 rows per session |
 | source_writes | Wave 3 WAL — raw source persistence reconciliation | 1 row per raw source capture; hourly reconciler |
-| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged |
+| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged. **PR #138**: `event_data JSONB NOT NULL DEFAULT '{}'::jsonb` column + GIN index `idx_access_log_event_data_gin` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` from `adminRouter.js` code_execution_json endpoint) for forensic JSONB containment queries. |
 | human_interventions | Wave 3 — EU AI Act Art. 14 operator governance audit | 0-5 rows per session (admin actions only) |
 | pii_mappings | Wave 3 — GDPR Art. 17 pseudonymization backing store | 0-N per session when PII detected |
 
diff --git a/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh b/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
index ffa9ea938..b909e48bc 100755
--- a/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
+++ b/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
@@ -106,6 +106,24 @@ else
   fi
 fi
 
+# ── V7 — PR #138 envelope-outcome counter (presence check) ───────────────────
+# Probes /metrics for claude_code_bridge_envelope_outcome_total registration.
+# Counter emits from inside runPythonAnalysis on every bridge call regardless of
+# caller_category. Presence ≠ correctness — see envelope-decision-debug-playbook.md
+# for value-based alerts. This check just confirms the counter is registered;
+# pre-PR-138 images won't have it.
+if [ -z "$T2_METRICS_BODY" ]; then
+  add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "$BASE_URL/metrics unreachable; cannot verify claude_code_bridge_envelope_outcome_total registration"
+else
+  V7_FOUND=$(echo "$T2_METRICS_BODY" | grep -c -E '^(# HELP|# TYPE) claude_code_bridge_envelope_outcome_total ' || true)
+  V7_FOUND="${V7_FOUND:-0}"
+  if [ "$V7_FOUND" -ge 2 ]; then
+    add_check "V7 — Bridge envelope outcome counter (PR #138)" "PASSED" "claude_code_bridge_envelope_outcome_total registered (HELP+TYPE lines present). After first bridge call, expect series with caller_category, structured_output, envelope_source, turn_outcome labels."
+  else
+    add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "claude_code_bridge_envelope_outcome_total NOT in /metrics. Either pre-PR-138 image OR sdkMetrics registration broken. Check container image includes commit ≥d7a96ad2 (PR #138 initial)."
+  fi
+fi
+
 # ── Container env audit ───────────────────────────────────────────────────────
 if [ "$HAS_GCLOUD" = "1" ]; then
   INSTANCE=$(gcloud compute instances list --filter="name~super-legal-staging AND status=RUNNING" \