Merged
4 changes: 2 additions & 2 deletions .claude/skills/client-audit-export/SKILL.md
@@ -42,12 +42,12 @@ The skill reuses `_shared/gcp-fleet-discover.sh` for multi-client discovery when
| Table | Purpose | Sensitive? |
|-------|---------|-----------|
| `sessions` | session lifecycle, retention class, legal hold | rows include `pseudonym_id` references but no raw PII |
| `code_executions` | code-execution metadata + git_sha + model_id | safe |
| `code_executions` | code-execution metadata + git_sha + model_id. **PR #138**: bundle now includes `envelope_source` column (extraction-path provenance: `parsed_output`\|`text`\|`stdout`\|`merged:*`\|`none`) via `SELECT *` — regulator can trace which API path produced each artifact. | safe |
| `code_execution_inputs` | content-hash lineage of inputs (Wave 2) | safe (hashes only) |
| `hook_audit_log` | per-tool audit (incl. `event_data` JSONB with bridge metadata + EU AI Act Art. 12 A3 query reconstruction via `event_data.exa_a3` — see PR #114) | safe — strip raw user input |
| `citations` | citation chain (URL, source_type) | safe |
| `human_interventions` | retention/erasure/legal-hold actions (Art. 14) | safe |
| `access_log` | RBAC audit trail (actor_user_id, endpoint, status) | safe |
| `access_log` | RBAC audit trail (actor_user_id, endpoint, status). **PR #138**: bundle now includes `event_data JSONB` column via `SELECT *` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` for code-execution artifact accesses). Enables regulator queries like "show all accesses of fallback-extracted artifacts". | safe |
| `pii_mappings` | pseudonym_id ↔ encrypted_value mapping | **DANGEROUS** — export `pseudonym_id` + `created_at` ONLY, NEVER `encrypted_value` |
| `source_writes` | upstream API source provenance (Wave 2) | safe |
| `citation_verdicts` | per-footnote G5 verification verdicts (v6.8.6 T1) — CONFIRMED/UNCONFIRMED/ERROR/SKIP/PASS_WITH_NOTE + verification method + paywalled flag + notes | safe |
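
The `pii_mappings` rule in the table above (export `pseudonym_id` and `created_at` only, never `encrypted_value`) lends itself to a mechanical guard on the exported CSV header. The following is a minimal sketch, not the skill's actual implementation: the function name `header_is_safe` and the sample headers are hypothetical, and it assumes the export writes a plain comma-separated header row.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds only when a CSV header row does not
# contain the forbidden encrypted_value column.
header_is_safe() {
  local header="$1"
  # Wrap in commas so the match is on a whole column name, not a substring.
  case ",$header," in
    *,encrypted_value,*) return 1 ;;
  esac
  return 0
}

# Illustrative header rows (assumed, not taken from a real export).
for header in "pseudonym_id,created_at" "pseudonym_id,encrypted_value,created_at"; do
  if header_is_safe "$header"; then
    echo "safe: $header"
  else
    echo "UNSAFE: $header"
  fi
done
```

A check like this could run after the bundle is written and before it is handed over, so that an accidental `SELECT *` on `pii_mappings` fails loudly.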
2 changes: 1 addition & 1 deletion .claude/skills/client-offboarding/SKILL.md
@@ -55,7 +55,7 @@ bash /Users/ej/Super-Legal/.claude/skills/client-offboarding/scripts/offboard-cl

**Step 7**: Verify archives — checks that archive files exist in GCS and have non-zero size. Reports checksums.

**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence — **PR #138**: now includes `event_data JSONB` column carrying operational provenance per row, e.g., `{"execution_id","envelope_source","success"}` for code-execution accesses; archived via `SELECT *` so JSONB serializes inline to CSV as quoted JSON string) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
- `transcript_events` — full SSE event history per session (~700KB-1MB per session × N sessions; may be largest archive file). Required for byte-faithful session-reload audit if regulator queries any session history.
- `citation_source_links` — citation→raw-source bridge with confidence scores. Required for hallucination audit (any citation with `confidence < 0.85` flagged for QA review at session-time should be reproducible from this archive).
- `code_execution_inputs` — data lineage junction linking code executions to upstream subagent reports/embeddings/KG nodes. Required for EU AI Act Art. 15 reproducibility chain ("which subagent's output drove this DCF result?").
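
Step 6.5 notes that `event_data` JSONB serializes inline to CSV as a quoted JSON string. Under the common CSV convention (RFC 4180, also the default for PostgreSQL CSV output), that means the field is wrapped in double quotes and every embedded double quote is doubled. The sketch below illustrates the resulting shape; `csv_quote` is a hypothetical helper, the sample JSON values are invented, and whether the offboarding script produces exactly this quoting is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: quote one field the way RFC 4180-style CSV does.
csv_quote() {
  local s="$1"
  # Double every embedded double quote, then wrap the field in quotes.
  s=${s//\"/\"\"}
  printf '"%s"\n' "$s"
}

# Illustrative event_data payload (values are made up).
csv_quote '{"execution_id":"abc","envelope_source":"stdout","success":true}'
```

A consumer reading the archive needs a CSV parser that reverses this quoting before handing the field to a JSON parser; splitting on commas alone would cut the JSON apart.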
@@ -18,10 +18,11 @@
| code_execution_inputs | **v7.0.0** — data lineage junction linking each code execution to upstream subagent reports/embeddings/KG nodes | 1-5 rows per code execution |
| transcript_events | **v7.0.0** — full-fidelity SSE event capture (`migrations/012_transcript-events.up.sql`); buffered batch insert | ~4,000-6,000 rows per 30-50 min session; **~700KB-1MB storage per session** |
| citation_source_links | **v7.0.0** — citation→source bridge with fuzzy matching (URL exact / URL fuzzy / title fuzzy / embedding cosine) + confidence score | 1 row per memo footnote matched |
| code_executions | bridge execution audit. **PR #138**: `envelope_source VARCHAR(30)` column + partial index `idx_code_exec_envelope_source` (excludes NULLs) — persists extraction-path decision from `selectEnvelopeWithFallback`. Values: `parsed_output`\|`text`\|`stdout`\|`none`\|`merged:parsed_output+stdout`\|`merged:text+stdout`. Pre-PR-138 rows = NULL (forward-only persistence). | 1 row per `runPythonAnalysis()` call |
| report_embeddings | pgvector embeddings | ~50-100 chunks per report |
| agent_states | Agent lifecycle | ~40 rows per session |
| source_writes | Wave 3 WAL — raw source persistence reconciliation | 1 row per raw source capture; hourly reconciler |
| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged |
| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged. **PR #138**: `event_data JSONB NOT NULL DEFAULT '{}'::jsonb` column + GIN index `idx_access_log_event_data_gin` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` from `adminRouter.js` code_execution_json endpoint) for forensic JSONB containment queries. |
| human_interventions | Wave 3 — EU AI Act Art. 14 operator governance audit | 0-5 rows per session (admin actions only) |
| pii_mappings | Wave 3 — GDPR Art. 17 pseudonymization backing store | 0-N per session when PII detected |
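
The `access_log` row above mentions forensic JSONB containment queries backed by a GIN index. In PostgreSQL, containment is expressed with the `@>` operator, which a default `jsonb` GIN index can serve. The sketch below only builds the query text; `build_access_log_query` is a hypothetical helper, and it uses `SELECT *` because the full column list of `access_log` is not shown here. Which `envelope_source` values count as "fallback" is left to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSONB containment query for one envelope_source value.
build_access_log_query() {
  local source_value="$1"
  # Accept only characters that occur in the documented envelope_source values
  # (e.g. parsed_output, stdout, merged:text+stdout) to avoid SQL injection.
  case "$source_value" in
    ""|*[!a-z_:+]*)
      echo "invalid envelope_source: $source_value" >&2
      return 1
      ;;
  esac
  printf "SELECT * FROM access_log WHERE event_data @> '{\"envelope_source\": \"%s\"}'::jsonb;\n" "$source_value"
}

build_access_log_query "stdout"
```

The output can be piped to `psql` against a connection string of your choice, for example `build_access_log_query stdout | psql "$DATABASE_URL"`, where `DATABASE_URL` is an assumed environment variable.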

18 changes: 18 additions & 0 deletions .claude/skills/post-deploy-verify/scripts/verify-tier2.sh
@@ -106,6 +106,24 @@ else
fi
fi

# ── V7 — PR #138 envelope-outcome counter (presence check) ───────────────────
# Probes /metrics for claude_code_bridge_envelope_outcome_total registration.
# Counter emits from inside runPythonAnalysis on every bridge call regardless of
# caller_category. Presence ≠ correctness — see envelope-decision-debug-playbook.md
# for value-based alerts. This check just confirms the counter is registered;
# pre-PR-138 images won't have it.
if [ -z "$T2_METRICS_BODY" ]; then
add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "$BASE_URL/metrics unreachable; cannot verify claude_code_bridge_envelope_outcome_total registration"
else
V7_FOUND=$(echo "$T2_METRICS_BODY" | grep -c -E '^(# HELP|# TYPE) claude_code_bridge_envelope_outcome_total ' || true)
V7_FOUND="${V7_FOUND:-0}"
if [ "$V7_FOUND" -ge 2 ]; then
add_check "V7 — Bridge envelope outcome counter (PR #138)" "PASSED" "claude_code_bridge_envelope_outcome_total registered (HELP+TYPE lines present). After first bridge call, expect series with caller_category, structured_output, envelope_source, turn_outcome labels."
else
add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "claude_code_bridge_envelope_outcome_total NOT in /metrics. Either pre-PR-138 image OR sdkMetrics registration broken. Check container image includes commit ≥d7a96ad2 (PR #138 initial)."
fi
fi
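
The V7 presence check above can be exercised offline against a canned metrics body. The following is a minimal sketch under stated assumptions: `metric_registered` is a hypothetical wrapper, and the HELP text in the sample body is invented. Only the metric name and the grep pattern are taken from the script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the same grep pattern used by the V7 check:
# succeeds when both the HELP and TYPE lines for the metric are present.
metric_registered() {
  local body="$1" metric="$2" found
  # grep -c prints 0 and exits non-zero on no match; `|| true` keeps the count.
  found=$(printf '%s\n' "$body" | grep -c -E "^(# HELP|# TYPE) ${metric} " || true)
  [ "${found:-0}" -ge 2 ]
}

# Invented sample of Prometheus text exposition output.
sample_body='# HELP claude_code_bridge_envelope_outcome_total Bridge envelope outcomes
# TYPE claude_code_bridge_envelope_outcome_total counter'

for body in "$sample_body" ""; do
  if metric_registered "$body" "claude_code_bridge_envelope_outcome_total"; then
    echo "registered"
  else
    echo "missing"
  fi
done
```

As the script's own comment says, presence is not correctness: this confirms registration only, not that any series carries sensible label values.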

# ── Container env audit ───────────────────────────────────────────────────────
if [ "$HAS_GCLOUD" = "1" ]; then
INSTANCE=$(gcloud compute instances list --filter="name~super-legal-staging AND status=RUNNING" \