From 73c7f27fed1e4dc99692fd3d0fa68a0cb1a3ae9a Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Sat, 16 May 2026 01:35:13 -0400
Subject: [PATCH] =?UTF-8?q?ops(skills):=20PR=20#140=20companion=20?=
 =?UTF-8?q?=E2=80=94=204=20skill=20doc=20enhancements=20for=20bridge=20obs?=
 =?UTF-8?q?ervability=20v2?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Companion commit to PR #140 (operator-runbook readiness). These 4 skill
docs live in /Users/ej/Super-Legal/.claude/skills/ which is outside the
xlsx-renderer worktree boundary, so they couldn't be bundled into the
same PR commit despite belonging to the same logical scope. Committed
here on a companion branch to land alongside PR #140.

CHANGES:

1. .claude/skills/post-deploy-verify/scripts/verify-tier2.sh
   New V7 check probing /metrics for
   claude_code_bridge_envelope_outcome_total counter registration
   (presence-only; PR #138 introduced the counter, so pre-PR-138 images
   will WARN). Pattern matches existing V6 check for
   claude_citation_verifier_* metrics.

2. .claude/skills/infrastructure-health/references/postgresql.md
   - access_log table row: notes new event_data JSONB column + GIN index
     (PR #138), with explanation of operational provenance use (e.g.,
     {execution_id, envelope_source, success} for code_execution_json
     forensic queries)
   - NEW code_executions table row: notes envelope_source VARCHAR(30)
     column + partial index idx_code_exec_envelope_source. Values
     enumerated; pre-PR-138 rows = NULL (forward-only persistence).

3. .claude/skills/client-audit-export/SKILL.md
   - code_executions table row: notes regulator handoff bundles now
     include envelope_source column via SELECT * — traceability for
     "which API path produced each artifact"
   - access_log table row: notes bundles now include event_data JSONB
     column via SELECT * — enables regulator queries like "show all
     accesses of fallback-extracted artifacts"

4. .claude/skills/client-offboarding/SKILL.md
   Step 6.5 paragraph: notes access_log Phase 2 archive now captures
   event_data JSONB inline as quoted JSON in CSV via SELECT * pattern
   (no procedural change, just documentation of automatic
   backward-compatible inclusion of the new column).

Background: 3 audit agents reviewed PR #138 post-merge for skill
alignment. Agent 2 (operator skills) identified these 4 enhancements as
⚠️ Minor (non-blocking, audit-trail context for operators/regulators).
Bundled here since they're operator-facing documentation that belongs
with the operator-runbook readiness PR.

Companion PR: #140 (feature/operator-runbook-soe-rollout)
Plan: /Users/ej/.claude/plans/twinkling-glittering-comet.md

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .claude/skills/client-audit-export/SKILL.md    |  4 ++--
 .claude/skills/client-offboarding/SKILL.md     |  2 +-
 .../references/postgresql.md                   |  3 ++-
 .../post-deploy-verify/scripts/verify-tier2.sh | 18 ++++++++++++++++++
 4 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/.claude/skills/client-audit-export/SKILL.md b/.claude/skills/client-audit-export/SKILL.md
index 9000f07d1..bc8540ac4 100644
--- a/.claude/skills/client-audit-export/SKILL.md
+++ b/.claude/skills/client-audit-export/SKILL.md
@@ -42,12 +42,12 @@ The skill reuses `_shared/gcp-fleet-discover.sh` for multi-client discovery when
 | Table | Purpose | Sensitive? |
 |-------|---------|-----------|
 | `sessions` | session lifecycle, retention class, legal hold | rows include `pseudonym_id` references but no raw PII |
-| `code_executions` | code-execution metadata + git_sha + model_id | safe |
+| `code_executions` | code-execution metadata + git_sha + model_id. **PR #138**: bundle now includes `envelope_source` column (extraction-path provenance: `parsed_output`\|`text`\|`stdout`\|`merged:*`\|`none`) via `SELECT *` — regulator can trace which API path produced each artifact. | safe |
 | `code_execution_inputs` | content-hash lineage of inputs (Wave 2) | safe (hashes only) |
 | `hook_audit_log` | per-tool audit (incl. `event_data` JSONB with bridge metadata + EU AI Act Art. 12 A3 query reconstruction via `event_data.exa_a3` — see PR #114) | safe — strip raw user input |
 | `citations` | citation chain (URL, source_type) | safe |
 | `human_interventions` | retention/erasure/legal-hold actions (Art. 14) | safe |
-| `access_log` | RBAC audit trail (actor_user_id, endpoint, status) | safe |
+| `access_log` | RBAC audit trail (actor_user_id, endpoint, status). **PR #138**: bundle now includes `event_data JSONB` column via `SELECT *` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` for code-execution artifact accesses). Enables regulator queries like "show all accesses of fallback-extracted artifacts". | safe |
 | `pii_mappings` | pseudonym_id ↔ encrypted_value mapping | **DANGEROUS** — export `pseudonym_id` + `created_at` ONLY, NEVER `encrypted_value` |
 | `source_writes` | upstream API source provenance (Wave 2) | safe |
 | `citation_verdicts` | per-footnote G5 verification verdicts (v6.8.6 T1) — CONFIRMED/UNCONFIRMED/ERROR/SKIP/PASS_WITH_NOTE + verification method + paywalled flag + notes | safe |
diff --git a/.claude/skills/client-offboarding/SKILL.md b/.claude/skills/client-offboarding/SKILL.md
index 6a30cd1d7..f248fb556 100644
--- a/.claude/skills/client-offboarding/SKILL.md
+++ b/.claude/skills/client-offboarding/SKILL.md
@@ -55,7 +55,7 @@ bash /Users/ej/Super-Legal/.claude/skills/client-offboarding/scripts/offboard-cl
 
 **Step 7**: Verify archives — checks that archive files exist in GCS and have non-zero size. Reports checksums.
 
-**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
+**Step 6.5**: Archive compliance audit tables as dedicated CSV artifacts. **Wave 3 tables**: `access_log` (EU AI Act Article 12 read-side evidence — **PR #138**: now includes `event_data JSONB` column carrying operational provenance per row, e.g., `{"execution_id","envelope_source","success"}` for code-execution accesses; archived via `SELECT *` so JSONB serializes inline to CSV as quoted JSON string) and `human_interventions` (EU AI Act Article 14 operator governance evidence). **v7.0.0 tables** (added in this scope):
 - `transcript_events` — full SSE event history per session (~700KB-1MB per session × N sessions; may be largest archive file). Required for byte-faithful session-reload audit if regulator queries any session history.
 - `citation_source_links` — citation→raw-source bridge with confidence scores. Required for hallucination audit (any citation with `confidence < 0.85` flagged for QA review at session-time should be reproducible from this archive).
 - `code_execution_inputs` — data lineage junction linking code executions to upstream subagent reports/embeddings/KG nodes. Required for EU AI Act Art. 15 reproducibility chain ("which subagent's output drove this DCF result?").
diff --git a/.claude/skills/infrastructure-health/references/postgresql.md b/.claude/skills/infrastructure-health/references/postgresql.md
index b069f853c..5b249806d 100644
--- a/.claude/skills/infrastructure-health/references/postgresql.md
+++ b/.claude/skills/infrastructure-health/references/postgresql.md
@@ -18,10 +18,11 @@
 | code_execution_inputs | **v7.0.0** — data lineage junction linking each code execution to upstream subagent reports/embeddings/KG nodes | 1-5 rows per code execution |
 | transcript_events | **v7.0.0** — full-fidelity SSE event capture (`migrations/012_transcript-events.up.sql`); buffered batch insert | ~4,000-6,000 rows per 30-50 min session; **~700KB-1MB storage per session** |
 | citation_source_links | **v7.0.0** — citation→source bridge with fuzzy matching (URL exact / URL fuzzy / title fuzzy / embedding cosine) + confidence score | 1 row per memo footnote matched |
+| code_executions | bridge execution audit. **PR #138**: `envelope_source VARCHAR(30)` column + partial index `idx_code_exec_envelope_source` (excludes NULLs) — persists extraction-path decision from `selectEnvelopeWithFallback`. Values: `parsed_output`\|`text`\|`stdout`\|`none`\|`merged:parsed_output+stdout`\|`merged:text+stdout`. Pre-PR-138 rows = NULL (forward-only persistence). | 1 row per `runPythonAnalysis()` call |
 | report_embeddings | pgvector embeddings | ~50-100 chunks per report |
 | agent_states | Agent lifecycle | ~40 rows per session |
 | source_writes | Wave 3 WAL — raw source persistence reconciliation | 1 row per raw source capture; hourly reconciler |
-| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged |
+| access_log | Wave 3 — EU AI Act Art. 12 read-side audit | 1 row per `/api/sessions/:id/*` read (fire-and-forget); v7.0.0 audit-export reads also logged. **PR #138**: `event_data JSONB NOT NULL DEFAULT '{}'::jsonb` column + GIN index `idx_access_log_event_data_gin` — carries operational provenance (e.g., `{"execution_id","envelope_source","success"}` from `adminRouter.js` code_execution_json endpoint) for forensic JSONB containment queries. |
 | human_interventions | Wave 3 — EU AI Act Art. 14 operator governance audit | 0-5 rows per session (admin actions only) |
 | pii_mappings | Wave 3 — GDPR Art. 17 pseudonymization backing store | 0-N per session when PII detected |
 
diff --git a/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh b/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
index ffa9ea938..b909e48bc 100755
--- a/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
+++ b/.claude/skills/post-deploy-verify/scripts/verify-tier2.sh
@@ -106,6 +106,24 @@ else
   fi
 fi
 
+# ── V7 — PR #138 envelope-outcome counter (presence check) ───────────────────
+# Probes /metrics for claude_code_bridge_envelope_outcome_total registration.
+# Counter emits from inside runPythonAnalysis on every bridge call regardless of
+# caller_category. Presence ≠ correctness — see envelope-decision-debug-playbook.md
+# for value-based alerts. This check just confirms the counter is registered;
+# pre-PR-138 images won't have it.
+if [ -z "$T2_METRICS_BODY" ]; then
+  add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "$BASE_URL/metrics unreachable; cannot verify claude_code_bridge_envelope_outcome_total registration"
+else
+  V7_FOUND=$(echo "$T2_METRICS_BODY" | grep -c -E '^(# HELP|# TYPE) claude_code_bridge_envelope_outcome_total ' || true)
+  V7_FOUND="${V7_FOUND:-0}"
+  if [ "$V7_FOUND" -ge 2 ]; then
+    add_check "V7 — Bridge envelope outcome counter (PR #138)" "PASSED" "claude_code_bridge_envelope_outcome_total registered (HELP+TYPE lines present). After first bridge call, expect series with caller_category, structured_output, envelope_source, turn_outcome labels."
+  else
+    add_check "V7 — Bridge envelope outcome counter (PR #138)" "WARNING" "claude_code_bridge_envelope_outcome_total NOT in /metrics. Either pre-PR-138 image OR sdkMetrics registration broken. Check container image includes commit ≥d7a96ad2 (PR #138 initial)."
+  fi
+fi
+
 # ── Container env audit ───────────────────────────────────────────────────────
 if [ "$HAS_GCLOUD" = "1" ]; then
   INSTANCE=$(gcloud compute instances list --filter="name~super-legal-staging AND status=RUNNING" \