From e67288a33c23b9873bd01efff979a38304eafaf8 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 6 May 2026 16:32:13 -0400
Subject: [PATCH] =?UTF-8?q?docs(changelog):=20v7.0.0=20Compliance=20Era=20?=
 =?UTF-8?q?=E2=80=94=20major=20release=20consolidating=20v6.8.x?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replaces the mislabeled "[6.8.0] - 2026-05-02 — equity-analyst
integration (FMP)" entry with a comprehensive v7.0.0 release entry that
bundles all work shipped between v6.7.0 and v6.8.5.1 into a single
major-release narrative aligned with the platform's transition from
internal AI memo tool to regulator-ready compliance platform.

Why v7.0.0:
- The cumulative Wave 5 + reconciliation + transcript persistence + FMP
  equity stack represents a real capability epoch
- Marketing/positioning value of a major release banner aligned with
  EU AI Act Art. 12-15 compliance narrative
- Buyer-facing artifact: enterprise prospects (PE/IB/M&A) read major
  versions; "v7.0.0 Compliance Era" is more compelling than "v6.8.5"
- The deprecated `claude_tool_invocations_total` counter removal in the
  7-day dual-emission window IS a breaking change for Grafana
  dashboards — semver-justified major bump

The v7.0.0 entry covers, in 11 sections:
1. APIs & Endpoints (5 NEW): audit-export, transcript replay, admin
   rebuild-kg/rebuild-artifacts, 36 FMP equity tools (gated)
2. Code Execution Models: 11 new equity-analysis (M46-M55, M58),
   45→56 catalog, 13th category (Earnings Quality folded into M&A)
3. Traceability & Reproducibility: 13 new code_executions columns,
   code_execution_inputs lineage junction, bridge_metadata JSONB,
   COMMIT_SHA build arg, byte-replay envelope (EU AI Act Art. 15)
4. Observability: 5 new Prometheus metrics (hook_persistence_failures,
   hook_circuit_breaker_state, code_execution_failures,
   hook_invocations, tool_invocations_v2), OTel
   code_execution.lifecycle root span, sampler config, 9 new alert
   rules, helper functions
5. Provenance & Citation Bridge: citation_source_links table with fuzzy
   URL/title/embedding cosine matching, citationParser.js, confidence
   scores
6. Knowledge Graph & Embeddings: auto-reconciliation loop, 10 new
   sessions columns (kg_status, embedding_status, etc.), 3-cap
   concurrency, BACKFILL semantics, 4 reconciliation metrics
7. Transcript Persistence: transcript_events table with cascade DELETE
   FK, buffered batch insert, buffer-until-resolved FK timing, frontend
   replay mode with delta batching
8. Subagent & Capability Tracking: equity-analyst standalone subagent
   (45th), 7 CAPABILITY constants, 5 zod tool envelope schemas (drift
   canary), observability registration across 9 surfaces
9. Regulatory & Compliance: full EU AI Act Art. 12/13/14/15 + GDPR
   Art. 17/20/32 + SEC 17a-4 mapping with implementation refs
10. Documentation: 5 new docs files (metrics-catalog, api-reference,
    runbooks, testing-integration-tests, equity-analyst notes), plus
    refreshed system-design + GTM docs
11. CI / Build Infrastructure: integration-tests workflow, Dockerfile
    COMMIT_SHA arg, 3 migration files, prometheus/alerts.yml

Plus comprehensive Changed, Fixed, Migration & Schema, Architectural
Invariants Preserved, Deferred to v7.0.x, and Verification sections.

The original "[6.8.0] - 2026-05-02 — equity-analyst (FMP)" entry is
replaced with a "[6.8.0-fmp-equity] SUPERSEDED" placeholder pointing to
v7.0.0. Underlying merge commits (a94b7115..0c5a3500 for FMP,
6fd7832a/6eb60f71/40d4a01b/aed2f4f5 for v6.8.x stack) remain
individually addressable in git log.

The legitimate "[6.8.0] - 2026-04-27 — transcript persistence" entry
(line 465, PR #88) is preserved unchanged for historical detail.
Pre-existing chronological-vs-version ordering inconsistency NOT fixed
in this commit (v6.8.0 transcript persistence remains below
v6.7.1-v6.7.3 hotfixes per existing file convention). Reorder is a
separate concern if the team wants to migrate to strict
version-descending order at the same time as Keep a Changelog format.

No code changes. No DDL impact. Documentation only.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 super-legal-mcp-refactored/CHANGELOG.md | 246 +++++++++++++++++++++++-
 1 file changed, 243 insertions(+), 3 deletions(-)

diff --git a/super-legal-mcp-refactored/CHANGELOG.md b/super-legal-mcp-refactored/CHANGELOG.md
index a8bc2b508..2fd046ef6 100644
--- a/super-legal-mcp-refactored/CHANGELOG.md
+++ b/super-legal-mcp-refactored/CHANGELOG.md
@@ -2,11 +2,251 @@
 
 All notable changes to the Super Legal MCP Server are documented in this file.
 
-## [6.8.0] - 2026-05-02 — equity-analyst integration (FMP)
+## [7.0.0] - 2026-05-06 — Compliance Era
 
-### Added
+Major release. Marks the platform's transition from internal AI memo tool to **regulator-ready compliance platform with byte-replayable code execution and EU AI Act Art. 12-15 mapping**. Bundles all work shipped between v6.7.0 and v6.8.5.1: auto-reconciliation loop, full-fidelity transcript persistence, Wave 5 compliance machinery, regulator-facing audit-export endpoint, code-execution traceability, persistence observability, FMP equity-analyst integration, and the deferred-docs patch.
 
-New `equity-analyst` subagent + 36-tool FMP client + 11 code-execution models, gated behind `FMP_ENABLED=false` feature flag (production activation requires FMP Enterprise contract + Data Display & Redistribution Agreement, expected 4–8 weeks parallel commercial track).
+This entry consolidates what would have been incremental sub-versions (v6.8.0 transcript persistence, v6.8.1–v6.8.5 Wave 5 stack, v6.8.5.1 docs) into a single major-release narrative aligned with the Compliance Era positioning. Underlying merge commits remain individually addressable in `git log`; their SHAs are referenced inline below.
+
+---
+
+### Added — APIs & Endpoints
+
+- `GET /api/session/:sessionKey/audit-report` (NEW, W5.10) — EU AI Act Art. 13 transparency export. Aggregates per-session: `code_executions` with all 13 traceability fields, `code_execution_inputs` lineage counts, `hook_audit_log` event sequence with `bridge_metadata`, `human_interventions`, `access_log`, `citation_source_links`. JSON default; `?format=csv` returns CSV.gz. Auth: `cookieAuthMiddleware` server-wide + `createAccessAuditMiddleware('session_data')` router-wide. Commit `aed2f4f5`.
+- `GET /api/db/sessions/:sessionKey/transcript` (NEW) — full-fidelity SSE event replay for session reload. Returns `{status, events[], total}`; pre-v7.0.0 sessions return `status='no_transcript'` with banner fallback. Auth + `transcriptAccessAudit`. PR #88.
+- `POST /api/admin/sessions/:sessionKey/rebuild-kg` (NEW) — admin-triggered immediate KG rebuild outside the 5-minute reconciliation cycle. Sets `kg_status='building'`, dispatches via `setImmediate`. Surfaces in admin frontend as "Rebuild KG" button on each session row.
+- `POST /api/admin/sessions/:sessionKey/rebuild-artifacts` (NEW) — admin-triggered immediate DOCX/PDF artifact rebuild.
+- 36 FMP equity tools (NEW, gated `FMP_ENABLED=false`) — peer cohorts, multiple decomposition, premium/discount bridges, EPS surprise, reverse DCF, earnings call transcripts, analyst rating distributions, institutional 13F holdings, batch quotes.
+
+### Added — Code Execution Models
+
+- 11 new equity-analysis models (M46–M55 + M58) — Valuation cohort × 8: peer cohort identification, multiple decomposition regression, premium/discount bridge, quality-adjusted multiples, EPS surprise analysis, reverse DCF, earnings-call sentiment. M&A Analysis × 3: deal-comparable selection, control-premium estimation, PE/strategic buyer pricing differential. Live test runner: 11/11 PASS (32 charts, ~$3 API cost, ~22 min runtime).
+- New "Earnings Quality" category (13th) — folded M58 into M&A Analysis post-review.
+- Total catalog: 45 → 56 models across 13 categories.
+- 11 input examples with realistic mock data (real tickers AAPL/MSFT/GOOGL/JPM/XLK/SPY).
+
+### Added — Traceability & Reproducibility
+
+**13 new columns on `code_executions`** (`migrations/013_code-executions-model-id.up.sql`, runtime DDL in `src/db/postgres.js`):
+
+| Column | Purpose |
+|---|---|
+| `model_id` | Exact `claude-sonnet-4-6-...` revision |
+| `llm_name` | Provider identity (`anthropic`) |
+| `anthropic_request_id` | Server-side correlation ID for replay |
+| `input_tokens`, `output_tokens` | Per-execution token counts |
+| `cache_read_tokens`, `cache_creation_tokens` | Prompt-caching metrics |
+| `system_prompt_hash` | SHA-256 of system prompt (drift detection) |
+| `python_code` | Full generated Python source |
+| `python_code_hash` | SHA-256 for deduplication |
+| `container_id` | Anthropic code-execution sandbox identifier |
+| `tool_use_id` | Exact correlation to PostToolUse hook |
+| `stop_reason` | `end_turn` \| `pause_turn` \| `refusal` \| `max_tokens` |
+| `refusal_detected` | Boolean — Anthropic-side refusal trapped in bridge |
+
+- **`code_execution_inputs` lineage junction** (NEW table) — links each execution to upstream subagent reports/embeddings/KG nodes. Enables data-lineage queries: "which subagent's output drove this DCF result?"
+- **`bridge_metadata` JSONB column on `hook_audit_log`** — `{git_sha, sdk_version, container_id, system_prompt_hash}` captured at PostToolUse time. Combined with `python_code` from `code_executions`, gives a complete reproducibility envelope.
+- **`COMMIT_SHA` build arg** (`Dockerfile`, `src/utils/buildVersion.js`) — `deploy.sh` and `client-provisioner` skill invoke `docker build --build-arg COMMIT_SHA=$(git rev-parse HEAD)` automatically. Without it, `bridge_metadata.git_sha='unknown'`.
+- **Replay envelope**: every `run_python_analysis` is byte-faithfully reproducible from `system_prompt_hash + python_code + git_sha + sdk_version + container_id + model_id + anthropic_request_id`. EU AI Act Article 15 satisfied.
+
+Commits: `40d4a01b`, `9227bd55` (v6.8.4 + post-audit P1–P4 polish).
+
+### Added — Observability
+
+**5 new Prometheus metrics** (`src/utils/sdkMetrics.js`):
+
+- `claude_hook_persistence_failures_total{hook, reason}` — bounded enum: `unique_violation`, `fk_violation`, `not_null_violation`, `connection_refused`, `connection_timeout`, `dns_failed`, `pool_error`, `envelope_non_json`, `envelope_shape_drift`, `other_db`, `unknown`. Cap: ~70 series.
+- `claude_hook_circuit_breaker_state{hook}` gauge — `0=closed, 1=half-open, 2=open`.
+- `claude_code_execution_failures_total{reason}` — bounded 5-value enum: `refusal_detected`, `timeout`, `api_error`, `container_error`, `envelope_parse_error`. W5.7.
+- `claude_hook_invocations_total{hook}` — per-hook success counter (closes "only failure metrics" gap). 9 series. W5.8.
+- `claude_tool_invocations_v2_total{tool_name, status}` — replaces deprecated `claude_tool_invocations_total{tool}` with bounded `KNOWN_TOOL_NAMES` enum. W5.6.
+
+**OTel spans**:
+- `code_execution.lifecycle` (NEW, W5.1) — root span wrapping multi-turn execution including pause-turn continuations. Attributes: all 13 traceability fields + turn count + pause count + chart count + refusal_detected. Commit `3ccd0aa6`.
+- Sampler config (W5.1): `OTEL_TRACES_SAMPLER=parentbased_traceidratio` + `OTEL_TRACES_SAMPLER_ARG=0.1` (10% default). Tunable via `flags.env` → `deploy.sh` → container `--container-env`. Commit `8b983105`.
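The bounded `reason` label on `claude_hook_persistence_failures_total` implies a classifier that folds raw errors into a fixed set of values. A minimal sketch of that idea follows. It is not the project's `classifyPersistenceFailure(err)`, whose body this changelog does not show: the function name `classifyFailureSketch`, the lookup tables, and the assumption that the database driver exposes the PostgreSQL SQLSTATE on `err.code` (as node-postgres does) are all illustrative. The `pool_error` and `envelope_*` reasons are left out because the changelog does not say how they are detected.

```javascript
// Hypothetical sketch only: the real classifyPersistenceFailure() in
// src/utils/sdkMetrics.js is not shown in this changelog and may differ.

// PostgreSQL SQLSTATE codes for the three constraint violations.
const PG_REASONS = new Map([
  ['23505', 'unique_violation'],
  ['23503', 'fk_violation'],
  ['23502', 'not_null_violation'],
]);

// Node.js system error codes for connection-level failures.
const NET_REASONS = new Map([
  ['ECONNREFUSED', 'connection_refused'],
  ['ETIMEDOUT', 'connection_timeout'],
  ['ENOTFOUND', 'dns_failed'],
]);

function classifyFailureSketch(err) {
  // Assumption: the driver puts the SQLSTATE or system code on err.code.
  const code = err && typeof err.code === 'string' ? err.code : null;
  if (code === null) return 'unknown';
  if (PG_REASONS.has(code)) return PG_REASONS.get(code);
  if (NET_REASONS.has(code)) return NET_REASONS.get(code);
  // Any other five-character SQLSTATE containing a digit is treated as a
  // generic database error, so the label set stays bounded.
  if (/^(?=.*\d)[0-9A-Z]{5}$/.test(code)) return 'other_db';
  return 'unknown';
}

console.log(classifyFailureSketch({ code: '23505' }));
console.log(classifyFailureSketch({ code: 'ENOTFOUND' }));
console.log(classifyFailureSketch(new Error('no code')));
```

Mapping every unrecognized error to `other_db` or `unknown`, rather than passing the raw code through as a label, is what keeps the series count capped.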
+
+**9 new Prometheus alert rules** (`prometheus/alerts.yml`):
+- `HookPersistenceFailures` (warning, 5m), `HookCircuitBreakerOpen` (critical, 2m), `HookEnvelopeShapeDrift` (critical, 1m TTL — silent data loss vector)
+- `ReconciliationKgBacklog` (warning, >50 sessions, 10m), `ReconciliationKgCritical` (critical, >100, 5m)
+- `ReconciliationArtifactsBacklog` (warning, >50, 10m), `ReconciliationScanSlow` (warning, P95 >15min, 15m), `ReconciliationScanErrors` (warning, >1/hr, 30m)
+
+**Helpers**: `classifyPersistenceFailure(err)`, `recordPersistenceFailure(hook, reason)`, `setCircuitBreakerState(hook, state)`.
+
+Commits: `6fd7832a` (v6.8.1 metrics), `66f4ad6b` (v6.8.1 polish), `caa2e50e` (W5 metrics hardening).
+
+### Added — Provenance & Citation Bridge
+
+- **`citation_source_links` table** (NEW) — closes the loop from memo footnote → archived raw source. Schema: `report_id`, `citation_marker`, `source_hash`, `match_method` (`url_exact` \| `url_fuzzy` \| `title_fuzzy` \| `embedding_cosine`), `confidence_score` (0.00–1.00), `matched_at`. Cascade DELETE via FK on report removal.
+- **`src/utils/citationParser.js`** (NEW) — extracts citation markers from memo content, fuzzy-matches against raw source archive, emits confidence scores. 1.00 = URL exact match; <0.85 = fuzzy match flagged for QA review.
+- **Wave 3 provenance bridge integration** (preserved) — citation→source matches feed back into `kg_provenance.source_hash` for closing the conclusion → provenance → raw evidence loop.
+
+Commit: `aed2f4f5` (W5.4 + W5.10).
+
+### Added — Knowledge Graph & Embeddings
+
+**Auto-reconciliation loop** (`src/observability/reconciliationDaemon.js`):
+- 10 new columns on `sessions` (`ALTER TABLE ADD COLUMN IF NOT EXISTS` per dual-path convention): `kg_status`, `kg_node_count`, `kg_edge_count`, `kg_built_at`, `kg_error`, `embedding_status`, `embedding_count`, `embedding_built_at`, `embedding_error`, `reconciliation_attempts`.
+- 5-minute scan cycle dispatches `kgBuilder.buildFromReports()` and `embeddingService.persistReportEmbeddings()` via `setImmediate` for sessions with `kg_status IS NULL` OR `kg_status='failed' AND reconciliation_attempts < 3`.
+- 3 simultaneous-rebuild concurrency cap — bounded so reconciliation cannot saturate the pool while live streaming is active.
+- After 3 failed attempts, sessions are parked (manual intervention via admin endpoint).
+
+**Backfill semantics**: `BACKFILL_SESSION_STATUS_SEMANTICS_DDL` migrates sessions with non-null `final_score` to `status='complete'` so they become reconciliation candidates. `IF EXISTS` gate added v6.8.2 — fixes fresh-DB throw on new client provisioning.
+
+**Reconciliation metrics**: `claude_reconciliation_scans_total{status}`, `claude_reconciliation_rebuilds_total{type, status}`, `claude_reconciliation_scan_duration_ms` histogram, `claude_reconciliation_pending_sessions{type}` gauge.
+
+Commits: v6.7.0 series (PR #87), `6eb60f71` (v6.8.2 schema split).
+
+### Added — Transcript Persistence
+
+- **`transcript_events` table** (NEW, `migrations/012_transcript-events.up.sql`) — captures every SSE event flowing through `ctx.send()` for byte-faithful session reload. Schema: `id BIGSERIAL`, `session_id UUID FK CASCADE` (GDPR Art. 17), `session_key`, `sequence_number`, `event_type`, `event_data JSONB`, `created_at`. Single replay-path index `idx_transcript_session_seq(session_id, sequence_number ASC)`.
+- **Buffered batch insert** — 50-event flush size / 2s flush interval / final flush in `ctx.end()`. Collapses ~5,000 individual writes into ~100 batched inserts per session. Pool max bumped 10 → 15 (~33% burst margin under simultaneous live + 3-rebuild reconciliation + flush).
+- **Buffer-until-resolved FK timing** — early SSE events (`system_init`, `prompt_enhancement_status`) fire 1–5ms before sessions row INSERT (~1–3s in). `_flushTranscriptBuffer()` retries until `dbSessionId` resolves, then back-fills null sessionIds before INSERT. Zero events lost.
+- **Frontend replay mode** (`test/react-frontend/app.js`) — `replayTranscript(sessionKey)` with delta batching. Consecutive `delta` events for the same bubble batched into a single render — replay time <100ms for ~5,000 events. `replayMode=true` suppresses auto-scroll during reload.
+- **Fallback**: pre-v7.0.0 sessions return `status='no_transcript'`. Frontend shows banner + falls back to existing `reconstructPhaseProgress()` + `reconstructTimeline()`.
+- **Storage**: ~700KB-1MB per session × 10K sessions ≈ 7-10 GB; ~$1-2/month at GCP storage pricing.
+
+Flag: `TRANSCRIPT_DB_PERSISTENCE` (default ON post-v7.0.0; was default OFF during v6.8.0 phased rollout). PR #88, commits `73f66d3b`..`15809c35`.
+
+### Added — Subagent & Capability Tracking
+
+- **`equity-analyst` standalone subagent** (45th) — full peer parity (`## Your Expertise / ## Research Methodology / ## Output Format` prompt structure, 14-keyword `MUST BE USED` line). Domain assignment: `['equities', 'sec', 'fred', 'code-execution', ...]` when `FMP_ENABLED=true`. Catalog stage: `research_support` (parallel to `financial-analyst`, `data-analyst`).
+- **7 subagent CAPABILITY constants** — each declares tool surface, output schema, and `data_provenance` claim. Surfaces in `code_execution_inputs` for full lineage queries.
+- **5 zod tool envelope schemas** (`src/schemas/toolEnvelopes.js`) — strict schemas for `run_python_analysis`, `generate_chart`, `search_sec_filings`, `get_court_opinions`, `analyze_patent`. Each tool's input validated before persistence; mismatches emit `claude_hook_persistence_failures_total{reason='envelope_shape_drift'}`. Drift canary for Anthropic SDK upgrades.
+- **Observability registration**: `classifyAgent` (`hookSSEBridge.js`) maps equity-analyst → `research_support`; `RESEARCH_AGENTS` set membership (`p0GateHook.js`); `AGENT_STAGE_MAP` entry (app.js); `.phase-badge.research_support` CSS rule (also fixes latent peer issue); `agentDisplayMeta` + `agentClassifications` catalog metadata.
+- **Production-gate runbook** — `docs/pending-updates/equity-analyst-update.md` § 8.4.X documents 4 mandatory deployment-test verifications (V1 hook_audit_log for FMP tools, V2 hook_audit_log for M46–M58, V3 Cloud Trace SubagentStart with `stage=research_support`, V4 citation-websearch-verifier accepts FMP citations).
+
+Commits: Day 1–6.E series for FMP equity-analyst (`a94b7115`..`0c5a3500`), `c16cfcc3` (W5 subagent + zod envelopes).
+
+### Added — Regulatory & Compliance
+
+**EU AI Act mapping**:
+
+| Article | Aperture Implementation |
+|---|---|
+| **Art. 12 (Logging)** | `hook_audit_log` records every tool invocation, agent dispatch, and code execution with bounded reason enums and Prometheus failure counters |
+| **Art. 13 (Transparency)** | `GET /api/session/:sessionKey/audit-report` returns the complete audit trail per session |
+| **Art. 14 (Human Oversight)** | Admin endpoints (`/halt`, `/override`, `/legal-hold`, `/tombstone`) write to `human_interventions` with user identity, reason, timestamp |
+| **Art. 15 (Reproducibility)** | Every `run_python_analysis` byte-replayable from `system_prompt_hash + python_code + git_sha + sdk_version + container_id + model_id + anthropic_request_id` |
+
+**GDPR mapping**:
+
+| Article | Aperture Implementation |
+|---|---|
+| **Art. 17 (Right to Erasure)** | `redactSessionEventData()` overwrites JSONB content paths to `[REDACTED]` at offboarding-time (UPDATE not DELETE; idempotent; preserves row structure for SEC 17a-4 metadata retention). Cascade DELETE via FK on session removal. `pii_mappings.erasePII()` permanently removes pseudonym mappings |
+| **Art. 20 (Data Portability)** | Audit-export endpoint supports JSON and CSV.gz formats |
+| **Art. 32 (Security)** | Cloud SQL TLS + Google-managed CMEK; bcrypt password hashing at cost factor 12 |
+
+**SEC 17a-4 record-keeping**: 5-class retention manager with `LITIGATION_HOLD`, `REGULATORY_7Y`, `STANDARD_3Y`, `RESEARCH_1Y`, `TRANSIENT_90D`. Legal hold prevents all deletion regardless of class. WORM bucket (`gs://super-legal-worm-us-east1`) with 8-year object lock for raw source archive.
+
+**Compliance posture summary**: copilots cannot offer this because their architecture treats the LLM as a black box invoked through chat. Aperture treats the LLM as a component in an instrumented pipeline — every call is audited at the boundary, not at the chat surface. This is why the audit-export endpoint can claim byte-faithful reproducibility while the streaming pipeline stays under p99 200ms hook latency.
+
+### Added — Documentation
+
+- `docs/metrics-catalog.md` (NEW, 305 lines) — comprehensive Prometheus + OTel reference: 33 metrics across 12 categories, 13 alert rules, 8 OTel manual spans, recording-function quick reference.
+- `docs/api-reference.md` (NEW, 465 lines) — 40+ operator/regulator-facing endpoints across 8 categories. Two-layer auth model documented. Common Response Conventions appendix.
+- `docs/runbooks/v6.8.5-audit-export.md` (NEW) — endpoint usage, response shape, PII handling, escalation paths.
+- `docs/runbooks/v6.7.0-session-reconciliation.md` (NEW) — auto-reconciliation operator runbook.
+- `docs/feature-flags.md` §31a/b (NEW) — `OTEL_TRACES_SAMPLER`, `OTEL_TRACES_SAMPLER_ARG`, `COMMIT_SHA` documentation.
+- `docs/pending-updates/equity-analyst-update.md` § 8.4.X — 4-step production-gate runbook.
+- `docs/pending-updates/equity-analyst-merge-notes.md` (NEW, 434 lines) — Day 0–6.E build narrative.
+- `docs/pending-updates/fmp-filter-capabilities.md` (NEW, 301 lines) — 73 live filter probes documented.
+- `docs/testing-integration-tests.md` (NEW, 149 lines) — production canaries reference.
+- `company-strategy/system-design.md` v6.8.5 refresh — new §14c with 9 subsections covering v6.7-v6.8.5; topology counts updated (42→45 subagents, 36→38 hybrid clients, 149+→197 tools).
+- `company-strategy/gtm-positioning-strategy.md` v6.8.5 refresh — new §3.8 Compliance & Audit Posture (EU AI Act + GDPR + SEC 17a-4 mapping tables) + new §3.9 The Equity Research Layer.
+- `company-strategy/gtm-sales-playbook.md` v6.8.5 refresh — Step 4 Output Delivery adds audit-export bundle as closing artifact; §10.1 split into Wave 3 + Wave 5 sections; new §10.2 Common Regulatory Objections (4 scripted Q&A items).
+
+PR #93 (commits `0f8c9b0a`, `f3d346db`, `ebd8d64d`).
+
+### Added — CI / Build Infrastructure
+
+- `.github/workflows/integration-tests.yml` (NEW, 80 lines) — pull-request-only path filter; pgvector/pgvector:pg16 service container; Jest --runInBand; advisory not blocking.
+- `Dockerfile` `ARG COMMIT_SHA=unknown` + `ENV COMMIT_SHA=${COMMIT_SHA}` — propagates deployed git SHA into runtime.
+- `migrations/012_transcript-events.{up,down}.sql`, `013_code-executions-model-id.{up,down}.sql`, `014_audit-log-model-id-index.{up,down}.sql` — version-controlled DDL mirroring runtime `ensure*Schema()`.
+- `prometheus/alerts.yml` — 9 new alert rules consolidated in operator copy-paste source.
+
+### Changed
+
+- **Topology**: 36 → 38 hybrid clients (FMP, DirectFetch); 161 → 197 tools (gated +36 FMP); 45 → 56 code-execution models (+11 equity research); 44 → 45 subagents (+equity-analyst).
+- **Database pool**: `max=10` → `max=15` (33% burst margin during simultaneous live stream + 3-rebuild reconciliation + transcript flush). `statement_timeout=120_000` preserved (extending was found unnecessary and risky during v6.8.0 audit).
+- **Metrics deprecation**: `claude_tool_invocations_total{tool}` deprecated → `claude_tool_invocations_v2_total{tool_name}` with bounded `KNOWN_TOOL_NAMES` enum (prevents cardinality explosion). 7-day dual-emission window before v7.0.x removal. `alertingRules.js` `ClaudeToolErrorRateHigh` migrated to v2 counter (W5.6).
+- **Audit-export endpoint comment correction** (`dbFrontendRouter.js:1215`): "post-W5.5 PII-redacted" → accurate description that redaction fires at offboarding-time per GDPR Art. 17, not at admin read-time (regulators require complete data).
+- **Operator skills updated**:
+  - `deploy.sh` adds `--build-arg COMMIT_SHA=$(git rev-parse HEAD)`
+  - `deploy.sh` Step 7 hardened — 90s pre-flight wait for IP RESERVED, 5×30s retries (was 3×15s), stderr captured to tempfile, manual recovery commands printed on final failure
+  - `deploy.sh` Step 8.5 — post-IP container restart so `ensureHookSchema()` runs with whitelisted IP (Cloud SQL only whitelists `34.26.70.60`; new GCE instance briefly holds ephemeral IP before MIG reassigns)
+  - `client-provisioner` grants Compute SA `roles/storage.objectAdmin` on per-client WORM bucket + explicit `BCRYPT_ROUNDS=12` in CONTAINER_ENV
+  - `client-offboarding` Step 6.5 calls `redactSessionEventData()` before `gcloud sql export`
+  - `infrastructure-health` probes 4 new tables + 7 admin endpoints + 4 new feature flags
+  - `session-diagnostics` probes `/health.reconciliation` field
+- **Project-level skills** updated with patterns from FMP integration so all future API integrations + code-execution model batches benefit by default. `api-integration/SKILL.md` (591 → 857 lines): NEW Phase 1.5 + Phase 1.6 + Phase 7. `code-execution-models/SKILL.md` (365 → 454 lines): NEW Phase 5.5.
+
+### Fixed
+
+- **`BACKFILL_RECONCILIATION_STATUS_DDL` IF EXISTS gate** (commit `6eb60f71`) — wraps the UPDATE in an `information_schema.tables` IF EXISTS gate. Fixes fresh-DB throw on new client provisioning. Idempotent on both fresh and existing databases.
+- **6 fire-and-forget durability gaps** (commit `55d2e03a`) — kg-build daemon, embedding daemon, transcript flush, citation persistence, audit log INSERT, archive-old-sessions sweep now register with `backgroundTasks` Set. Confirms graceful-shutdown semantics across every async write path.
+- **`persistCodeExecution` SIGTERM durability** (commit `6fd7832a`) — changed from `.catch()` fire-and-forget to awaited try/catch. Cloud Run SIGTERM does NOT flush `.catch()`-only writes; awaiting matches every other `persist*()` in the dispatcher. Adds ~50ms to PostToolUse hot path; CircuitBreaker bounds latency under DB stress.
+- **Static IP race recovery** (`deploy.sh`) — 90s pre-flight wait for IP RESERVED, 5×30s retries.
+- **Boot-time schema-init container restart** (`deploy.sh` Step 8.5) — forces `ensureHookSchema()` retry with whitelisted IP after MIG completes IP reassignment.
+- **`selectBestJSON` in `codeExecutionBridge.js`** — replaces `tryParseJSON(raw_output)` (last block only) with richest-JSON-object selection across all stdout blocks. Recovers JSON previously masked by trailing debug prints. Null-fallback preserved for prose-only models.
+- **Audit findings post-v6.8.4 (P1–P4)** (commit `9227bd55`) — `bridge_metadata` crash-path coverage, `tool_use_id` exact correlation column, CI workflow path filter extended for `Dockerfile` + `flags.env`.
+
+### Migration & Schema
+
+**New tables** (additive; cascade DELETE FK to `sessions(id)` per GDPR Art. 17):
+- `transcript_events` — `migrations/012_transcript-events.up.sql`
+- `code_execution_inputs` — runtime DDL in `src/db/postgres.js`
+- `citation_source_links` — runtime DDL in `src/db/postgres.js`
+
+**New columns** (`ALTER TABLE ADD COLUMN IF NOT EXISTS` per dual-path convention):
+- `code_executions` — 13 reproducibility columns (see Traceability section)
+- `sessions` — 10 reconciliation status columns (kg_*, embedding_*, reconciliation_attempts)
+- `hook_audit_log` — `bridge_metadata JSONB`, `tool_use_id`
+
+**New indexes**:
+- `idx_transcript_session_seq` on `transcript_events(session_id, sequence_number ASC)` — single replay-path index
+- `idx_audit_session_tool` on `hook_audit_log(session_id, tool_name)` — supports regulator query patterns from audit-export endpoint
+- (Note: non-CONCURRENT — brief AccessExclusiveLock during boot. Watch first restart on busy production DB.)
+
+**Migration files**: `012_transcript-events`, `013_code-executions-model-id`, `014_audit-log-model-id-index` — each with `.up.sql` + `.down.sql` mirror.
+
+### Architectural Invariants Preserved
+
+Every release in this version adheres to four invariants tested empirically through audit cycles:
+
+1. **Dual-path schema discipline** — every new column lands as `ALTER TABLE ADD COLUMN IF NOT EXISTS` in BOTH a versioned migration file AND `ensure*Schema()` runtime DDL.
+2. **Feature-flag gating** — every release ships behind a default-OFF flag; activation is a `flags.env` flip + redeploy. No release activates implicitly.
+3. **Zero hot-path blocking** — every persistence write is fire-and-forget via `setImmediate()` or `backgroundTasks.add()`. The streaming chokepoint (`ctx.send()`) is augmented with buffered writes but never awaited.
+4. **Graceful shutdown** — `backgroundTasks` Set is awaited on SIGTERM; in-flight writes complete before exit. Verified across SIGTERM tests in `code-execution-hook-e2e.test.js`.
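Invariants 3 and 4 above describe a common Node.js pattern: writes are started without being awaited, but each one is registered in a `backgroundTasks` Set that is drained on SIGTERM. The sketch below illustrates that pattern under stated assumptions. Only the name `backgroundTasks` comes from this changelog; `trackBackgroundTask`, `drainBackgroundTasks`, and the 5-second drain timeout are hypothetical and are not taken from the project's source.

```javascript
// Hypothetical sketch of the backgroundTasks pattern. Helper names and the
// timeout value are illustrative assumptions, not the project's code.
const backgroundTasks = new Set();

// Register a fire-and-forget write without awaiting it on the hot path.
function trackBackgroundTask(promise) {
  const tracked = promise
    // Log and swallow failures so one bad write cannot reject the drain.
    .catch((err) => { console.error('background write failed:', err.message); })
    // Remove the task from the set once it has settled.
    .finally(() => backgroundTasks.delete(tracked));
  backgroundTasks.add(tracked);
  return tracked;
}

// Wait for all in-flight writes, giving up after timeoutMs.
async function drainBackgroundTasks(timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  const drained = Promise.allSettled([...backgroundTasks]).then(() => 'drained');
  const outcome = await Promise.race([drained, timeout]);
  clearTimeout(timer);
  return outcome;
}

// On SIGTERM, let in-flight writes finish before the process exits.
process.once('SIGTERM', async () => {
  await drainBackgroundTasks();
  process.exit(0);
});
```

A caller on the streaming path would invoke `trackBackgroundTask(persistSomething())` and continue immediately. The drain timeout is a design choice that trades durability against the platform's shutdown grace period.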
+
+These invariants are why the audit-export endpoint can claim byte-faithful reproducibility while the streaming pipeline stays under p99 200ms hook latency.
+
+### Deferred to v7.0.x
+
+- Removal of deprecated `claude_tool_invocations_total` legacy counter (after 7-day dual-emission window completes)
+- `docs/database-enhancements/database-enhancements.md` schema refresh (still v1.1.0)
+- `CLAUDE.md` creation
+- Field-level encryption for `python_code` column (separate hardening pass)
+- Brand-doc count sweep (`website-spec.md`, `zee-super-legal-strategy-brand.md`, `aperture-ai-governance-positioning.md`)
+- FMP Enterprise contract activation (parallel commercial track; `FMP_ENABLED=true` flip gated on contract signing)
+- Backfill of historical `code_executions` rows (NULL is honest)
+- Cross-quarter transcript embedding via existing embeddingService.js pipeline
+
+### Verification (already passing)
+
+- 30/30 integration tests (`code-execution-hook-e2e` + `hookDBBridge-integration`) on CI parity
+- 77/77 unit tests (`hookDBBridge` + `domain-mcp-servers`)
+- 143/143 FMP hybrid tests
+- 11/11 equity code-execution model live tests (32 charts produced)
+- `node --check` clean on all modified files
+- Manual `/metrics` scrape — all 5 new metrics + bounded enum labels render correctly
+- OTel sampler env propagation verified `flags.env` → `deploy.sh:90` → container `--container-env`
+- `EXPLAIN` on regulator query confirms `idx_audit_session_tool` utilized (no seq scan)
+
+---
+
+## [6.8.0-fmp-equity] - SUPERSEDED — content consolidated into v7.0.0 above
+
+The original `[6.8.0] - 2026-05-02 — equity-analyst integration (FMP)` entry is preserved in git history (commit `8e65476e`). Day 1–6.E build narrative for the FMP equity-analyst subagent + 36-tool client + 11 code-execution models is now part of the comprehensive v7.0.0 release entry above. Underlying merge commits remain individually addressable in `git log`.
**Day 1–2** (commits `a94b7115`..`68069e4f`) — empirical foundation: 182 FMP `/stable` fixtures across 36 endpoint families; 73 live filter probes; surfaced 8 hard-filter tools.