From e64712e455bce27f1ad7a74c179afed23474a2b8 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 03:14:29 -0400
Subject: [PATCH 01/11] docs(kg): plan + audit for risk-layer fix (#231)

Root-cause + remediation plan for the KG dropping the entire risk node
layer on current-format sessions: risk-summary.json never persisted +
Phase 7 parser keyed on the legacy risk_categories schema.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../pending-updates/kg-risk-layer-fix-231.md  | 36 +++++++++++++++++++
 1 file changed, 36 insertions(+)
 create mode 100644 super-legal-mcp-refactored/docs/pending-updates/kg-risk-layer-fix-231.md

diff --git a/super-legal-mcp-refactored/docs/pending-updates/kg-risk-layer-fix-231.md b/super-legal-mcp-refactored/docs/pending-updates/kg-risk-layer-fix-231.md
new file mode 100644
index 000000000..5ead87c3a
--- /dev/null
+++ b/super-legal-mcp-refactored/docs/pending-updates/kg-risk-layer-fix-231.md
@@ -0,0 +1,36 @@
+# KG Risk-Layer Fix — Plan + Audit (issue #231)
+
+## Problem
+The KG drops the entire `risk` node layer for current-format sessions. Two independent breaks:
+1. **Persistence** — `review-outputs/risk-summary.json` is never written to the `reports` table (live hook `hookDBBridge.js:1445` and backfill `walkMarkdown` persist `.md` only). KG `CRITICAL_REPORTS` gate (`hookDBBridge.js:1359`) then times out on `risk-summary`.
+2. **Parser schema drift** — Phase 7 (`kgPhases6to8.js:380`) keys on `parsed.risk_categories || parsed.categories`; the current producer emits `exposure_by_category` with **string** exposures (`"$433.75M"`) and **string** probability (`"8% fail"`). Even when present (May-27 session), 0 risk nodes result.
+
+Cardinal (May) worked because it emitted `risk-summary-narrative.md` (Markdown → `.md` persist path → Phase 7 Path B regex).
+
+## Plan (additive, non-destructive)
+
+### Fix #1 — Persistence (scoped allowlist, NOT all `.json`)
+- `src/utils/hookDBBridge.js:1445` — persist `risk-summary.json` via a `JSON_REPORT_FILENAMES` allowlist (in `hookDBBridgeConfig.js`). **Do not broaden to all `.json`** (would pull in `*-state.json`, `banker-*.json`, `entities.json`).
+- `scripts/backfill-local-to-db.mjs` — `walkMarkdown` additionally ingests files whose basename ∈ the same allowlist; also scan `review-outputs/` for `*-state.json` (pre-existing gap).
+- `persistReport` is already content-agnostic; `extractReportKey` already strips `.json` → `review/risk-summary`. No change.
+
+### Fix #2 — Parser (modularized, unit-tested)
+- Extract the JSON risk-block parsing into a **pure exported function** `buildRiskBlocksFromJson(content)` in `kgPhases6to8.js` (refactor commit: byte-equivalent for the legacy schema).
+- Extend it to accept `exposure_by_category` + string exposure fields (`weighted_exposure`, `exposure_low/high`) and string `probability` (passed through; already contains `%`).
+- **Ordering constraint (from node-creation loop @ line 442 `amounts.slice(0,5)`):** the synth block must list `Exposure:` BEFORE `Mitigation:` so real exposure `$` amounts lead the extracted array (mitigation prose contains `$1,237,262,000`).
+- Phase 13 (`kgPhase13ProbabilisticValue.js`) intentionally untouched — it requires numeric `p10/p50/p90` and correctly skips string-only findings.
+
+### Tests (`node --test`, matching kg-phase13 style)
+- `test/sdk/kg-phase7-risk-parser.test.js` — unit tests on `buildRiskBlocksFromJson` for: legacy `risk_categories` numeric, `categories` alias, current `exposure_by_category` + string exposures, exposure-before-mitigation ordering, malformed JSON → `[]`.
+
+## Audit of the plan
+| Dimension | Finding |
+|---|---|
+| **Blast radius** | Confined to observability/storage: `reports` table + KG `risk`/`closing_condition` nodes. Generative pipeline reads `risk-summary.json` from **disk** (`_promptConstants.js:3066`), unaffected. `documentConverter` already excludes `risk-summary.json` (no garbage DOCX). |
+| **Best practices** | Allowlist (not blanket `.json`) avoids polluting `reports`. Parser extracted to a pure, testable function. Both persist paths fail-soft (`hookDBBridge.js:6`). |
+| **Modularity** | One pure function (`buildRiskBlocksFromJson`) holds the JSON schema handling; one config constant (`JSON_REPORT_FILENAMES`) shared by live hook + backfill. No new tables, no schema migration. |
+| **Seamless integration** | Legacy numeric schema preserved (refactor is byte-equivalent); new schema added as fallthrough. Existing 23 kg-phase13 tests must stay green. |
+| **Anti-recurrence** | Producer↔consumer drift is the root cause → add the parser unit test built from the real `risk-summary.json` shape as the contract anchor. |
+
+## Remediation for affected sessions
+After the fix: ingest `risk-summary.json`, re-run KG (upsert), and reapply embeddings for 2026-06-16 first, then for 2026-06-08 and 2026-05-27.

From c3105bad0539c42de354e3af07d98afc02ad8fc0 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 03:14:29 -0400
Subject: [PATCH 02/11] refactor(kg): extract buildRiskBlocksFromJson pure
 helper

Move the Phase 7 risk-summary JSON parsing into an exported, side-effect-free
function so the schema handling is unit-testable in isolation. Byte-equivalent
behavior for the legacy risk_categories/categories schema (existing 33 kg-phase
tests stay green); no functional change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../src/utils/knowledgeGraph/kgPhases6to8.js  | 97 +++++++++++--------
 1 file changed, 54 insertions(+), 43 deletions(-)

diff --git a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
index ea6d3120e..1384a1cf4 100644
--- a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
+++ b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
@@ -348,6 +348,57 @@ async function phase6_dealStructure(pool, sessionId, evolutionLog, resolver) {
   console.log(`[KG] Phase 6: ${condCount} conditions, ${entityCount} entities, ${milestoneCount} milestones`);
 }
 
+/**
+ * Parse a risk-summary JSON document into uniform { title, block } risk blocks
+ * that the Phase 7 node-creation loop consumes identically to Markdown-extracted
+ * blocks. Returns [] for non-JSON or unparseable content (caller then falls back
+ * to the Markdown regex path).
+ *
+ * Extracted from phase7_riskAndFacts and exported so the risk-summary schema
+ * handling is unit-testable in isolation (test/sdk/kg-phase7-risk-parser.test.js).
+ */
+export function buildRiskBlocksFromJson(content) {
+  const blocks = [];
+  const trimmed = (content || '').trim();
+  if (!trimmed.startsWith('{') && !trimmed.startsWith('[')) return blocks;
+  let parsed;
+  try {
+    parsed = JSON.parse(trimmed);
+  } catch (err) {
+    console.warn('[KG Phase 6 risk] JSON parse failed, falling back to markdown:', err.message);
+    return blocks;
+  }
+  const categories = parsed.risk_categories || parsed.categories || [];
+  for (const cat of categories) {
+    const catName = cat.category || cat.name || 'Uncategorized';
+    for (const finding of (cat.findings || [])) {
+      const fid = finding.id || '';
+      const title = (finding.finding || finding.title || finding.name || '').toString();
+      if (!title || title.length < 5) continue;
+      const exposureBits = [];
+      if (finding.p50 != null) exposureBits.push(`$${(finding.p50 / 1e9).toFixed(2)}B (p50)`);
+      if (finding.p10 != null && finding.p10 !== finding.p50) exposureBits.push(`$${(finding.p10 / 1e9).toFixed(2)}B (p10)`);
+      if (finding.p90 != null && finding.p90 !== finding.p50) exposureBits.push(`$${(finding.p90 / 1e9).toFixed(2)}B (p90)`);
+      if (finding.probability_weighted != null) exposureBits.push(`$${(finding.probability_weighted / 1e9).toFixed(2)}B (probability-weighted)`);
+      if (finding.npv_at_8pct != null) exposureBits.push(`NPV $${(finding.npv_at_8pct / 1e9).toFixed(2)}B`);
+      if (finding.dcf_present_value != null) exposureBits.push(`DCF PV $${(finding.dcf_present_value / 1e9).toFixed(2)}B`);
+      const probPct = finding.probability != null ? `${Math.round(finding.probability * 100)}%` : '';
+      const synthBlock = [
+        `**${fid ? fid + ': ' : ''}${title}**`,
+        `Category: ${catName}`,
+        `Severity: ${finding.severity || cat.severity || 'UNCLASSIFIED'}`,
+        `Exposure: ${exposureBits.join(', ') || 'unquantified'}`,
+        probPct ? `Probability: ${probPct}` : '',
+        finding.source ? `Source: ${finding.source}` : '',
+        finding.notes ? `Notes: ${finding.notes}` : '',
+        finding.correlation_note ? `Correlation: ${finding.correlation_note}` : '',
+      ].filter(Boolean).join('\n');
+      blocks.push({ title: `${fid ? fid + ': ' : ''}${title}`, block: synthBlock });
+    }
+  }
+  return blocks;
+}
+
 async function phase7_riskAndFacts(pool, sessionId, evolutionLog, resolver, tNumberMap) {
   let riskCount = 0, factCount = 0;
 
@@ -370,49 +421,9 @@ async function phase7_riskAndFacts(pool, sessionId, evolutionLog, resolver, tNum
     // Two source formats supported:
     //   - JSON (e.g., risk-summary.json with risk_categories[].findings[]) — code-execution output
     //   - Markdown (e.g., risk-summary-narrative.md with **Title** + $exposure prose blocks) — LLM output
-    const riskBlocks = [];
-
-    // Path A: detect JSON content (Cardinal-style risk-summary.json)
-    const trimmed = content.trim();
-    if (trimmed.startsWith('{') || trimmed.startsWith('[')) {
-      try {
-        const parsed = JSON.parse(trimmed);
-        const categories = parsed.risk_categories || parsed.categories || [];
-        for (const cat of categories) {
-          const catName = cat.category || cat.name || 'Uncategorized';
-          for (const finding of (cat.findings || [])) {
-            // Synthesize a markdown-equivalent block from the JSON finding so the
-            // downstream regex-based property extractors still work identically.
-            // Format: **<id>: <finding>** \n exposure $... probability ...% notes...
-            const fid = finding.id || '';
-            const title = (finding.finding || finding.title || finding.name || '').toString();
-            if (!title || title.length < 5) continue;
-            const exposureBits = [];
-            if (finding.p50 != null) exposureBits.push(`$${(finding.p50 / 1e9).toFixed(2)}B (p50)`);
-            if (finding.p10 != null && finding.p10 !== finding.p50) exposureBits.push(`$${(finding.p10 / 1e9).toFixed(2)}B (p10)`);
-            if (finding.p90 != null && finding.p90 !== finding.p50) exposureBits.push(`$${(finding.p90 / 1e9).toFixed(2)}B (p90)`);
-            if (finding.probability_weighted != null) exposureBits.push(`$${(finding.probability_weighted / 1e9).toFixed(2)}B (probability-weighted)`);
-            if (finding.npv_at_8pct != null) exposureBits.push(`NPV $${(finding.npv_at_8pct / 1e9).toFixed(2)}B`);
-            if (finding.dcf_present_value != null) exposureBits.push(`DCF PV $${(finding.dcf_present_value / 1e9).toFixed(2)}B`);
-            const probPct = finding.probability != null ? `${Math.round(finding.probability * 100)}%` : '';
-            const synthBlock = [
-              `**${fid ? fid + ': ' : ''}${title}**`,
-              `Category: ${catName}`,
-              `Severity: ${finding.severity || cat.severity || 'UNCLASSIFIED'}`,
-              `Exposure: ${exposureBits.join(', ') || 'unquantified'}`,
-              probPct ? `Probability: ${probPct}` : '',
-              finding.source ? `Source: ${finding.source}` : '',
-              finding.notes ? `Notes: ${finding.notes}` : '',
-              finding.correlation_note ? `Correlation: ${finding.correlation_note}` : '',
-            ].filter(Boolean).join('\n');
-            riskBlocks.push({ title: `${fid ? fid + ': ' : ''}${title}`, block: synthBlock });
-          }
-        }
-      } catch (err) {
-        // JSON parse failed; fall through to markdown path
-        console.warn('[KG Phase 6 risk] JSON parse failed, falling back to markdown:', err.message);
-      }
-    }
+    // Path A: JSON content (risk-summary.json) — extracted to buildRiskBlocksFromJson()
+    // so the schema handling is unit-testable in isolation.
+    const riskBlocks = buildRiskBlocksFromJson(content);
 
     // Path B: markdown regex (fallback; also runs when JSON path extracted nothing)
     if (riskBlocks.length === 0) {

From 187e8a1c7fe21ea3144f925db68907e0c3258ec4 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:06:52 -0400
Subject: [PATCH 03/11] feat(kg): parse exposure_by_category risk schema +
 string exposures

The current risk-aggregator emits risk-summary.json keyed on exposure_by_category
with string exposures ("$433.75M") and string probability ("8% fail"); the parser
only handled the legacy risk_categories numeric schema, yielding 0 risk nodes.

- Add exposure_by_category as the primary category source (legacy keys preserved).
- Synthesize $-amounts from weighted_exposure/exposure_low/high when no numeric
  p10/p50/p90 bits exist (legacy numeric path unchanged).
- Pass string probability through (already carries % for the downstream regex).
- Emit Exposure BEFORE Mitigation so amounts.slice(0,5) leads with real exposures,
  not the RRTF figure embedded in mitigation prose.

Fixes #231 (parser half).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
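Reviewer note (illustrative, not part of the applied change): a minimal sketch
of why the Exposure line has to precede the Mitigation line. It assumes the
downstream extractor behaves like the AMOUNT_RE mirror in
test/sdk/kg-phase7-risk-parser.test.js followed by slice(0,5); the helper name
`leadingAmounts` and the two sample lines are hypothetical.

```javascript
// Regex copied from the test file's mirror of the downstream extractor (assumption:
// the real node-creation loop uses an equivalent pattern).
const AMOUNT_RE = /\$[\d,.]+[BMK]?/g;

// Hypothetical helper: the first five "$" amounts, in the order they appear in the block.
function leadingAmounts(block) {
  return (block.match(AMOUNT_RE) || []).slice(0, 5);
}

// Sample lines modeled on the fixture in the parser unit test.
const exposureLine = 'Exposure: $433.75M (break EL, marked), low $0 (clears), high $1.237B RRTF';
const mitigationLine = 'Mitigation: $1,237,262,000 regulatory reverse-termination fee.';

// Exposure first: the weighted exposure leads the extracted array.
const exposureFirst = leadingAmounts([exposureLine, mitigationLine].join('\n'));
// Mitigation first: the fee figure from the mitigation prose would lead instead.
const mitigationFirst = leadingAmounts([mitigationLine, exposureLine].join('\n'));

console.log(exposureFirst[0]);
console.log(mitigationFirst[0]);
```

With the order swapped, the mitigation figure takes the first slot, which is
the mis-attribution this patch is meant to avoid.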
 .../src/utils/knowledgeGraph/kgPhases6to8.js  | 29 ++++++++++++++++---
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
index 1384a1cf4..684591e71 100644
--- a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
+++ b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
@@ -368,7 +368,10 @@ export function buildRiskBlocksFromJson(content) {
     console.warn('[KG Phase 6 risk] JSON parse failed, falling back to markdown:', err.message);
     return blocks;
   }
-  const categories = parsed.risk_categories || parsed.categories || [];
+  // Schema support (first truthy key wins; an empty array still counts as present):
+  //   - exposure_by_category — CURRENT producer (risk-aggregator.js), string exposures
+  //   - risk_categories / categories — LEGACY, numeric p10/p50/p90 distributions
+  const categories = parsed.exposure_by_category || parsed.risk_categories || parsed.categories || [];
   for (const cat of categories) {
     const catName = cat.category || cat.name || 'Uncategorized';
     for (const finding of (cat.findings || [])) {
@@ -376,20 +379,38 @@ export function buildRiskBlocksFromJson(content) {
       const title = (finding.finding || finding.title || finding.name || '').toString();
       if (!title || title.length < 5) continue;
       const exposureBits = [];
+      // Legacy numeric distribution fields (base units → $B).
       if (finding.p50 != null) exposureBits.push(`$${(finding.p50 / 1e9).toFixed(2)}B (p50)`);
       if (finding.p10 != null && finding.p10 !== finding.p50) exposureBits.push(`$${(finding.p10 / 1e9).toFixed(2)}B (p10)`);
       if (finding.p90 != null && finding.p90 !== finding.p50) exposureBits.push(`$${(finding.p90 / 1e9).toFixed(2)}B (p90)`);
       if (finding.probability_weighted != null) exposureBits.push(`$${(finding.probability_weighted / 1e9).toFixed(2)}B (probability-weighted)`);
       if (finding.npv_at_8pct != null) exposureBits.push(`NPV $${(finding.npv_at_8pct / 1e9).toFixed(2)}B`);
       if (finding.dcf_present_value != null) exposureBits.push(`DCF PV $${(finding.dcf_present_value / 1e9).toFixed(2)}B`);
-      const probPct = finding.probability != null ? `${Math.round(finding.probability * 100)}%` : '';
+      // Current schema: pre-formatted string exposures (already contain "$..."). Only
+      // when no numeric bits were produced, so legacy numeric behavior is unchanged.
+      if (exposureBits.length === 0) {
+        if (finding.weighted_exposure) exposureBits.push(String(finding.weighted_exposure));
+        if (finding.exposure_low) exposureBits.push(`low ${finding.exposure_low}`);
+        if (finding.exposure_high) exposureBits.push(`high ${finding.exposure_high}`);
+      }
+      // Probability: numeric legacy (0–1) → "NN%"; string current (e.g. "8% fail")
+      // passed through verbatim (already carries a "%" for the downstream regex).
+      let probStr = '';
+      if (typeof finding.probability === 'number') probStr = `${Math.round(finding.probability * 100)}%`;
+      else if (typeof finding.probability === 'string' && finding.probability.trim()) probStr = finding.probability.trim();
       const synthBlock = [
         `**${fid ? fid + ': ' : ''}${title}**`,
         `Category: ${catName}`,
         `Severity: ${finding.severity || cat.severity || 'UNCLASSIFIED'}`,
+        // Exposure MUST precede Mitigation: the node-creation loop takes
+        // amounts.slice(0,5) in block order, and mitigation prose carries its own
+        // "$" figures (e.g. the RRTF). Leading with exposure keeps exposure_amounts
+        // accurate.
         `Exposure: ${exposureBits.join(', ') || 'unquantified'}`,
-        probPct ? `Probability: ${probPct}` : '',
-        finding.source ? `Source: ${finding.source}` : '',
+        probStr ? `Probability: ${probStr}` : '',
+        finding.section_reference ? `Section: ${finding.section_reference}` : '',
+        finding.source || finding.source_report ? `Source: ${finding.source || finding.source_report}` : '',
+        finding.mitigation ? `Mitigation: ${finding.mitigation}` : '',
         finding.notes ? `Notes: ${finding.notes}` : '',
         finding.correlation_note ? `Correlation: ${finding.correlation_note}` : '',
       ].filter(Boolean).join('\n');

From cc08cbb4c15c08e219953cf6cade08ca5cc184a0 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:06:52 -0400
Subject: [PATCH 04/11] test(kg): unit tests for risk-summary parser (legacy +
 current schema)

Contract anchor for #231: pins both the legacy risk_categories numeric schema
and the current exposure_by_category string schema, plus the
exposure-before-mitigation ordering, malformed-JSON fallback, and precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
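Reviewer note (illustrative, not part of the applied change): a small sketch of
the precedence rule these tests pin. It reproduces only the category-selection
expression from buildRiskBlocksFromJson; the helper name `pickCategories` and
the sample data are hypothetical. Because `||` selects the first truthy
operand, and an empty array is truthy in JavaScript, an empty
`exposure_by_category` would shadow a populated legacy key.

```javascript
// Hypothetical helper isolating the selection expression used by the parser.
function pickCategories(parsed) {
  return parsed.exposure_by_category || parsed.risk_categories || parsed.categories || [];
}

const legacy = [{ category: 'Old', findings: [] }];
const current = [{ category: 'New', findings: [] }];

// Legacy key alone: used as-is.
const onlyLegacy = pickCategories({ risk_categories: legacy });
// Both keys present: the current schema wins.
const bothPresent = pickCategories({ exposure_by_category: current, risk_categories: legacy });
// Empty current array plus populated legacy: the empty array still wins.
const emptyCurrent = pickCategories({ exposure_by_category: [], risk_categories: legacy });

console.log(onlyLegacy === legacy, bothPresent === current, emptyCurrent.length);
```

Whether the empty-array case can occur in real producer output is not
established by this series; the sketch only shows what the expression does.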
 .../test/sdk/kg-phase7-risk-parser.test.js    | 137 ++++++++++++++++++
 1 file changed, 137 insertions(+)
 create mode 100644 super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js

diff --git a/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js b/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js
new file mode 100644
index 000000000..880d1bf65
--- /dev/null
+++ b/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js
@@ -0,0 +1,137 @@
+/**
+ * Unit tests for buildRiskBlocksFromJson (KG Phase 7 risk parser).
+ *
+ * Contract anchor for issue #231: the risk-aggregator emits the
+ * `exposure_by_category` schema with STRING exposures; the parser must produce
+ * risk blocks whose synth text yields correct downstream extraction. Also pins
+ * the LEGACY `risk_categories`/`categories` numeric schema so it never regresses.
+ */
+import { test } from 'node:test';
+import assert from 'node:assert/strict';
+import { buildRiskBlocksFromJson } from '../../src/utils/knowledgeGraph/kgPhases6to8.js';
+
+// Mirrors the downstream node-creation loop extractors (kgPhases6to8.js ~431).
+const AMOUNT_RE = /\$[\d,.]+[BMK]?/g;
+const PROB_RE = /(\d{1,3})[\-–]?(\d{1,3})?%/;
+
+test('current schema: exposure_by_category with string exposures → risk blocks', () => {
+  const content = JSON.stringify({
+    exposure_by_category: [
+      {
+        category: 'Regulatory/Antitrust',
+        findings: [
+          {
+            id: 'R-ANT-001',
+            finding: 'HSR Second Request probable; behavioral-remedy clearance base case.',
+            severity: 'HIGH',
+            probability: '8% fail / 65–80% Second Request',
+            weighted_exposure: '$433.75M (break EL, marked) / $621.71M (headline)',
+            exposure_low: '$0 (clears)',
+            exposure_high: '$1.237B RRTF',
+            mitigation: '$1,237,262,000 regulatory reverse-termination fee (Fox pays).',
+            source_report: 'antitrust-competition-analyst-report.md',
+            section_reference: '§IV.A',
+          },
+        ],
+      },
+    ],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1, 'one finding → one block');
+  const { title, block } = blocks[0];
+  assert.match(title, /^R-ANT-001: /);
+
+  // Downstream amount extraction must LEAD with real exposures, not the
+  // mitigation RRTF figure (exposure-before-mitigation ordering constraint).
+  const amounts = block.match(AMOUNT_RE) || [];
+  assert.ok(amounts.length >= 2, 'multiple exposure amounts extracted');
+  assert.equal(amounts[0], '$433.75M', 'first amount is the weighted exposure, not mitigation');
+  const exposureIdx = block.indexOf('Exposure:');
+  const mitigationIdx = block.indexOf('Mitigation:');
+  assert.ok(exposureIdx < mitigationIdx, 'Exposure line precedes Mitigation line');
+
+  // Probability regex picks up the leading "8%".
+  const prob = block.match(PROB_RE);
+  assert.ok(prob && prob[0] === '8%', 'probability "8%" extracted from string');
+
+  assert.match(block, /Severity: HIGH/);
+  assert.match(block, /Section: §IV\.A/);
+});
+
+test('current schema: exposure_low/high used when weighted_exposure absent', () => {
+  const content = JSON.stringify({
+    exposure_by_category: [
+      { category: 'Privacy', findings: [
+        { id: 'P-1', finding: 'VPPA class-action exposure tied to ACR data.', severity: 'MEDIUM-HIGH',
+          exposure_low: '$0', exposure_high: '$113M' },
+      ] },
+    ],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1);
+  const amounts = blocks[0].block.match(AMOUNT_RE) || [];
+  assert.ok(amounts.includes('$113M'), 'exposure_high amount present');
+});
+
+test('legacy schema: risk_categories with numeric p10/p50/p90 (unchanged behavior)', () => {
+  const content = JSON.stringify({
+    risk_categories: [
+      { category: 'Financial', findings: [
+        { id: 'F-1', finding: 'Delivered-value erosion vs headline.', severity: 'HIGH',
+          p10: 1.0e9, p50: 2.09e9, p90: 2.33e9, probability: 0.65 },
+      ] },
+    ],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1);
+  const { block } = blocks[0];
+  assert.match(block, /\$2\.09B \(p50\)/, 'numeric p50 rendered as $B');
+  const prob = block.match(PROB_RE);
+  assert.ok(prob && prob[0] === '65%', 'numeric probability 0.65 → 65%');
+});
+
+test('legacy "categories" alias still works', () => {
+  const content = JSON.stringify({
+    categories: [{ category: 'X', findings: [{ id: 'C-1', finding: 'Some risk finding here.', p50: 5e8 }] }],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1);
+  assert.match(blocks[0].block, /\$0\.50B \(p50\)/);
+});
+
+test('malformed JSON → [] (caller falls back to markdown)', () => {
+  assert.deepEqual(buildRiskBlocksFromJson('{not valid json'), []);
+});
+
+test('non-JSON content → []', () => {
+  assert.deepEqual(buildRiskBlocksFromJson('## Risk Narrative\n**Some risk** $10M'), []);
+});
+
+test('empty / missing categories → []', () => {
+  assert.deepEqual(buildRiskBlocksFromJson('{}'), []);
+  assert.deepEqual(buildRiskBlocksFromJson(JSON.stringify({ exposure_by_category: [] })), []);
+  assert.deepEqual(buildRiskBlocksFromJson(''), []);
+  assert.deepEqual(buildRiskBlocksFromJson(null), []);
+});
+
+test('findings with too-short titles are skipped', () => {
+  const content = JSON.stringify({
+    exposure_by_category: [{ category: 'X', findings: [
+      { id: 'S-1', finding: 'abc' },                 // < 5 chars → skipped
+      { id: 'S-2', finding: 'A real finding title' }, // kept
+    ] }],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1);
+  assert.match(blocks[0].title, /^S-2: /);
+});
+
+test('exposure_by_category takes precedence over legacy keys when both present', () => {
+  const content = JSON.stringify({
+    exposure_by_category: [{ category: 'New', findings: [{ id: 'N-1', finding: 'New schema finding wins.', weighted_exposure: '$5M' }] }],
+    risk_categories:      [{ category: 'Old', findings: [{ id: 'O-1', finding: 'Legacy finding ignored.', p50: 9e9 }] }],
+  });
+  const blocks = buildRiskBlocksFromJson(content);
+  assert.equal(blocks.length, 1);
+  assert.match(blocks[0].title, /^N-1: /);
+});

From f811ac9cfbff52edbe45db935883766228d7b59f Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:07:55 -0400
Subject: [PATCH 05/11] feat(hookdb): persist risk-summary.json as a review
 report
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The live PostToolUse hook only persisted .md files, so the risk-aggregator's
risk-summary.json never reached the reports table and the KG risk layer + the
CRITICAL_REPORTS gate (risk-summary) silently failed.

- Add JSON_REPORT_FILENAMES allowlist (hookDBBridgeConfig.js) — exact basenames,
  NOT all .json, so *-state.json / banker-*.json / entities.json are excluded.
- Broaden the persist gate to the allowlist. persistReport is content-agnostic;
  extractReportType maps /review-outputs/ → review and extractReportKey strips
  .json → report_key 'risk-summary'. Fail-soft (all DB writes are try/caught).

Fixes #231 (live-persistence half).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
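Reviewer note (illustrative, not part of the applied change): a minimal sketch
of the basename allowlist gate described above. The Set mirrors
JSON_REPORT_FILENAMES from this patch; the helper name `shouldPersistAsReport`
and the sample paths are hypothetical.

```javascript
// Mirrors the allowlist added in hookDBBridgeConfig.js.
const JSON_REPORT_FILENAMES = new Set(['risk-summary.json']);

// Hypothetical helper: true for Markdown reports and for allowlisted JSON
// deliverables matched by exact basename, false for every other file.
function shouldPersistAsReport(filePath) {
  const baseName = filePath.split('/').pop() || '';
  return filePath.endsWith('.md') || JSON_REPORT_FILENAMES.has(baseName);
}

// Sample paths are made up for illustration.
const samples = [
  '/reports/s1/review-outputs/risk-summary.json',
  '/reports/s1/review-outputs/risk-aggregator-state.json',
  '/reports/s1/entities.json',
  '/reports/s1/executive-summary.md',
];
const decisions = samples.map(shouldPersistAsReport);
console.log(decisions);
```

Matching on the exact basename rather than the `.json` extension is what keeps
state sidecars and entities.json out of the reports table.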
 .../src/config/hookDBBridgeConfig.js              | 15 +++++++++++++++
 .../src/utils/hookDBBridge.js                     |  7 ++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/super-legal-mcp-refactored/src/config/hookDBBridgeConfig.js b/super-legal-mcp-refactored/src/config/hookDBBridgeConfig.js
index 6ccdd587f..fd003607f 100644
--- a/super-legal-mcp-refactored/src/config/hookDBBridgeConfig.js
+++ b/super-legal-mcp-refactored/src/config/hookDBBridgeConfig.js
@@ -80,6 +80,21 @@ export const REPORT_TYPE_MATCHERS = [
 
 export const REPORT_TYPE_DEFAULT = 'document';
 
+/**
+ * Non-Markdown report deliverables that MUST be persisted to the `reports`
+ * table, keyed by exact basename. The risk-aggregator emits structured JSON
+ * (`risk-summary.json`) consumed by KG Phase 7/13 and the executive-summary
+ * synthesizer; without this allowlist it is never stored and the KG risk layer
+ * is silently empty (issue #231).
+ *
+ * SCOPED BY DESIGN: persistence gates broaden to these exact filenames, NOT to
+ * all `.json`, so state sidecars (`*-state.json`), banker context/metadata
+ * (`banker-*.json`), and `entities.json` are never mis-ingested as reports.
+ */
+export const JSON_REPORT_FILENAMES = new Set([
+  'risk-summary.json',
+]);
+
 // ============================================================
 // AGENT TYPE CLASSIFICATION (State Key → agent_type)
 // ============================================================
diff --git a/super-legal-mcp-refactored/src/utils/hookDBBridge.js b/super-legal-mcp-refactored/src/utils/hookDBBridge.js
index ac9e23ba6..3fa93a32c 100644
--- a/super-legal-mcp-refactored/src/utils/hookDBBridge.js
+++ b/super-legal-mcp-refactored/src/utils/hookDBBridge.js
@@ -35,6 +35,7 @@ import {
   STATE_FILE_DIR_DEFAULT,
   AUDIT_SKIP_TOOLS,
   P0_EXCLUDED_SUFFIXES,
+  JSON_REPORT_FILENAMES,
 } from '../config/hookDBBridgeConfig.js';
 
 export const backgroundTasks = new Set();
@@ -1442,7 +1443,11 @@ async function persistHookEvent(pool, sessionCache, hookName, input, result) {
       const filePath = tool_input?.file_path || '';
 
       if (tool_name === 'Write' && filePath.includes('/reports/')) {
-        if (filePath.endsWith('.md')) {
+        // Persist Markdown reports, plus the allowlisted non-Markdown deliverables
+        // (e.g. risk-summary.json) that downstream KG/synthesis consume. Scoped to
+        // exact basenames so state/context JSON sidecars are never mis-ingested (#231).
+        const baseName = filePath.split('/').pop() || '';
+        if (filePath.endsWith('.md') || JSON_REPORT_FILENAMES.has(baseName)) {
           await persistReport(pool, sessionCache, input, result);
         }
         const isExcluded = P0_EXCLUDED_SUFFIXES.some(suffix => filePath.endsWith(suffix));

From 0256eb8eaf7dde8e5634b23f3e7aefc7afdef2df Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:08:59 -0400
Subject: [PATCH 06/11] fix(backfill): ingest risk-summary.json +
 review-outputs state files

Bring local backfill to parity with live persistence for recovery of
unpersisted sessions:
- walkMarkdown now also ingests allowlisted JSON deliverables (risk-summary.json)
  so the KG risk layer can be rebuilt (#231); scoped by basename, state sidecars
  excluded.
- Scan review-outputs/ (not just qa-outputs/) for *-state.json, so
  fact-validator / coverage-gap-analyzer / risk-aggregator states are captured.
- (Carries the banker_intake/banker_qa report-type matchers added during session
  recovery so banker artifacts get their canonical types.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../scripts/backfill-local-to-db.mjs          | 349 ++++++++++++++++++
 1 file changed, 349 insertions(+)
 create mode 100644 super-legal-mcp-refactored/scripts/backfill-local-to-db.mjs

diff --git a/super-legal-mcp-refactored/scripts/backfill-local-to-db.mjs b/super-legal-mcp-refactored/scripts/backfill-local-to-db.mjs
new file mode 100644
index 000000000..8f2012e06
--- /dev/null
+++ b/super-legal-mcp-refactored/scripts/backfill-local-to-db.mjs
@@ -0,0 +1,349 @@
+#!/usr/bin/env node
+/**
+ * Backfill Local-to-DB — Upload a fully-completed local session directory
+ * into PostgreSQL so the staging frontend can display it.
+ *
+ * What it does:
+ *   1. UPDATE sessions row (status, sdk_session_id, counts) from session-summary.json
+ *   2. INSERT/UPSERT reports for every non-.pandoc .md file plus allowlisted JSON (risk-summary.json) (uses extractReportType/Key)
+ *   3. INSERT/UPSERT charts/*.png as report_artifacts via persistSessionArtifacts()
+ *   4. INSERT/UPSERT agent_states from *-state-*.json files
+ *   5. INSERT session_metrics from counter totals
+ *
+ * After running, hit:
+ *   POST /api/admin/sessions/<key>/rebuild-artifacts  (regenerate PDFs/DOCX)
+ *   POST /api/admin/sessions/<key>/rebuild-kg         (regenerate entities + KG)
+ *
+ * report_embeddings + citation_source_links are NOT populated by this script:
+ * the setImmediate() side-effect path inside persistReport() runs only from the
+ * live hook. We replicate the same INSERT here but skip that side effect;
+ * rebuild-kg will resynthesise citation_source_links and KG.
+ *
+ * Usage:
+ *   node scripts/backfill-local-to-db.mjs <session-key> [--dry-run] [--dir=/abs/path]
+ */
+
+import 'dotenv/config';
+import pg from 'pg';
+import { promises as fs } from 'fs';
+import path from 'path';
+import { createHash } from 'crypto';
+import { fileURLToPath } from 'url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+const args = process.argv.slice(2);
+const sessionKey = args.find(a => !a.startsWith('--'));
+const dryRun = args.includes('--dry-run');
+const sessionDirArg = args.find(a => a.startsWith('--dir='))?.slice('--dir='.length); // slice keeps any '=' inside the path
+
+if (!sessionKey || !/^\d{4}-\d{2}-\d{2}-\d+$/.test(sessionKey)) {
+  console.error('Usage: node scripts/backfill-local-to-db.mjs <session-key> [--dry-run] [--dir=/abs/path]');
+  process.exit(1);
+}
+
+const sessionDir = sessionDirArg
+  ? path.resolve(sessionDirArg)
+  : path.resolve(__dirname, '..', 'reports', sessionKey);
+
+const REPORT_TYPE_MATCHERS = [
+  { match: '/documents/',            type: 'extraction' },
+  { match: '/specialist-reports/',   type: 'specialist' },
+  { match: '/section-reports/',      type: 'section' },
+  { match: '/review-outputs/',       type: 'review' },
+  { match: '/qa-outputs/',           type: 'qa' },
+  { match: '/remediation-outputs/',  type: 'remediation' },
+  // Banker Q&A workflow (mirrors src/config/hookDBBridgeConfig.js:76-78) — must
+  // precede 'final-memorandum'/generic so the KG banker phases (1b/1c) can find
+  // these by their canonical report_type. Specific filename matches.
+  { match: 'banker-questions-presented', type: 'banker_intake' },
+  { match: 'banker-question-answers',    type: 'banker_qa' },
+  { match: 'final-memorandum',       type: 'final' },
+  { match: 'executive-summary',      type: 'synthesis' },
+  { match: 'research-plan',          type: 'synthesis' },
+  { match: 'consolidated-footnotes', type: 'synthesis' },
+];
+const REPORT_TYPE_DEFAULT = 'document';
+
+// Non-Markdown deliverables persisted as reports by exact basename — mirrors
+// JSON_REPORT_FILENAMES in src/config/hookDBBridgeConfig.js so local backfill
+// matches live persistence (#231). Scoped: state/context JSON are NOT reports.
+const JSON_REPORT_FILENAMES = new Set(['risk-summary.json']);
+
+const AGENT_TYPE_MATCHERS = [
+  { match: 'section-writer',      type: 'section-writer' },
+  { match: 'qa-diagnostic',       type: 'qa-diagnostic' },
+  { match: 'qa-certifier',        type: 'qa-certifier' },
+  { match: 'synthesis',           type: 'synthesis' },
+  { match: 'executive-summary',   type: 'executive-summary' },
+  { match: 'citation-validator',  type: 'citation-validator' },
+  { match: 'remediation-wave',    type: 'remediation' },
+  { match: 'risk-aggregator',     type: 'risk-aggregator' },
+  { match: 'orchestrator',        type: 'orchestrator' },
+  { match: 'document-processing', type: 'document-processing' },
+  { match: 'research-review',     type: 'research-review' },
+  { match: 'assembly',            type: 'assembly' },
+  { match: 'intake-research',     type: 'intake-research-analyst' },
+];
+
+function extractReportType(filePath) {
+  for (const { match, type } of REPORT_TYPE_MATCHERS) {
+    if (filePath.includes(match)) return type;
+  }
+  return REPORT_TYPE_DEFAULT;
+}
+
+function extractReportKey(filePath) {
+  const filename = filePath.split('/').pop() || 'unknown';
+  return filename.replace(/\.pandoc\.md$|\.md$|\.json$/, '');
+}
+
+function extractAgentType(stateKey) {
+  for (const { match, type } of AGENT_TYPE_MATCHERS) {
+    if (stateKey.includes(match)) return type;
+  }
+  return 'unknown';
+}
+
+async function walkMarkdown(dir) {
+  const out = [];
+  async function recurse(d) {
+    let entries;
+    try { entries = await fs.readdir(d, { withFileTypes: true }); }
+    catch { return; }
+    for (const e of entries) {
+      const full = path.join(d, e.name);
+      if (e.isDirectory()) {
+        if (['documents', 'charts', 'wrapped-subagent-transcripts'].includes(e.name)) continue;
+        await recurse(full);
+      } else if (e.isFile() && ((e.name.endsWith('.md') && !e.name.endsWith('.pandoc.md')) || JSON_REPORT_FILENAMES.has(e.name))) {
+        out.push(full);
+      }
+    }
+  }
+  await recurse(dir);
+  return out;
+}
+
+async function main() {
+  const exists = await fs.access(sessionDir).then(() => true, () => false);
+  if (!exists) {
+    console.error(`Session dir not found: ${sessionDir}`);
+    process.exit(1);
+  }
+
+  const pool = new pg.Pool({ connectionString: process.env.PG_CONNECTION_STRING });
+
+  // ── 0. Resolve sessions.id ──
+  const sessRes = await pool.query(
+    `SELECT id, status FROM sessions WHERE session_key = $1`,
+    [sessionKey],
+  );
+  if (sessRes.rows.length === 0) {
+    console.error(`Session ${sessionKey} not found in DB — create the sessions row first (e.g. via a live run intake) or insert manually.`);
+    process.exit(1);
+  }
+  const sessionId = sessRes.rows[0].id;
+  console.log(`[Backfill] Session ${sessionKey} → id=${sessionId} (current status=${sessRes.rows[0].status})`);
+
+  // ── 1. Read session-summary.json ──
+  let summary = {};
+  try {
+    summary = JSON.parse(await fs.readFile(path.join(sessionDir, 'session-summary.json'), 'utf8'));
+  } catch (err) {
+    console.warn(`[Backfill] No session-summary.json (${err.message}) — counters skipped`);
+  }
+
+  // ── 2. UPDATE sessions ──
+  const finalMemoPath = path.join(sessionDir, 'final-memorandum.md');
+  let finalWordCount = null;
+  let sectionCount = null;
+  try {
+    const text = await fs.readFile(finalMemoPath, 'utf8');
+    finalWordCount = text.split(/\s+/).filter(Boolean).length;
+  } catch { /* skip */ }
+  try {
+    const sectionDir = path.join(sessionDir, 'section-reports');
+    const entries = await fs.readdir(sectionDir);
+    sectionCount = entries.filter(f => f.endsWith('.md') && !f.endsWith('.pandoc.md')).length;
+  } catch { /* skip */ }
+
+  if (!dryRun) {
+    await pool.query(
+      `UPDATE sessions
+          SET status = 'completed',
+              sdk_session_id = COALESCE($2, sdk_session_id),
+              word_count = COALESCE($3, word_count),
+              section_count = COALESCE($4, section_count),
+              metadata = metadata || $5::jsonb,
+              updated_at = NOW()
+        WHERE id = $1`,
+      [
+        sessionId,
+        summary.sdk_session_id || null,
+        finalWordCount,
+        sectionCount,
+        JSON.stringify({
+          backfilled_from_local: true,
+          backfilled_at: new Date().toISOString(),
+          local_summary: summary,
+        }),
+      ],
+    );
+  }
+  console.log(`[Backfill] sessions UPDATE: status=completed, word_count=${finalWordCount}, section_count=${sectionCount}`);
+
+  // ── 3. INSERT reports ──
+  const mdFiles = await walkMarkdown(sessionDir);
+  console.log(`[Backfill] Found ${mdFiles.length} report files (markdown + allowlisted JSON)`);
+  let reportsInserted = 0;
+  for (const file of mdFiles) {
+    const content = await fs.readFile(file, 'utf8');
+    const reportType = extractReportType(file);
+    const reportKey = extractReportKey(file);
+    const wordCount = content.split(/\s+/).filter(Boolean).length;
+    const contentHash = createHash('sha256').update(content).digest('hex');
+    if (dryRun) {
+      console.log(`  [dry] ${reportType}/${reportKey} (${wordCount}w)`);
+      continue;
+    }
+    await pool.query(
+      `INSERT INTO reports (session_id, report_type, report_key, content,
+                            content_hash, word_count, file_path)
+       VALUES ($1, $2, $3, $4, $5, $6, $7)
+       ON CONFLICT (session_id, report_type, report_key)
+       DO UPDATE SET content = EXCLUDED.content,
+                     content_hash = EXCLUDED.content_hash,
+                     word_count = EXCLUDED.word_count,
+                     file_path = EXCLUDED.file_path,
+                     updated_at = NOW()`,
+      [sessionId, reportType, reportKey, content, contentHash, wordCount, file],
+    );
+    reportsInserted++;
+  }
+  console.log(`[Backfill] reports upserted: ${reportsInserted}`);
+
+  // ── 4. INSERT agent_states from *-state.json / *-state-*.json ──
+  let stateFiles = [];
+  try {
+    const rootEntries = await fs.readdir(sessionDir);
+    for (const e of rootEntries) {
+      if (e.endsWith('-state.json') || (/-state-.*\.json$/.test(e))) {
+        stateFiles.push(path.join(sessionDir, e));
+      }
+    }
+    // qa-outputs and review-outputs may also hold state files (e.g.
+    // fact-validator-state.json, coverage-gap-analyzer-state.json live in
+    // review-outputs per STATE_FILE_DIR_MAP) — scan both subdirs.
+    for (const sub of ['qa-outputs', 'review-outputs']) {
+      const subEntries = await fs.readdir(path.join(sessionDir, sub)).catch(() => []);
+      for (const e of subEntries) {
+        if (e.endsWith('-state.json')) stateFiles.push(path.join(sessionDir, sub, e));
+      }
+    }
+  } catch { /* none */ }
+
+  let statesInserted = 0;
+  for (const file of stateFiles) {
+    try {
+      const raw = await fs.readFile(file, 'utf8');
+      const data = JSON.parse(raw);
+      const stateKey = path.basename(file, '.json');
+      const agentType = extractAgentType(stateKey);
+      if (dryRun) {
+        console.log(`  [dry] state ${agentType}/${stateKey}`);
+        continue;
+      }
+      await pool.query(
+        `INSERT INTO agent_states (session_id, agent_type, state_key, state_data, file_path)
+         VALUES ($1, $2, $3, $4, $5)
+         ON CONFLICT (session_id, agent_type, state_key)
+         DO UPDATE SET state_data = EXCLUDED.state_data,
+                       file_path = EXCLUDED.file_path,
+                       updated_at = NOW()`,
+        [sessionId, agentType, stateKey, JSON.stringify(data), file],
+      );
+      statesInserted++;
+    } catch (err) {
+      console.warn(`  [warn] state file ${file}: ${err.message}`);
+    }
+  }
+  console.log(`[Backfill] agent_states upserted: ${statesInserted}`);
+
+  // ── 5. INSERT session_metrics ──
+  const c = summary.counters || {};
+  const u = summary.tool_usage_breakdown || {};
+  if (!dryRun && Object.keys(c).length > 0) {
+    await pool.query(
+      `INSERT INTO session_metrics (session_id, total_events, agents_started, agents_stopped,
+                                    tool_calls, tool_failures, compactions, total_duration_ms,
+                                    gate_checks_passed, gate_checks_total)
+       VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)
+       ON CONFLICT (session_id) DO UPDATE SET
+         total_events = EXCLUDED.total_events,
+         agents_started = EXCLUDED.agents_started,
+         agents_stopped = EXCLUDED.agents_stopped,
+         tool_calls = EXCLUDED.tool_calls,
+         tool_failures = EXCLUDED.tool_failures,
+         compactions = EXCLUDED.compactions,
+         total_duration_ms = EXCLUDED.total_duration_ms,
+         gate_checks_passed = EXCLUDED.gate_checks_passed,
+         gate_checks_total = EXCLUDED.gate_checks_total`,
+      [
+        sessionId,
+        (u.totalToolCalls || 0) + (c.subagentsStarted || 0) * 2,
+        c.subagentsStarted || 0,
+        c.subagentsStopped || 0,
+        u.totalToolCalls || 0,
+        c.toolFailures || 0,
+        c.compactions || 0,
+        summary.duration_ms || null,
+        c.gateChecks?.passed || 0,
+        (c.gateChecks?.passed || 0) + (c.gateChecks?.failed || 0),
+      ],
+    );
+    console.log(`[Backfill] session_metrics upserted`);
+  }
+
+  // ── 6. INSERT charts as report_artifacts (PDFs/DOCX skipped — rebuild-artifacts will regenerate) ──
+  const chartsDir = path.join(sessionDir, 'charts');
+  let chartsInserted = 0;
+  try {
+    const chartFiles = await fs.readdir(chartsDir);
+    for (const cf of chartFiles) {
+      if (!cf.endsWith('.png')) continue;
+      const chartPath = path.join(chartsDir, cf);
+      const data = await fs.readFile(chartPath);
+      const filePath = `charts/${cf}`;
+      if (dryRun) {
+        console.log(`  [dry] chart ${cf} (${data.length} bytes)`);
+        continue;
+      }
+      await pool.query(
+        `INSERT INTO report_artifacts (session_id, file_name, file_path, category, mime_type, file_size, file_data, source)
+         VALUES ($1, $2, $3, 'chart', 'image/png', $4, $5, 'local_backfill')
+         ON CONFLICT (session_id, file_path) DO UPDATE SET
+           file_data = EXCLUDED.file_data,
+           file_size = EXCLUDED.file_size`,
+        [sessionId, cf, filePath, data.length, data],
+      );
+      chartsInserted++;
+    }
+  } catch (err) {
+    console.warn(`[Backfill] charts dir issue: ${err.message}`);
+  }
+  console.log(`[Backfill] charts upserted: ${chartsInserted}`);
+
+  await pool.end();
+  console.log(`\n[Backfill] DONE${dryRun ? ' (dry run)' : ''}.`);
+  if (!dryRun) {
+    console.log(`\nNext steps (admin endpoints, require admin JWT):`);
+    console.log(`  curl -X POST https://<host>/api/admin/sessions/${sessionKey}/rebuild-artifacts -H "Authorization: Bearer $TOKEN"`);
+    console.log(`  curl -X POST https://<host>/api/admin/sessions/${sessionKey}/rebuild-kg        -H "Authorization: Bearer $TOKEN"`);
+  }
+}
+
+main().catch(err => {
+  console.error(err);
+  process.exit(1);
+});

From c64bf5ce84e16f9b4dbe9322157173283a2dd451 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:09:15 -0400
Subject: [PATCH 07/11] chore(scripts): add restore-unpersisted-session
 recovery tool

One-command recovery for a completed local session directory that never got a
sessions row: bootstraps the row (idempotent), derives transaction_name from
banker-deal-context.json, delegates to backfill-local-to-db.mjs, and optionally
fires the admin rebuild endpoints. Closes the gap that backfill-local-to-db.mjs
aborts when no sessions row exists.
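
The transaction_name fallback order can be sketched as below. The function body
mirrors deriveTransactionName in this patch; the sample company and project
names are hypothetical placeholders, not real session data.

```javascript
// Fallback chain: "acquirer / target" > target only > summary.transaction_name > null.
// Accepts either a nested { deal: {...} } context or a flat one.
function deriveTransactionName(deal, summary) {
  const d = deal?.deal || deal || {};
  if (d.target && d.acquirer) return `${d.acquirer} / ${d.target}`;
  if (d.target) return d.target;
  return summary?.transaction_name || null;
}

// Hypothetical inputs illustrating each branch.
const nested = deriveTransactionName({ deal: { acquirer: 'AcquirerCo', target: 'TargetCo' } }, null);
const flat = deriveTransactionName({ target: 'TargetCo' }, null);
const fromSummary = deriveTransactionName(null, { transaction_name: 'Project Example' });
const none = deriveTransactionName(null, null);
```

A null result leaves transaction_name unset on insert; on a pre-existing row the
script only fills the column when it is currently NULL.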

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../scripts/restore-unpersisted-session.mjs   | 137 ++++++++++++++++++
 1 file changed, 137 insertions(+)
 create mode 100644 super-legal-mcp-refactored/scripts/restore-unpersisted-session.mjs

diff --git a/super-legal-mcp-refactored/scripts/restore-unpersisted-session.mjs b/super-legal-mcp-refactored/scripts/restore-unpersisted-session.mjs
new file mode 100644
index 000000000..e8003a4b9
--- /dev/null
+++ b/super-legal-mcp-refactored/scripts/restore-unpersisted-session.mjs
@@ -0,0 +1,137 @@
+#!/usr/bin/env node
+/**
+ * Restore Unpersisted Session — One-command recovery for a completed local
+ * session directory that never got a `sessions` row in PostgreSQL.
+ *
+ * Why this exists:
+ *   `backfill-local-to-db.mjs` is an UPSERT keyed on an EXISTING sessions row —
+ *   it aborts at step 0 if the row is missing. A session that died before any
+ *   live persistence has no row, so the backfill can never run. This wrapper
+ *   bootstraps the row (idempotent), then delegates to the existing, tested
+ *   backfill, then optionally fires the admin rebuilds.
+ *
+ * Pipeline:
+ *   1. INSERT sessions row (session_key + status + sdk_session_id + transaction_name
+ *      derived from banker-deal-context.json/session-summary.json) — ON CONFLICT DO NOTHING
+ *   2. exec scripts/backfill-local-to-db.mjs <key> --dir=<dir> [--dry-run]
+ *   3. (optional) POST rebuild-artifacts + rebuild-kg when --host and --token given
+ *
+ * Usage:
+ *   node scripts/restore-unpersisted-session.mjs <session-key> [--dir=/abs/path] [--dry-run]
+ *       [--host=https://staging.example] [--token=<admin-jwt>]
+ *
+ * Idempotent: re-running re-upserts reports/states/charts and leaves the row intact.
+ */
+
+import 'dotenv/config';
+import pg from 'pg';
+import { promises as fs } from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
+import { execFileSync } from 'child_process';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+const args = process.argv.slice(2);
+const sessionKey = args.find(a => !a.startsWith('--'));
+const dryRun = args.includes('--dry-run');
+const dirArg = args.find(a => a.startsWith('--dir='))?.slice('--dir='.length);
+const host = args.find(a => a.startsWith('--host='))?.slice('--host='.length);
+const token = args.find(a => a.startsWith('--token='))?.slice('--token='.length);
+
+if (!sessionKey || !/^\d{4}-\d{2}-\d{2}-\d+$/.test(sessionKey)) {
+  console.error('Usage: node scripts/restore-unpersisted-session.mjs <session-key> [--dir=/abs/path] [--dry-run] [--host=...] [--token=...]');
+  process.exit(1);
+}
+
+const sessionDir = dirArg
+  ? path.resolve(dirArg)
+  : path.resolve(__dirname, '..', 'reports', sessionKey);
+
+async function readJson(file) {
+  try { return JSON.parse(await fs.readFile(file, 'utf8')); }
+  catch { return null; }
+}
+
+/** Build a human-readable transaction_name from whatever the session left behind. */
+function deriveTransactionName(deal, summary) {
+  const d = deal?.deal || deal || {};
+  if (d.target && d.acquirer) return `${d.acquirer} / ${d.target}`;
+  if (d.target) return d.target;
+  return summary?.transaction_name || null;
+}
+
+async function main() {
+  if (!(await fs.access(sessionDir).then(() => true, () => false))) {
+    console.error(`Session dir not found: ${sessionDir}`);
+    process.exit(1);
+  }
+  if (!process.env.PG_CONNECTION_STRING) {
+    console.error('PG_CONNECTION_STRING not set (.env) — cannot reach the database.');
+    process.exit(1);
+  }
+
+  const summary = await readJson(path.join(sessionDir, 'session-summary.json'));
+  const deal = await readJson(path.join(sessionDir, 'banker-deal-context.json'));
+  const transactionName = deriveTransactionName(deal, summary);
+  const sdkSessionId = summary?.sdk_session_id || null;
+
+  // ── 1. Bootstrap the sessions row (idempotent) ──
+  const pool = new pg.Pool({ connectionString: process.env.PG_CONNECTION_STRING });
+  const existing = await pool.query('SELECT id, status, transaction_name FROM sessions WHERE session_key = $1', [sessionKey]);
+  if (existing.rows.length > 0) {
+    const row = existing.rows[0];
+    console.log(`[Restore] sessions row already present (id=${row.id}, status=${row.status}) — skipping bootstrap`);
+    // The row pre-exists (e.g. an errored live run), so the INSERT was skipped and
+    // transaction_name was never set. Backfill's UPDATE doesn't touch it either, so
+    // fill it here — non-destructively, only when currently NULL.
+    if (transactionName && !row.transaction_name) {
+      if (dryRun) {
+        console.log(`[Restore] [dry] would set transaction_name=${JSON.stringify(transactionName)} on existing row (currently NULL)`);
+      } else {
+        await pool.query(
+          `UPDATE sessions SET transaction_name = COALESCE(transaction_name, $2), updated_at = NOW()
+           WHERE id = $1 AND transaction_name IS NULL`,
+          [row.id, transactionName],
+        );
+        console.log(`[Restore] transaction_name set to ${JSON.stringify(transactionName)} on existing row`);
+      }
+    }
+  } else if (dryRun) {
+    console.log(`[Restore] [dry] would INSERT sessions row: key=${sessionKey}, transaction_name=${JSON.stringify(transactionName)}, sdk_session_id=${sdkSessionId}`);
+  } else {
+    const ins = await pool.query(
+      `INSERT INTO sessions (session_key, status, sdk_session_id, transaction_name, metadata)
+       VALUES ($1, 'in_progress', $2, $3, $4::jsonb)
+       ON CONFLICT (session_key) DO NOTHING
+       RETURNING id`,
+      [sessionKey, sdkSessionId, transactionName,
+       JSON.stringify({ bootstrapped_by: 'restore-unpersisted-session', bootstrapped_at: new Date().toISOString() })],
+    );
+    console.log(`[Restore] sessions row bootstrapped (id=${ins.rows[0]?.id}, transaction_name=${JSON.stringify(transactionName)})`);
+  }
+  await pool.end();
+
+  // ── 2. Delegate to the existing, tested backfill ──
+  const backfill = path.resolve(__dirname, 'backfill-local-to-db.mjs');
+  const backfillArgs = [backfill, sessionKey, `--dir=${sessionDir}`, ...(dryRun ? ['--dry-run'] : [])];
+  console.log(`\n[Restore] → node ${backfillArgs.join(' ')}\n`);
+  execFileSync(process.execPath, backfillArgs, { stdio: 'inherit' });
+
+  // ── 3. Optional admin rebuilds ──
+  if (host && token && !dryRun) {
+    for (const ep of ['rebuild-artifacts', 'rebuild-kg']) {
+      const url = `${host.replace(/\/$/, '')}/api/admin/sessions/${sessionKey}/${ep}`;
+      console.log(`\n[Restore] POST ${url}`);
+      execFileSync('curl', ['-fsS', '-X', 'POST', url, '-H', `Authorization: Bearer ${token}`], { stdio: 'inherit' });
+    }
+  } else if (!dryRun) {
+    console.log(`\n[Restore] Skipped admin rebuilds (pass --host and --token to run them). Manual:`);
+    console.log(`  curl -X POST <host>/api/admin/sessions/${sessionKey}/rebuild-artifacts -H "Authorization: Bearer $TOKEN"`);
+    console.log(`  curl -X POST <host>/api/admin/sessions/${sessionKey}/rebuild-kg        -H "Authorization: Bearer $TOKEN"`);
+  }
+
+  console.log(`\n[Restore] DONE${dryRun ? ' (dry run — no writes)' : ''}.`);
+}
+
+main().catch(err => { console.error(err); process.exit(1); });

From 03b658d079865a1545f96d8dc6fb9b0e4ec7ed06 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:09:15 -0400
Subject: [PATCH 08/11] chore(scripts): add rebuild-kg-local recovery tool

Local, faithful reproduction of POST /api/admin/sessions/:key/rebuild-kg for
when no server / admin JWT is available: entity-synthesis + citation-synthesis
pre-steps then buildSessionKnowledgeGraph, honoring BANKER_QA_OUTPUT for the
banker KG phases. Upsert-only (mirrors the endpoint); --clean is opt-in.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../scripts/rebuild-kg-local.mjs              | 124 ++++++++++++++++++
 1 file changed, 124 insertions(+)
 create mode 100644 super-legal-mcp-refactored/scripts/rebuild-kg-local.mjs

diff --git a/super-legal-mcp-refactored/scripts/rebuild-kg-local.mjs b/super-legal-mcp-refactored/scripts/rebuild-kg-local.mjs
new file mode 100644
index 000000000..dcb83842e
--- /dev/null
+++ b/super-legal-mcp-refactored/scripts/rebuild-kg-local.mjs
@@ -0,0 +1,124 @@
+#!/usr/bin/env node
+/**
+ * Rebuild KG (local) — Faithful local reproduction of the
+ * POST /api/admin/sessions/:key/rebuild-kg endpoint (adminRouter.js), for when
+ * no server / admin JWT is available. Same three steps, same fail-soft order:
+ *   1. entity-synthesis pre-step  (synthesize entities.json if absent)
+ *   2. citation-synthesis pre-step (rebuild consolidated-footnotes if truncated)
+ *   3. buildSessionKnowledgeGraph  (nodes + edges; banker phases 1b/1c when
+ *      BANKER_QA_OUTPUT=true)
+ *
+ * Mirrors the endpoint's UPSERT semantics: by default it does NOT delete existing
+ * KG rows (pass --clean to clear them first). Set the flag so banker nodes fire:
+ *   BANKER_QA_OUTPUT=true node scripts/rebuild-kg-local.mjs <session-key> [--clean]
+ */
+
+import 'dotenv/config';
+import pg from 'pg';
+
+const sessionKey = process.argv.find(a => /^\d{4}-\d{2}-\d{2}-\d+$/.test(a));
+const clean = process.argv.includes('--clean');
+if (!sessionKey) {
+  console.error('Usage: BANKER_QA_OUTPUT=true node scripts/rebuild-kg-local.mjs <session-key> [--clean]');
+  console.error('  --clean: delete this session\'s derived KG (kg_nodes/edges/evolution/provenance,');
+  console.error('           NOT kg_messages user-chat) before rebuilding, for a pristine graph.');
+  process.exit(1);
+}
+
+async function main() {
+  const pool = new pg.Pool({ connectionString: process.env.PG_CONNECTION_STRING });
+
+  const { featureFlags } = await import('../src/config/featureFlags.js');
+  console.log(`[KG] BANKER_QA_OUTPUT = ${featureFlags.BANKER_QA_OUTPUT}`);
+
+  const sess = await pool.query('SELECT id FROM sessions WHERE session_key=$1', [sessionKey]);
+  if (sess.rows.length === 0) { console.error('Session not found'); process.exit(1); }
+  const sessionId = sess.rows[0].id;
+  console.log(`[KG] session ${sessionKey} → ${sessionId}`);
+
+  const before = await pool.query(
+    'SELECT (SELECT count(*) FROM kg_nodes WHERE session_id=$1) nodes, (SELECT count(*) FROM kg_edges WHERE session_id=$1) edges',
+    [sessionId]);
+  console.log(`[KG] before: nodes=${before.rows[0].nodes} edges=${before.rows[0].edges}`);
+
+  // ── 0. optional clean: clear session-scoped derived KG (NOT kg_messages) ──
+  if (clean) {
+    // kg_edges cascades from kg_nodes, but delete it explicitly, before kg_nodes, for clarity.
+    // kg_evolution/kg_provenance only SET NULL on node/edge delete, so their rows
+    // would linger — delete them by session_id to avoid stale-build accumulation.
+    for (const tbl of ['kg_provenance', 'kg_evolution', 'kg_edges', 'kg_nodes']) {
+      const r = await pool.query(`DELETE FROM ${tbl} WHERE session_id = $1`, [sessionId]);
+      console.log(`[KG] --clean: deleted ${r.rowCount} from ${tbl}`);
+    }
+  }
+
+  // ── 1. entity-synthesis pre-step ──
+  let entitiesSource = 'native';
+  try {
+    const existing = await pool.query(
+      `SELECT 1 FROM report_artifacts WHERE session_id=$1 AND file_name='entities.json' LIMIT 1`, [sessionId]);
+    if (existing.rows.length === 0) {
+      const { synthesizeEntitiesJson, persistSynthesizedEntities } = await import('../src/utils/entitySynthesis.js');
+      const { payload, audit } = await synthesizeEntitiesJson(pool, sessionId, sessionKey);
+      if (payload.entities.length > 0) {
+        await persistSynthesizedEntities(pool, sessionId, payload);
+        entitiesSource = 'synthesized';
+        console.log(`[KG] entities synthesized: ${payload.entities.length} (T1=${audit.tier1_count} T2=${audit.tier2_count} T3=${audit.tier3_count} T4=${audit.tier4_count})`);
+      } else {
+        console.warn('[KG] entity synthesis yielded 0 — legacy fallback will apply');
+      }
+    } else {
+      console.log('[KG] entities.json already present — skipping synthesis');
+    }
+  } catch (e) { console.warn(`[KG] entity synthesis failed (continuing): ${e.message}`); }
+
+  // ── 2. citation-synthesis pre-step ──
+  let citationsSource = 'native';
+  try {
+    const cf = await pool.query(`SELECT content FROM reports WHERE session_id=$1 AND report_key='consolidated-footnotes'`, [sessionId]);
+    const cfContent = cf.rows[0]?.content || '';
+    const {
+      countFootnotesAcrossSectionFiles, isConsolidatedFootnotesTruncated,
+      synthesizeConsolidatedFootnotes, persistConsolidatedFootnotes,
+    } = await import('../src/utils/citationSynthesis.js');
+    const sectionFnCount = await countFootnotesAcrossSectionFiles(pool, sessionId);
+    if (isConsolidatedFootnotesTruncated(cfContent, sectionFnCount)) {
+      const { markdown, citationMapMd, audit } = await synthesizeConsolidatedFootnotes(pool, sessionId, sessionKey);
+      if (audit.total_footnotes >= 50) {
+        await persistConsolidatedFootnotes(pool, sessionId, sessionKey, markdown, citationMapMd);
+        citationsSource = 'synthesized';
+        console.log(`[KG] citations synthesized: ${audit.total_footnotes} from ${audit.source_section_count} sections`);
+      } else {
+        console.warn(`[KG] citation synthesis yielded ${audit.total_footnotes} (<50) — keeping existing`);
+      }
+    } else {
+      console.log(`[KG] consolidated-footnotes healthy (section fn count=${sectionFnCount}) — skipping`);
+    }
+  } catch (e) { console.warn(`[KG] citation synthesis failed (continuing): ${e.message}`); }
+
+  // ── 3. build KG ──
+  const { buildSessionKnowledgeGraph } = await import('../src/utils/knowledgeGraphExtractor.js');
+  const result = await buildSessionKnowledgeGraph(pool, sessionId, sessionKey);
+  console.log(`[KG] build result:`, JSON.stringify(result));
+
+  const after = await pool.query(
+    'SELECT (SELECT count(*) FROM kg_nodes WHERE session_id=$1) nodes, (SELECT count(*) FROM kg_edges WHERE session_id=$1) edges',
+    [sessionId]);
+  console.log(`[KG] after: nodes=${after.rows[0].nodes} edges=${after.rows[0].edges}`);
+
+  // node-type + banker breakdown
+  const types = await pool.query(
+    `SELECT node_type, count(*) FROM kg_nodes WHERE session_id=$1 GROUP BY node_type ORDER BY 2 DESC`, [sessionId]);
+  console.log('[KG] node types:'); console.table(types.rows);
+  const banker = await pool.query(
+    `SELECT count(*) FROM kg_nodes WHERE session_id=$1 AND (node_type ILIKE '%question%' OR canonical_key ILIKE '%bq%' OR canonical_key ILIKE '%question%' OR properties::text ILIKE '%banker%')`, [sessionId]);
+  console.log(`[KG] banker-related nodes: ${banker.rows[0].count}`);
+  const edgeTypes = await pool.query(
+    `SELECT edge_type, count(*) FROM kg_edges WHERE session_id=$1 GROUP BY edge_type ORDER BY 2 DESC`, [sessionId]);
+  console.log('[KG] edge types:'); console.table(edgeTypes.rows);
+
+  console.log(`\n[KG] DONE — entities=${entitiesSource}, citations=${citationsSource}`);
+  await pool.end();
+}
+
+main().catch(e => { console.error(e); process.exit(1); });

From 968b874148b1c6788016a0cdf80501c73739e684 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:16:03 -0400
Subject: [PATCH 09/11] fix(kg): derive risk exposure_amounts/probability
 structurally (audit #1/#2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adversarial review found the node loop re-regexed the rendered block, so
mitigation $-figures leaked into exposure_amounts (R-ANT-001 captured the
$1,237,262,000 RRTF) and the title's % masqueraded as probability (65-80%
instead of the 8% fail prob; 53.8%→8% decimal truncation).

- buildRiskBlocksFromJson now emits structured exposureAmounts (from exposure
  bits only) and probability (first %-token of the probability field, else null).
- The node loop prefers these; markdown Path B (no structured fields) still
  falls back to whole-block regex — unchanged.
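
A minimal sketch of the difference, assuming a simplified finding shape (the
field names `exposure`, `probability`, `mitigation` and the title text are
illustrative, not the real schema). The two regexes are the ones used in
kgPhases6to8.js.

```javascript
const AMOUNT_RE = /\$[\d,.]+[BMK]?/g;
const PROB_RE = /(\d{1,3})[\-–]?(\d{1,3})?%/;

// Structured path: read only the exposure and probability fields.
function extractStructured(finding) {
  const exposureBits = [finding.exposure].filter(Boolean);
  const probStr = finding.probability || '';
  return {
    exposureAmounts: (exposureBits.join(' ').match(AMOUNT_RE) || []).slice(0, 5),
    probability: (probStr.match(PROB_RE) || [])[0] || null,
  };
}

// Legacy path: regex over the whole rendered block (still used for markdown Path B).
function extractFromBlock(block) {
  return {
    exposureAmounts: (block.match(AMOUNT_RE) || []).slice(0, 5),
    probability: (block.match(PROB_RE) || [])[0] ?? null,
  };
}

// Hypothetical finding shaped after the R-ANT-001 symptoms described above.
const finding = {
  title: 'Antitrust clearance (65-80% approval odds)',
  exposure: '$433.75M',
  probability: '8% fail',
  mitigation: 'Negotiate a $1,237,262,000 RRTF.',
};
const block = [
  finding.title,
  `Exposure: ${finding.exposure}`,
  `Probability: ${finding.probability}`,
  `Mitigation: ${finding.mitigation}`,
].join('\n');

const structured = extractStructured(finding); // exposure field only, probability '8%'
const legacy = extractFromBlock(block);        // also picks up the RRTF figure and '65-80%'
```

In this sketch the legacy path reports the title's percentage range and the
mitigation figure, while the structured path keeps only the exposure and the
fail probability.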

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../src/utils/knowledgeGraph/kgPhases6to8.js  | 22 ++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
index 684591e71..3af7cc444 100644
--- a/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
+++ b/super-legal-mcp-refactored/src/utils/knowledgeGraph/kgPhases6to8.js
@@ -414,7 +414,13 @@ export function buildRiskBlocksFromJson(content) {
         finding.notes ? `Notes: ${finding.notes}` : '',
         finding.correlation_note ? `Correlation: ${finding.correlation_note}` : '',
       ].filter(Boolean).join('\n');
-      blocks.push({ title: `${fid ? fid + ': ' : ''}${title}`, block: synthBlock });
+      // Structured extraction so the downstream node loop need NOT re-regex the
+      // rendered block (which would pull "$" figures out of mitigation prose and
+      // a "%" out of the title). exposureAmounts come ONLY from the exposure bits;
+      // probability is the first %-token of the probability field, else null (#231).
+      const exposureAmounts = (exposureBits.join(' ').match(/\$[\d,.]+[BMK]?/g) || []).slice(0, 5);
+      const probability = (probStr.match(/(\d{1,3})[\-–]?(\d{1,3})?%/) || [])[0] || null;
+      blocks.push({ title: `${fid ? fid + ': ' : ''}${title}`, block: synthBlock, exposureAmounts, probability });
     }
   }
   return blocks;
@@ -459,9 +465,15 @@ async function phase7_riskAndFacts(pool, sessionId, evolutionLog, resolver, tNum
     }
 
     // Unified node-creation loop (consumes both JSON-synthesized and markdown-extracted blocks)
-    for (const { title, block } of riskBlocks) {
-      const amounts = block.match(/\$[\d,.]+[BMK]?/g) || [];
-      const probs = block.match(/(\d{1,3})[\-–]?(\d{1,3})?%/);
+    for (const riskBlock of riskBlocks) {
+      const { title, block } = riskBlock;
+      // Prefer structured values from the JSON parser (exposureAmounts/probability)
+      // so mitigation "$" figures and title "%"s never contaminate them. Markdown
+      // Path B blocks carry neither field → fall back to whole-block regex (#231).
+      const amounts = riskBlock.exposureAmounts ?? (block.match(/\$[\d,.]+[BMK]?/g) || []);
+      const probValue = riskBlock.probability !== undefined
+        ? riskBlock.probability
+        : ((block.match(/(\d{1,3})[\-–]?(\d{1,3})?%/) || [])[0] ?? null);
       const mitigation = block.match(/(?:mitigat|recommend|address|escrow|protect|hedge|covenant)[^.]*\.[^.]*\./i);
       const consequence = block.match(/(?:consequence|impact|result|exposure|cost|loss|failure)[:\s]*([^.]+\.[^.]*\.)/i);
       const entities = block.match(/\b(?:SoftBank|ADIA|DigitalBridge|DataBank|Switch|Marc Ganzi|Vantage|Vertical Bridge|Zayo|CFIUS|FCC|IRS|SEC)\b/gi);
@@ -472,7 +484,7 @@ async function phase7_riskAndFacts(pool, sessionId, evolutionLog, resolver, tNum
         canonical_key: `risk:${title.slice(0, 80).toLowerCase().replace(/[^a-z0-9]+/g, '-')}`,
         properties: {
           exposure_amounts: amounts.slice(0, 5),
-          probability: probs ? probs[0] : null,
+          probability: probValue,
           mitigation: mitigation ? mitigation[0].trim().slice(0, 400) : null,
           consequence: consequence ? consequence[1]?.trim().slice(0, 400) || consequence[0]?.trim().slice(0, 400) : null,
           entities_involved: entities ? [...new Set(entities.map(e => e.trim()))] : [],

From d572da4af169567dbe9a1469c8c531f5b6bd69c6 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:16:03 -0400
Subject: [PATCH 10/11] test(kg): real-shape regression tests for
 exposure/probability extraction
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Pins audit findings #1/#2 with faithful R-ANT-001 shapes: the mitigation RRTF
figure must not enter exposureAmounts; probability comes from the probability
field, not a title %; non-quantified probability → null; legacy numeric path
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
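Note: the behaviour these tests pin can be reproduced with a small standalone
sketch. The two regexes are copied from the Phase 7 hunk in this series;
`extractStructured` and `pickValues` are hypothetical helper names used only
for illustration (the real logic lives in `buildRiskBlocksFromJson()` and the
node-creation loop), and the order in which exposure fields are joined is an
assumption inferred from the first test's expected array.

```javascript
// Regexes as they appear in the Phase 7 diff.
const EXPOSURE_RE = /\$[\d,.]+[BMK]?/g;
const PROBABILITY_RE = /(\d{1,3})[\-–]?(\d{1,3})?%/;

// Hypothetical helper: structured extraction from one current-schema finding.
// Only the exposure fields feed exposureAmounts; mitigation prose is never read.
function extractStructured(finding) {
  // Assumed field order: weighted_exposure, exposure_low, exposure_high.
  const exposureBits = [finding.weighted_exposure, finding.exposure_low, finding.exposure_high]
    .filter((v) => typeof v === 'string');
  const exposureAmounts = (exposureBits.join(' ').match(EXPOSURE_RE) || []).slice(0, 5);
  const probStr = typeof finding.probability === 'string' ? finding.probability : '';
  // First %-token of the probability field, else null.
  const probability = (probStr.match(PROBABILITY_RE) || [])[0] || null;
  return { exposureAmounts, probability };
}

// Hypothetical helper mirroring the node loop: prefer structured values and
// fall back to whole-block regex only when the fields are absent (Markdown path).
function pickValues(riskBlock) {
  const amounts = riskBlock.exposureAmounts ?? (riskBlock.block.match(EXPOSURE_RE) || []);
  // `!== undefined` rather than `??`: a structured null means "not quantified"
  // and must not trigger the whole-block fallback.
  const probability = riskBlock.probability !== undefined
    ? riskBlock.probability
    : ((riskBlock.block.match(PROBABILITY_RE) || [])[0] ?? null);
  return { amounts, probability };
}

const structured = extractStructured({
  probability: '8% fail / 65–80% Second Request',
  weighted_exposure: '$433.75M (break EL, marked) / $621.71M (headline)',
  exposure_low: '$0 (clears)',
  exposure_high: '$1.237B RRTF',
  mitigation: '$1,237,262,000 regulatory reverse-termination fee (Fox pays).',
});
console.log(structured.exposureAmounts); // [ '$433.75M', '$621.71M', '$0', '$1.237B' ]
console.log(structured.probability);     // 8%
```

The `!== undefined` check is the part worth noticing: with `??`, a finding whose
probability field is non-quantified (structured `null`) would fall through to
the whole-block regex and could pick up "8%" from a title such as "53.8% of
premium".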
 .../test/sdk/kg-phase7-risk-parser.test.js    | 60 +++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js b/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js
index 880d1bf65..913d34893 100644
--- a/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js
+++ b/super-legal-mcp-refactored/test/sdk/kg-phase7-risk-parser.test.js
@@ -135,3 +135,63 @@ test('exposure_by_category takes precedence over legacy keys when both present',
   assert.equal(blocks.length, 1);
   assert.match(blocks[0].title, /^N-1: /);
 });
+
+// ── Regression anchors for adversarial-audit findings #1 and #2 (real-data shapes) ──
+
+test('structured exposureAmounts exclude mitigation $-figures (audit #1)', () => {
+  // Faithful R-ANT-001 replica: mitigation prose carries the $1,237,262,000 RRTF;
+  // it must NOT appear in exposureAmounts, even though the exposure fields yield fewer than 5 amounts (so the slice(0, 5) cap is not what excludes it).
+  const content = JSON.stringify({
+    exposure_by_category: [{ category: 'Regulatory/Antitrust', findings: [{
+      id: 'R-ANT-001',
+      finding: 'HSR Second Request probable (65–80%); H2-2027 behavioral-remedy clearance base case; ~8% deal-fail.',
+      severity: 'HIGH',
+      probability: '8% fail / 65–80% Second Request',
+      weighted_exposure: '$433.75M (break EL, marked) / $621.71M (headline)',
+      exposure_low: '$0 (clears)',
+      exposure_high: '$1.237B RRTF',
+      mitigation: '$1,237,262,000 regulatory reverse-termination fee (Fox pays).',
+    }] }],
+  });
+  const block = buildRiskBlocksFromJson(content)[0];
+  assert.deepEqual(block.exposureAmounts, ['$433.75M', '$621.71M', '$0', '$1.237B'],
+    'exposureAmounts derive only from exposure fields');
+  assert.ok(!block.exposureAmounts.includes('$1,237,262,000'),
+    'mitigation RRTF figure must not leak into exposureAmounts');
+});
+
+test('structured probability is the finding probability, not a title % (audit #2)', () => {
+  // finding text contains "(65–80%)"; probability field leads with "8% fail".
+  const content = JSON.stringify({
+    exposure_by_category: [{ category: 'X', findings: [{
+      id: 'R-ANT-001',
+      finding: 'HSR Second Request probable (65–80%); ~8% deal-fail base case.',
+      probability: '8% fail / 65–80% Second Request',
+      weighted_exposure: '$433.75M',
+    }] }],
+  });
+  const block = buildRiskBlocksFromJson(content)[0];
+  assert.equal(block.probability, '8%', 'probability comes from the probability field, not the title');
+});
+
+test('non-quantified probability string → null (no stray title %)', () => {
+  const content = JSON.stringify({
+    exposure_by_category: [{ category: 'X', findings: [{
+      id: 'F-SYN-002',
+      finding: 'Synergies cover only 53.8% of premium; EPS dilutive.',
+      probability: 'Base case (realized)',
+      weighted_exposure: '$139.5M',
+    }] }],
+  });
+  const block = buildRiskBlocksFromJson(content)[0];
+  assert.equal(block.probability, null, 'no % in probability field → null, not "8%" from "53.8%"');
+});
+
+test('legacy numeric probability still yields structured NN%', () => {
+  const content = JSON.stringify({
+    risk_categories: [{ category: 'F', findings: [{ id: 'F-1', finding: 'Numeric risk finding.', p50: 2.09e9, probability: 0.65 }] }],
+  });
+  const block = buildRiskBlocksFromJson(content)[0];
+  assert.equal(block.probability, '65%');
+  assert.deepEqual(block.exposureAmounts, ['$2.09B']);
+});

From 2755cff21f71f8079dcf3b839d9982277271c4a8 Mon Sep 17 00:00:00 2001
From: Number531 <120485065+Number531@users.noreply.github.com>
Date: Wed, 17 Jun 2026 11:17:26 -0400
Subject: [PATCH 11/11] docs(changelog): record KG risk-layer fix (#231)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
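Note: a minimal sketch of the scoped-allowlist idea described in the changelog
entry below. Only `risk-summary.json` is confirmed as an allowlisted basename;
the function names `shouldPersistReport` and `reportKeyFor` are hypothetical,
and the actual `JSON_REPORT_FILENAMES` contents and call sites in
`hookDBBridge.js` and `scripts/backfill-local-to-db.mjs` may differ.

```javascript
// Assumption: exact basenames only; 'risk-summary.json' is the one entry
// named in the changelog. Anything else here would be a guess.
const JSON_REPORT_FILENAMES = new Set(['risk-summary.json']);

// Portable basename without requiring the path module.
function basename(filePath) {
  return filePath.split(/[\\/]/).pop();
}

// Hypothetical gate shared by both persist paths: Markdown always persists,
// JSON persists only when its exact basename is allowlisted.
function shouldPersistReport(filePath) {
  const name = basename(filePath);
  if (name.endsWith('.md')) return true;
  return JSON_REPORT_FILENAMES.has(name);
}

// Hypothetical key derivation: strip the extension so the JSON report is
// stored under report_key=risk-summary, as the changelog states.
function reportKeyFor(filePath) {
  return basename(filePath).replace(/\.(md|json)$/, '');
}

console.log(shouldPersistReport('review-outputs/risk-summary.json')); // true
console.log(shouldPersistReport('review-outputs/entities.json'));     // false
console.log(reportKeyFor('review-outputs/risk-summary.json'));        // risk-summary
```

Matching on exact basenames rather than a `*.json` glob is what keeps
`*-state.json`, `banker-*.json`, and `entities.json` out of the `reports` table.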
 super-legal-mcp-refactored/CHANGELOG.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/super-legal-mcp-refactored/CHANGELOG.md b/super-legal-mcp-refactored/CHANGELOG.md
index 984361eaa..cc00d7593 100644
--- a/super-legal-mcp-refactored/CHANGELOG.md
+++ b/super-legal-mcp-refactored/CHANGELOG.md
@@ -4,6 +4,16 @@ All notable changes to the Super Legal MCP Server are documented in this file.
 
 ## [Unreleased]
 
+### Fixed — KG risk layer empty for current-format sessions (#231, 2026-06-17)
+
+The knowledge graph dropped the entire `risk` node layer on every current-format session. Two independent breaks caused this, and both had to be fixed before any risk node could be created; a third change improves extraction quality:
+
+- **Persistence** — the risk-aggregator emits structured `review-outputs/risk-summary.json` (`risk-aggregator.js:25`), but the live PostToolUse hook (`hookDBBridge.js`) and the local backfill (`scripts/backfill-local-to-db.mjs`) only persisted `.md`, so it never reached the `reports` table and the KG `CRITICAL_REPORTS` gate (`risk-summary`) timed out. Fixed via a scoped `JSON_REPORT_FILENAMES` allowlist (exact basenames — `*-state.json` / `banker-*.json` / `entities.json` excluded) consulted by both persist paths; `persistReport` is content-agnostic so JSON content stores cleanly as `report_type=review, report_key=risk-summary`.
+- **Parser schema drift** — Phase 7 (`kgPhases6to8.js`) keyed on the legacy `risk_categories` schema; the current producer emits `exposure_by_category` with **string** exposures (`"$433.75M"`) and **string** probability (`"8% fail"`). The JSON parser was extracted to a pure, exported `buildRiskBlocksFromJson()` (unit-tested) and extended to the current schema with the legacy numeric path preserved.
+- **Extraction quality (adversarial-audit remediation)** — the node loop previously re-regexed the rendered block, leaking mitigation `$`-figures into `exposure_amounts` (the `$1,237,262,000` RRTF) and pulling a stray title `%` as the probability. The parser now emits structured `exposureAmounts`/`probability`; the node loop prefers them, with the Markdown fallback path unchanged.
+
+Verified on the real `2026-06-16` (Fox/Roku) `risk-summary.json`: 11 risk blocks (was 0), zero mitigation-figure leaks, correct per-finding probabilities. Tests: `test/sdk/kg-phase7-risk-parser.test.js` (13 cases, legacy + current schema + the two audit regressions). Backfill also now scans `review-outputs/` for `*-state.json`. Plan + audit: `docs/pending-updates/kg-risk-layer-fix-231.md`. Recovery tooling: `scripts/restore-unpersisted-session.mjs`, `scripts/rebuild-kg-local.mjs`.
+
 ## [8.1.0] - 2026-06-08 — Forced banker intake phase + deterministic phase harness (live-validated on staging)
 
 ### Added — Forced banker intake phase + deterministic phase harness (2026-06-08)