From 4c5a7a3260a30519e4302738ac3df3cc6ac47259 Mon Sep 17 00:00:00 2001 From: Number531 <120485065+Number531@users.noreply.github.com> Date: Fri, 15 May 2026 13:28:03 -0400 Subject: [PATCH 1/2] =?UTF-8?q?feat(xlsx-renderer)!:=20manual=20render=20e?= =?UTF-8?q?ndpoint=20async-202=20(#88)=20=E2=80=94=20BREAKING=20response?= =?UTF-8?q?=20shape?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit BREAKING CHANGE: POST /api/render-workbook/:sessionId previously returned HTTP 200 with the full sync envelope { success, xlsxPath, auditResults, artifactId, durationMs }. It now returns HTTP 202 immediately with a fire-and-forget envelope: { render_id, status:'pending', session_id, template_id, created_at, status_poll_url, sse_url } The render runs in the background; caller polls GET /api/render-workbook/ :renderId/status (new endpoint) for terminal state OR subscribes to the existing session SSE channel /api/stream?sessionId=… for live progress (events of type 'xlsx_render' with status ∈ {pending, complete, failed}). WHY: synchronous endpoint held the request thread up to OVERALL_TIMEOUT_MS (1200s). Real failure modes: - Client-side timeouts kill connections mid-render (browser ~5min, undici 30s, CI/CD 60–300s). Render completed server-side but caller could not retrieve the result. - Cloud Run / nginx idle-connection timeouts (60–300s default) killed the proxy even when the render was succeeding internally. - Concurrency cap (XLSX_RENDER_CONCURRENCY=2) held HTTP connections idle for queued requests. - Backpressure invisible to callers. Architecturally async — code-execution containers are per-request fresh (v3.7.1 fix), pause_turn checkpointing makes the work inherently turn- based, streaming Messages API (Step 2 migration), parallel phase fan-out (Gate 2), SSE bridge + xlsx_renders state machine all exist already. The sync HTTP wrapper was the outlier; this refactor makes the endpoint *consume* infrastructure this branch already built rather than bypass it. THREE REFINEMENTS: 1. Persistent task state via xlsx_renders (NEW: status='pending' enum value was declared in migration 017 but never used until now) 2. Wire to existing SSE bridge (no new SSE infra) 3. Idempotency-Key header — DEFERRED (zero precedent in codebase; inconsistent with "comprehensive alignment" framing — Issue #88 refinement-3 follow-up) STATE MACHINE CLEANUP: every xlsx_renders row now inserts at 'pending' (persist.js:81 literal change). renderForSession's _renderForSessionInner transitions 'pending' → 'running' early via transitionRenderToRunning(id) (idempotent — WHERE clause makes it a no-op on retry). Reconciliation's existing WHERE render_status IN ('pending','running') predicate covers both states. Auto-trigger path (agentStreamHandler.js SessionEnd) gets the new state machine for free. VERIFIED EMPIRICALLY: grep "render_status" test/sdk/xlsx-renderer- integration.test.js → 5 sites (T10/T11/T17/T18/T19), all ORDER BY DESC LIMIT 1 asserting terminal states only. None reads middle state, none asserts 'pending' impossibility. All existing tests unaffected. FILES CHANGED: src/server/claude-sdk-server.js — refactored POST handler (sync→202+setImmediate), added GET status endpoint with cookieAuthMiddleware src/utils/xlsxRenderer/persist.js — initial status 'running' → 'pending', added transitionRenderToRunning(id) helper src/utils/xlsxRenderer/index.js — renderForSession accepts preCreatedRenderRowId opt; auto-trigger path uses fallback INSERT src/utils/sdkMetrics.js — new outcome labels 'dispatched' + 'dispatch_failed' on xlsxRenderManualCalls counter test/sdk/xlsx-renderer-integration.test.js — T27 (pending insert + idempotent transition), T28 (status query shape), T29 (preCreatedRenderRowId adoption) = +18 assertions docs/api-reference.md — NEW "Document Generation — Workbook Rendering" section with full endpoint contract docs/pending-updates/excel-code-execution-preflight.md — Step 4 curl + DB checks updated for 202 + poll docs/pending-updates/excel-code-execution-isolation-test-plan.md — Phase J curl 202 docs/pending-updates/excel-code-execution-gate2-analysis.md — #88 marked SHIPPED docs/compliance/xlsx-art17-scope.md — clarifies Art.14+Art.12 fire PRE-202 .claude/skills/post-deploy-verify/SKILL.md — V7 row updated: terminal-state failure check + 202 smoke probe .claude/skills/infrastructure-health/SKILL.md — pending count semantics distinguish normal in-flight from stuck .claude/skills/client-backup-restore/SKILL.md — "one row per render request" not "per completion" .claude/skills/session-diagnostics/SKILL.md — NEW XLSX render lifecycle section with per-state triage guidance VERIFICATION: Unit suite: 185 / 0 / 2 (was 167 pre-#88; +18 from T27/T28/T29). All node --check pass. All conflict markers / TODOs / debug code: zero. BLAST RADIUS (verified low): - Manual endpoint response shape changes — BREAKING for direct callers only (no internal code consumed the sync shape). - persist.js initial status change — touches insertXlsxRenderRow callers (index.js:153 and the new manual endpoint). Reconciliation predicates already accept 'pending'. - New status-polling endpoint — net-new, zero existing consumers. - Auto-trigger path (agentStreamHandler.js) — unchanged; gets new state machine via insertXlsxRenderRow's new default. XLSX_RENDERER stays `false` in prod throughout — this PR is behavior- neutral against current production. Issue #88 specifically protects the ops/manual workflow; auto-trigger renders (SessionEnd path) never hold an HTTP connection and were unaffected by the sync bug. Plan: /Users/ej/.claude/plans/glittery-toasting-stardust.md (Issue #88) Co-Authored-By: Claude Opus 4.7 (1M context) --- .claude/skills/client-backup-restore/SKILL.md | 8 +- .claude/skills/infrastructure-health/SKILL.md | 6 +- .claude/skills/post-deploy-verify/SKILL.md | 2 +- .claude/skills/session-diagnostics/SKILL.md | 24 ++ .../docs/api-reference.md | 96 ++++++++ .../docs/compliance/xlsx-art17-scope.md | 3 +- .../excel-code-execution-gate2-analysis.md | 2 +- ...xcel-code-execution-isolation-test-plan.md | 15 +- .../excel-code-execution-preflight.md | 38 ++- .../src/server/claude-sdk-server.js | 191 ++++++++++++--- .../src/utils/sdkMetrics.js | 13 +- .../src/utils/xlsxRenderer/index.js | 33 ++- .../src/utils/xlsxRenderer/persist.js | 47 +++- .../sdk/xlsx-renderer-integration.test.js | 217 ++++++++++++++++++ 14 files changed, 634 insertions(+), 61 deletions(-) diff --git a/.claude/skills/client-backup-restore/SKILL.md b/.claude/skills/client-backup-restore/SKILL.md index e96a32c81..859e053cb 100644 --- a/.claude/skills/client-backup-restore/SKILL.md +++ b/.claude/skills/client-backup-restore/SKILL.md @@ -198,9 +198,9 @@ gcloud sql backups restore {backup_id} \ - `citation_source_links` — citation→source bridge with confidence scores (1 row per matched citation) - `citation_verdicts` — per-footnote G5 verdicts (v6.8.6 T1, PR #122). 1 row per verified footnote; ~300-500 rows per memo session that ran citation-websearch-verifier. FK ON DELETE CASCADE on reports + sessions — backed up automatically via pg_dump; no manual handling needed. - `hook_audit_log` — now includes `bridge_metadata` JSONB column with `git_sha + sdk_version + container_id + system_prompt_hash` (regulator-replay envelope) -- **v7.x XLSX renderer (migrations 015 + 016)**: - - `human_interventions.metadata` JSONB column (added in 015) — carries Art. 17 cascade-erasure audit payloads + Art. 14 manual-override context - - `xlsx_renders` — one row per workbook render attempt (template_id, render_status, audit_results JSONB, artifact_id, xlsx_safe_flip_count). **Decision record per Art. 12 retention — preserved on GDPR erasure** (see `docs/compliance/xlsx-art17-scope.md`). +- **v7.x XLSX renderer (migrations 016 + 017 + 018)**: + - `human_interventions.metadata` JSONB column (added in 016 — was 015 pre-merge with main) — carries Art. 17 cascade-erasure audit payloads + Art. 14 manual-override context + - `xlsx_renders` — one row per workbook render **request** (template_id, render_status, audit_results JSONB, artifact_id, xlsx_safe_flip_count + 4 generated columns from migration 018: `audit_status`, `sheet_count`, `warnings_count`, `node_audit_ran`). Lifecycle: `'pending' → 'running' → 'completed'|'failed'|'built'|'reconciled_failed'`. Async-202 endpoint (Issue #88) inserts at `'pending'`; auto-trigger path identical. **Decision record per Art. 12 retention — preserved on GDPR erasure** (see `docs/compliance/xlsx-art17-scope.md`). - `xlsx_render_inputs` — junction table linking each render to its consumed `code_executions` rows Restore verification (Phase 4) should confirm these row counts post-restore for v7.0.0+ deployments: @@ -209,7 +209,7 @@ Restore verification (Phase 4) should confirm these row counts post-restore for - `SELECT COUNT(*) FROM citation_source_links` matches pre-backup count - `SELECT COUNT(*) FROM citation_verdicts` matches pre-backup count (zero rows is acceptable for sessions that ran before v6.8.6 OR that never invoked citation-websearch-verifier) - `SELECT event_data->'bridge_metadata' IS NOT NULL FROM hook_audit_log WHERE tool_name='run_python_analysis'` — bridge_metadata preserved on restore -- `SELECT COUNT(*) FROM xlsx_renders` matches pre-backup count (when `XLSX_RENDERER=true`) +- `SELECT COUNT(*) FROM xlsx_renders` matches pre-backup count (when `XLSX_RENDERER=true`). Post-restore: any `render_status IN ('pending','running')` rows will be picked up by reconciliation within `STUCK_BUILD_THRESHOLD_MIN`=60min — this is expected (renders resume from snapshot state) - `SELECT COUNT(*) FROM xlsx_render_inputs` matches pre-backup count - `SELECT COUNT(*) FROM human_interventions WHERE metadata != '{}'::jsonb` — Art. 17 / Art. 14 audit-trail metadata preserved diff --git a/.claude/skills/infrastructure-health/SKILL.md b/.claude/skills/infrastructure-health/SKILL.md index 019fb5c46..c583d78c8 100644 --- a/.claude/skills/infrastructure-health/SKILL.md +++ b/.claude/skills/infrastructure-health/SKILL.md @@ -114,9 +114,11 @@ Timestamp: | Target: ### Reconciliation: - KG: pending= stuck= | Artifacts: pending= stuck= -- XLSX renders: pending= stuck= (from `/health.reconciliation.pending_xlsx_renders` + `stuck_xlsx_renders` — only present when `XLSX_RENDERER=true`) +- XLSX renders backlog: pending= stuck= (from `/health.reconciliation.pending_xlsx_renders` + `stuck_xlsx_renders` — only present when `XLSX_RENDERER=true`) - Last scan: -- Threshold: pending_xlsx_renders > 10 sustained → WARNING (per `docs/pending-updates/excel-code-execution.md` §12.1); presence of `xlsx_renders_error` field → CRITICAL (schema_missing or query_failed bucket) +- **Normal in-flight signal** (post-Issue#88 async-202): renders move `'pending' → 'running' → 'completed'` over ~10 minutes typical. Brief `pending` spikes track manual-endpoint traffic, NOT stuck work. Threshold `pending > 10 sustained for >15 min` indicates real backlog. +- **Stuck render signal**: `stuck_xlsx_renders` (reconciliation_attempts ≥ 3) > 0 sustained → WARNING. Investigation query: `SELECT id, render_status, started_at, reconciliation_attempts, error_message FROM xlsx_renders WHERE render_status IN ('pending','running') AND started_at < NOW() - INTERVAL '15 minutes' ORDER BY started_at`. +- **Schema/query failure**: presence of `xlsx_renders_error` field → CRITICAL (schema_missing or query_failed bucket — see `docs/pending-updates/excel-code-execution.md` §12.1) Overall: / healthy ``` diff --git a/.claude/skills/post-deploy-verify/SKILL.md b/.claude/skills/post-deploy-verify/SKILL.md index e0486f55d..881fa8d76 100644 --- a/.claude/skills/post-deploy-verify/SKILL.md +++ b/.claude/skills/post-deploy-verify/SKILL.md @@ -60,7 +60,7 @@ Embeds the verification protocol from `super-legal-mcp-refactored/docs/pending-u | **Container env audit** | `OTEL_TRACES_SAMPLER`, `OTEL_TRACES_SAMPLER_ARG`, `FMP_ENABLED`, `COMMIT_SHA`, `BCRYPT_ROUNDS` all present in container env | | **V5 (v7.6.1)**: Exa A3 telemetry + audit log | When `EXA_ADDITIONAL_QUERIES=true`: `/metrics` exposes `claude_exa_ab_latency_ms{outcome=...}` with ≥1 outcome value populated AND `hook_audit_log` has ≥1 row with `event_data ? 'exa_a3'` in last 1h after a session run. Otherwise: WARNING "no A3 traffic in window". Skip if flag off. | | **V6 (v6.8.6 T1 + v6.8.7 T2)**: G5 citation-verifier observability | `/metrics` exposes all 4 `citation_verifier_*` series (HELP/TYPE lines registered). PASSED when 4/4 found regardless of value (gauge/counter values populate after first G5 run). WARNING if partial (stale image suspected) or zero (sdkMetrics export broken). Companion DB check via `queries/v6-citation-verdicts-presence.sql` — verifies `citation_verdicts` table shape + first-session population. Post-first-G5-run: query confirms ≥1 row per session. | -| **V7 (v7.x XLSX renderer)**: workbook deliverables + schema + metrics | When `XLSX_RENDERER=true`: (a) `xlsx_renders` table exists; (b) `SELECT COUNT(*) FROM xlsx_renders WHERE render_status='failed' AND started_at > NOW() - INTERVAL '1 day'` returns 0; (c) `/metrics` exposes `claude_xlsx_render_invocations_total` and `claude_xlsx_render_duration_seconds_bucket`; (d) `/health.reconciliation.pending_xlsx_renders` field is present (success path) OR `xlsx_renders_error` reports a bucketed code (table-missing OR query-failed during deploy-order window). Skip with WARNING if `XLSX_RENDERER=false`. | +| **V7 (v7.x XLSX renderer + Issue #88 async-202)**: workbook deliverables + schema + metrics + async-202 envelope | When `XLSX_RENDERER=true`: (a) `xlsx_renders` table exists with all 4 generated columns (`audit_status`, `sheet_count`, `warnings_count`, `node_audit_ran`); (b) `SELECT COUNT(*) FROM xlsx_renders WHERE render_status='failed' AND started_at > NOW() - INTERVAL '1 day'` returns 0 (terminal-state failures only — `'pending'`/`'running'` rows older than `STUCK_BUILD_THRESHOLD_MIN`=60min indicate reconciliation backlog, not deploy issues); (c) `/metrics` exposes `claude_xlsx_render_invocations_total` and `claude_xlsx_render_duration_seconds_bucket` AND `claude_xlsx_render_manual_calls_total{outcome="dispatched"}` is a registered series (proves async-202 envelope shipped — value may be 0 until first manual render); (d) `/health.reconciliation.pending_xlsx_renders` field is present (success path) OR `xlsx_renders_error` reports a bucketed code; (e) **smoke probe** (optional, requires a test session): `curl -X POST $URL/api/render-workbook/$SESSION` returns HTTP 202 with JSON keys `render_id` + `status` + `status_poll_url` + `sse_url`; calling `GET $URL/api/render-workbook/$render_id/status` returns `status ∈ {pending, running, completed, failed}`. Skip with WARNING if `XLSX_RENDERER=false`. | ## Tier 3 — Metrics + Reconciliation + Trace (~10 min) diff --git a/.claude/skills/session-diagnostics/SKILL.md b/.claude/skills/session-diagnostics/SKILL.md index 6f857deb4..6b29157b7 100644 --- a/.claude/skills/session-diagnostics/SKILL.md +++ b/.claude/skills/session-diagnostics/SKILL.md @@ -213,3 +213,27 @@ These are surfaced in the Remediation Suggestions section as suggestions — nev - **Cloud Trace not integrated** — OTel spans exist (`reconciliation.scan → kg.extract_full → kg.phase1_*`) but the codebase doesn't yet label spans with `session_key`. Skipping Cloud Trace integration in v1 of this skill. - **Single-session scope** — one invocation per session_key. Cross-session aggregation deferred to a follow-up skill. - **GCP auth required** — fetches DB credentials from Secret Manager. If you don't have `gcloud auth` set up, set `PG_CONNECTION_STRING` in the environment to bypass. + +## XLSX render lifecycle (when XLSX_RENDERER=true) — Issue #88 async-202 + +When a session reports "workbook never arrived" or operators need to triage a manual-render call: + +```sql +SELECT id AS render_id, render_status, template_id, audit_status, sheet_count, + warnings_count, node_audit_ran, started_at, completed_at, + reconciliation_attempts, error_message +FROM xlsx_renders +WHERE session_id = (SELECT id FROM sessions WHERE session_key = :session_key) +ORDER BY started_at DESC; +``` + +State interpretation (post-Issue #88): + +- **`'pending'`** — row pre-created by the async-202 POST handler; the renderer's setImmediate has not yet started doing work. If `≥ 15 min` old, check server logs for `xlsx_manual_dispatch_failed` (setImmediate dispatcher errored). Reconciliation will sweep at `STUCK_BUILD_THRESHOLD_MIN = 60` min. +- **`'running'`** — renderer has called `transitionRenderToRunning(id)` and is actively executing. If `≥ 30 min` old, check the code-execution bridge logs; a container may have hung. +- **`'completed'`** — terminal success. Fetch artifact via `report_artifacts.id = xlsx_renders.artifact_id`. Audit verdict in `audit_status` (generated column). +- **`'failed'`** — terminal failure. `error_message` carries the reason; check `audit_results->'phase_audits'` for multi-turn forensic detail. +- **`'built'`** — reconciliation safe-flip; file on disk but SSE event never fired (caller may have disconnected pre-completion). Treat as successful delivery. +- **`'reconciled_failed'`** — reconciliation exhausted attempts; manual investigation required. Check the original code-execution bridge call logs. + +For caller-side polling of an in-flight render: `GET /api/render-workbook/:renderId/status` returns the same shape (404 if `renderId` doesn't exist; 503 if `XLSX_RENDERER=false`). diff --git a/super-legal-mcp-refactored/docs/api-reference.md b/super-legal-mcp-refactored/docs/api-reference.md index 61b151084..88f61dd00 100644 --- a/super-legal-mcp-refactored/docs/api-reference.md +++ b/super-legal-mcp-refactored/docs/api-reference.md @@ -484,6 +484,102 @@ The server does not currently impose application-level rate limits on operator/r --- +## Document Generation — Workbook Rendering + +> **Flag-gated**: All endpoints below return `503 xlsx_renderer_disabled` when `XLSX_RENDERER=false` (the prod default). + +### `POST /api/render-workbook/:sessionId` + +Auth: `cookieAuthMiddleware`. **Async — returns `202 Accepted`** and dispatches the render fire-and-forget (Issue #88, v7.x). Caller subscribes to SSE for live progress OR polls the status endpoint for terminal-state confirmation. The pre-Issue#88 sync response shape (with full `auditResults` envelope inline) is **BREAKING**; clients must migrate. + +**Path parameters**: +- `sessionId` — session key in `YYYY-MM-DD-NNNNNNN` format (NOT the UUID). + +**Query parameters**: +- `template` (optional) — force a specific template (`session-models`, `full-deal-workbook`, …). Omit to auto-select. + +**Per-user quota** (Phase 7 Issue #4): 10/hour, 50/day per `req.user.id`. Counted via `claude_xlsx_render_manual_calls_total{outcome}` Prometheus counter with outcome ∈ `{accepted, rate_limited, dispatched, dispatch_failed}`. + +**Response 202**: +```json +{ + "render_id": "01H8…-uuid", + "status": "pending", + "session_id": "2026-05-15-9600099", + "template_id": "session-models", + "created_at": "2026-05-15T18:30:00.000Z", + "status_poll_url": "/api/render-workbook/01H8…-uuid/status", + "sse_url": "/api/stream?sessionId=2026-05-15-9600099" +} +``` + +**Error responses**: + +| Status | Code | Cause | +|---|---|---| +| 400 | `invalid_session_id` | Path param doesn't match `YYYY-MM-DD-NNNNNNN` | +| 400 | `template_not_found` | `?template=` doesn't match any registered template | +| 404 | `session_not_found` | No `sessions` row for the given session_key | +| 429 | `rate_limited` | Per-user quota exceeded | +| 500 | `dispatch_failed` | DB INSERT into `xlsx_renders` failed (pre-202) | +| 503 | `xlsx_renderer_disabled` | Feature flag off | +| 503 | `database_unavailable` | PG pool unreachable | + +**Pre-202 audit writes**: Art. 14 `human_interventions` row + Art. 12 `access_log` row written BEFORE the 202 response, so traceability is honest even if the background render subsequently fails. Both writes are best-effort (log a warning on failure, don't block the 202). + +**Idempotency (v1 — NONE)**: This endpoint has NO `Idempotency-Key` support in v1. Duplicate POSTs within the rate-limit window WILL create duplicate `xlsx_renders` rows. Clients that retry on transport blips should track their own `render_id` and avoid retrying after a successful 202. (Tracked as Issue #88 deferred-refinement-3.) + +### `GET /api/render-workbook/:renderId/status` + +Auth: `cookieAuthMiddleware`. Reads `xlsx_renders` by primary-key UUID. + +**Path parameters**: +- `renderId` — UUID returned from the POST 202 response. + +**Response 200**: +```json +{ + "render_id": "01H8…-uuid", + "status": "pending|running|completed|failed|built|reconciled_failed", + "session_id": "2026-05-15-9600099", + "template_id": "session-models", + "started_at": "2026-05-15T18:30:00.000Z", + "completed_at": "2026-05-15T18:35:00.000Z", + "artifact_id": "…", + "audit_status": "PASS", + "sheet_count": 9, + "warnings_count": 0, + "node_audit_ran": true, + "error_message": null, + "reconciliation_attempts": 0 +} +``` + +The 4 generated columns (`audit_status`, `sheet_count`, `warnings_count`, `node_audit_ran`) come from migration 018 — they're STORED projections of `audit_results` JSONB. `null` until the renderer's audit completes (multi-turn renders fill `sheet_count` only when `merge_info` is populated, etc.). + +**Error responses**: +- `400 invalid_render_id` — path param isn't a valid UUID. +- `404 render_not_found` — no row with that UUID. +- `503 xlsx_renderer_disabled` — flag off. +- `503 xlsx_renders_table_missing` — migration 017 not applied. + +### Live progress via SSE (preferred over polling) + +Subscribe to `/api/stream?sessionId={session_key}` (NOT renderId — the channel is session-keyed). The renderer publishes events of type `xlsx_render` with the following payloads: + +- **status=pending**: `{type:'xlsx_render', status:'pending', template_id, started_at}` (emitted at the start of `_renderForSessionInner`) +- **status=complete**: `{type:'xlsx_render', status:'complete', template_id, audit_status, file, size}` +- **status=failed**: `{type:'xlsx_render', status:'failed', error}` + +> **Status string asymmetry** (preserved from pre-async-202 codebase): SSE event uses `'complete'`/`'failed'`; the polling endpoint reports `'completed'`/`'failed'` (matching the `xlsx_renders.render_status` enum). Clients should accept both spellings. + +### Polling guidance + +- **Preferred**: subscribe to SSE — push-based, no polling overhead. +- **If polling**: 5-second interval is reasonable for renders <10min; 30-second for renders >10min. Backoff to 60-second once `'pending'`/`'running'` exceeds 15 min — reconciliation will sweep at `STUCK_BUILD_THRESHOLD_MIN`=60 min. + +--- + **Reference docs**: - v6.8.5 audit-export runbook: `docs/runbooks/v6.8.5-audit-export.md` - v6.7.0 reconciliation runbook: `docs/runbooks/v6.7.0-session-reconciliation.md` diff --git a/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md b/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md index a4645cf33..01b438361 100644 --- a/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md +++ b/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md @@ -14,7 +14,8 @@ the XLSX renderer.** | Surface | Inputs | PII potential | |---|---|---| | Auto-trigger hook (`agentStreamHandler.js`) | `sessionId` only | None — internal trigger after manifest finalize | -| Manual endpoint (`POST /api/render-workbook/:sessionId?template=`) | `:sessionId` path param + optional `?template=` query | None — both inputs are server-controlled enums | +| Manual endpoint (`POST /api/render-workbook/:sessionId?template=`) | `:sessionId` path param + optional `?template=` query | None — both inputs are server-controlled enums. **Post-Issue#88 (async-202)**: Art. 14 (`persistIntervention`) + Art. 12 (`access_log`) writes fire at REQUEST TIME (pre-202), not on render completion — ensures traceability is honest even if the background render subsequently fails or times out. | +| Status-polling endpoint (`GET /api/render-workbook/:renderId/status`) | `:renderId` UUID | None — read-only; no Art.12 audit write per codebase convention (only state-modifying ops are audit-logged). | The renderer reads from already-pseudonymized session state (memo, citations, `data_provenance`, `code_executions.data`). Upstream PII is mapped to diff --git a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-gate2-analysis.md b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-gate2-analysis.md index 27d6efb75..ac1865e91 100644 --- a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-gate2-analysis.md +++ b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-gate2-analysis.md @@ -82,6 +82,6 @@ the failure class is closed with stronger evidence than an intermittent live rep ## Next step -Task #88 — fix the blocking `/api/render-workbook` endpoint (dispatch fire-and-forget, return 202 + render id, caller polls). Independent of the gates. +Task #88 — ✅ **SHIPPED** on `feature/xlsx-renderer-88-async-202`. The endpoint returns 202 + render_id envelope, dispatches via setImmediate, transitions `xlsx_renders.render_status` `'pending' → 'running' → 'completed'|'failed'`. Caller subscribes to `/api/stream?sessionId=…` (existing SSE channel) or polls `GET /api/render-workbook/:renderId/status`. Full contract in `docs/api-reference.md` → "Document Generation — Workbook Rendering". Refinement #3 (Idempotency-Key) deferred — zero precedent in codebase. `XLSX_RENDERER` stays `false` in production until a deploy decision. diff --git a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-isolation-test-plan.md b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-isolation-test-plan.md index a0b464ffc..160de0fad 100644 --- a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-isolation-test-plan.md +++ b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-isolation-test-plan.md @@ -604,10 +604,15 @@ sleep 8 # Cleaner approach: invoke renderForSession path via a server-side test endpoint OR # use direct curl with manually-issued JWT. -curl -s -X POST "http://localhost:3099/api/render-workbook/2026-05-12-1234567" \ - -H "Cookie: $TEST_AUTH_COOKIE" | jq . - -# Verify compliance writes +# Post-Issue#88: expect HTTP 202 (not 200) with render_id envelope. +RESP=$(curl -s -X POST -w '\n%{http_code}' "http://localhost:3099/api/render-workbook/2026-05-12-1234567" \ + -H "Cookie: $TEST_AUTH_COOKIE") +echo "$RESP" | head -1 | jq . +STATUS_CODE=$(echo "$RESP" | tail -1) +[ "$STATUS_CODE" = "202" ] || { echo "FAIL: expected 202, got $STATUS_CODE"; exit 1; } + +# Verify compliance writes — CRITICAL: Art.14 + Art.12 MUST fire pre-202, +# so these reads succeed immediately (not after waiting for render to finish). psql $TEST_DATABASE_URL < **Note**: this phase requires a valid auth cookie. If local auth setup is complex, skip and rely on Phase 4's code-level audit (verified in commit `81a66aa4` that the writes ARE wired); revisit on staging. diff --git a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-preflight.md b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-preflight.md index e13f14973..916aef0d9 100644 --- a/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-preflight.md +++ b/super-legal-mcp-refactored/docs/pending-updates/excel-code-execution-preflight.md @@ -77,27 +77,43 @@ Pick a completed session with `code_executions` rows already persisted (an exist SESSION_KEY=2026-05-XX-NNNNNNNN # pick a real one COOKIE='auth=...' # valid session cookie -curl -s -X POST "https://staging.api/api/render-workbook/$SESSION_KEY" \ - -H "Cookie: $COOKIE" | jq . +RESP=$(curl -s -X POST -w '\n%{http_code}' "https://staging.api/api/render-workbook/$SESSION_KEY" \ + -H "Cookie: $COOKIE") +echo "$RESP" | head -1 | jq . +echo "$RESP" | tail -1 # status code ``` -**Pass criteria**: HTTP 200 with envelope: +**Pass criteria (post-Issue #88 async-202)**: HTTP **202** with envelope: ```json { - "success": true, - "xlsxPath": "/app/reports//documents/.xlsx", - "auditResults": { "status": "PASS" | "FAIL", "checks": {...}, "warnings": [...] }, - "artifactId": "", - "durationMs": + "render_id": "", + "status": "pending", + "session_id": "", + "template_id": "auto" | "session-models" | …, + "created_at": "", + "status_poll_url": "/api/render-workbook//status", + "sse_url": "/api/stream?sessionId=" } ``` +Then poll the status endpoint until terminal: +```bash +RID=$(echo "$RESP" | head -1 | jq -r .render_id) +while true; do + STATUS=$(curl -s -H "Cookie: $COOKIE" "https://staging.api/api/render-workbook/$RID/status" | jq -r .status) + echo "render_status=$STATUS" + [[ "$STATUS" =~ ^(completed|failed|built|reconciled_failed)$ ]] && break + sleep 30 +done +``` + **Failure indicator**: - HTTP 503 `xlsx_renderer_disabled` → flag not flipped. - HTTP 404 `session_not_found` → bad session_key. -- HTTP 500 `render_failed` → check server logs for the underlying error (raw error NOT in response per security hardening). -- `{ success: false, error: '...' }` → renderer reached the catch block; check `xlsx_renders.error_message` for the reason. -- `{ skipped: 'no_template_matches' }` → session has no invoked models AND no dealType/dealSize metadata; pick a different session OR force a template via `?template=session-models`. +- HTTP 500 `dispatch_failed` → DB INSERT failed pre-202; check server logs (raw error NOT in response per security hardening). +- HTTP 429 `rate_limited` → per-user quota exhausted (10/hour, 50/day). +- Status endpoint returns `'failed'`/`'reconciled_failed'` → check `xlsx_renders.error_message` for the reason. +- Status endpoint returns `'pending'` for `>60 min` → setImmediate dispatcher errored; check server logs for `xlsx_manual_dispatch_failed` events. Reconciliation will sweep at `STUCK_BUILD_THRESHOLD_MIN`. ### Step 5 — Disk artifact validation (1 min) diff --git a/super-legal-mcp-refactored/src/server/claude-sdk-server.js b/super-legal-mcp-refactored/src/server/claude-sdk-server.js index 033d7ad92..8009eea8d 100644 --- a/super-legal-mcp-refactored/src/server/claude-sdk-server.js +++ b/super-legal-mcp-refactored/src/server/claude-sdk-server.js @@ -1079,7 +1079,7 @@ app.post('/api/enhance-prompt/approve', (req, res) => { res.json({ ok: true }); }); -// v4.5 Phase 1D + Phase 4 fix — Manual XLSX render trigger. +// v4.5 Phase 1D + Phase 4 fix + Issue #88 — Manual XLSX render trigger. // Protected by cookieAuthMiddleware (mounted at line 214 above). The sessions // table has no `user_id` column in this codebase (Phase 4 audit found the // original RBAC query would crash with 42P01); session ownership is not @@ -1087,8 +1087,19 @@ app.post('/api/enhance-prompt/approve', (req, res) => { // "any authenticated user in this tenant may trigger a render". Every // invocation is audited via Art. 12 access_log + Art. 14 persistIntervention // for traceability. +// +// Issue #88 (async-202): this endpoint now returns 202 immediately after +// pre-202 sync work (flag/quota/session/template validation + audit writes + +// xlsx_renders row INSERT at status='pending'). The actual render is +// dispatched via setImmediate, runs in the background, transitions the row +// pending→running→completed|failed, and publishes progress on the session +// SSE channel (/api/stream?sessionId=...). Caller subscribes to SSE for live +// updates OR polls GET /api/render-workbook/:renderId/status for terminal +// state. Response shape is BREAKING vs the pre-#88 sync envelope; see +// docs/api-reference.md "Document Generation — Workbook Rendering". app.post('/api/render-workbook/:sessionId', async (req, res) => { const { sessionId } = req.params; + const userIdLabel = String(req.user?.id || 'anon'); if (!/^\d{4}-\d{2}-\d{2}-\d+$/.test(sessionId)) { return res.status(400).json({ error: 'invalid_session_id' }); @@ -1100,13 +1111,14 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { // Phase 7 Issue #4: per-user quota — prevents cost runaway from authenticated // users firing the endpoint in a loop. Defaults: 10/hour, 50/day per user. + let xlsxRenderManualCallsCounter = null; try { const { enforceXlsxRenderQuota } = await import('../middleware/xlsxRenderQuota.js'); - const { xlsxRenderManualCalls } = await import('../utils/sdkMetrics.js'); - const userIdLabel = String(req.user?.id || 'anon'); + const metrics = await import('../utils/sdkMetrics.js'); + xlsxRenderManualCallsCounter = metrics.xlsxRenderManualCalls; const quotaCheck = enforceXlsxRenderQuota({ userId: userIdLabel }); if (!quotaCheck.allowed) { - try { xlsxRenderManualCalls.inc({ user_id: userIdLabel, outcome: 'rate_limited' }); } catch {} + try { xlsxRenderManualCallsCounter.inc({ user_id: userIdLabel, outcome: 'rate_limited' }); } catch {} return res.status(429).json({ error: 'rate_limited', reason: quotaCheck.reason, @@ -1116,25 +1128,29 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { limits: quotaCheck.limits, }); } - try { xlsxRenderManualCalls.inc({ user_id: userIdLabel, outcome: 'accepted' }); } catch {} } catch (qErr) { console.warn('[xlsxRenderer] quota check failed:', qErr.message); // Fail-open: quota errors don't block legitimate users. } const pool = getPool(); - if (!pool) return res.status(503).json({ error: 'database_unavailable' }); + if (!pool) { + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatch_failed' }); } catch {} + return res.status(503).json({ error: 'database_unavailable' }); + } // Session must exist (cheap existence check — no schema-dependent columns). let sessionUuid = null; try { const r = await pool.query('SELECT id FROM sessions WHERE session_key = $1', [sessionId]); if (r.rows.length === 0) { + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatch_failed' }); } catch {} return res.status(404).json({ error: 'session_not_found' }); } sessionUuid = r.rows[0].id; } catch (err) { console.error('[xlsxRenderer] session lookup failed:', err.message); + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatch_failed' }); } catch {} return res.status(500).json({ error: 'session_lookup_failed' }); } @@ -1143,6 +1159,7 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { if (templateOverride) { const { XLSX_TEMPLATES } = await import('../config/xlsxTemplates/index.js'); if (!XLSX_TEMPLATES[templateOverride]) { + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatch_failed' }); } catch {} return res.status(400).json({ error: 'template_not_found', requested: templateOverride, @@ -1153,7 +1170,8 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { // Phase 4 fix (D2): Art. 14 (human_interventions) — log the manual override // BEFORE dispatching renderer per master plan §6.6. Best-effort write; - // failure does not block the render. + // failure does not block the render. Critically: this fires PRE-202 so + // traceability is honest even if the background render subsequently fails. try { const { persistIntervention } = await import('../utils/hookDBBridge.js'); await persistIntervention(pool, sessionUuid, { @@ -1170,6 +1188,7 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { } // Phase 4 fix (D2): Art. 12 (access_log) — every render attempt logged. + // Also fires PRE-202. try { await pool.query( `INSERT INTO access_log (session_id, resource_type, resource_key, requester, purpose_code) @@ -1180,31 +1199,145 @@ app.post('/api/render-workbook/:sessionId', async (req, res) => { console.warn('[xlsxRenderer] Art. 12 audit write failed:', auditErr.message); } - // Phase 4 audit-fix HIGH: wrap dispatch in withSpan so Cloud Trace - // correlates Express request → manual render trigger → renderer's own - // `xlsx_render.lifecycle` child span. Without this, the renderer span - // becomes a root (orphan) trace; operators cannot pivot from access_log - // to the underlying sandbox call. + // Issue #88: pre-create xlsx_renders row at status='pending'. The renderer + // (called via setImmediate below) will transition it pending→running. This + // INSERT is FATAL — if it fails, we cannot honor the 202 contract (caller + // would have no render_id to poll). Return 500 instead. + let renderRowId = null; + let templateIdForResponse = templateOverride || 'auto'; try { - const { withSpan } = await import('../utils/sdkTracing.js'); - const { renderForSession } = await import('../utils/xlsxRenderer/index.js'); - const result = await withSpan( - 'xlsx_render.manual_trigger', - { - 'session.id': sessionId, - 'template.id': templateOverride || 'auto', - 'requester.id': String(req.user?.id || 'unknown'), - }, - async () => renderForSession(sessionId, { - templateOverride, - // No sseContext on manual endpoint — caller polls /api/reports for result - }), + const { insertXlsxRenderRow } = await import('../utils/xlsxRenderer/persist.js'); + // Pre-resolve the template so the inserted row carries a real template_id + // (not 'auto'). selectTemplate needs sessionContext; cheaper to let the + // renderer re-resolve and let the row carry 'auto' until the renderer + // updates audit_results. We pass templateOverride if explicit, else 'auto' + // as the placeholder — the renderer overwrites template_id on UPDATE if + // it auto-selected a different one. (Same behavior as the auto-trigger + // path before #88.) + const inserted = await insertXlsxRenderRow(sessionId, templateIdForResponse, null); + renderRowId = inserted.id; + if (!renderRowId) { + throw new Error('insertXlsxRenderRow returned null id (table missing?)'); + } + } catch (insertErr) { + console.error('[xlsxRenderer] pre-202 render row INSERT failed:', insertErr.message); + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatch_failed' }); } catch {} + return res.status(500).json({ error: 'dispatch_failed', detail: 'render_row_insert_failed' }); + } + + // 202 Accepted — return immediately with the polling envelope. + const createdAt = new Date().toISOString(); + const responseEnvelope = { + render_id: renderRowId, + status: 'pending', + session_id: sessionId, + template_id: templateIdForResponse, + created_at: createdAt, + status_poll_url: `/api/render-workbook/${renderRowId}/status`, + sse_url: `/api/stream?sessionId=${sessionId}`, + }; + try { xlsxRenderManualCallsCounter?.inc({ user_id: userIdLabel, outcome: 'dispatched' }); } catch {} + res.status(202).json(responseEnvelope); + + // Fire-and-forget dispatch. The withSpan call lives INSIDE setImmediate so + // the span wraps the actual renderer execution (wrapping setImmediate itself + // would close the span immediately — the work happens after the callback + // schedules). Errors are logged via `xlsx_manual_dispatch_failed` so the + // operator can correlate via render_id; reconciliation will sweep stuck + // 'pending' rows after STUCK_BUILD_THRESHOLD_MIN as a backstop. + setImmediate(async () => { + try { + const { withSpan } = await import('../utils/sdkTracing.js'); + const { renderForSession } = await import('../utils/xlsxRenderer/index.js'); + await withSpan( + 'xlsx_render.manual_trigger', + { + 'session.id': sessionId, + 'template.id': templateIdForResponse, + 'requester.id': String(req.user?.id || 'unknown'), + 'render.id': renderRowId, + }, + async () => renderForSession(sessionId, { + templateOverride, + preCreatedRenderRowId: renderRowId, + // No sseContext from HTTP handler — the renderer's own SSE emits + // go to the session SSE channel via the global hookSSEBridge. + }), + ); + } catch (err) { + // Best-effort log; the row's render_status will be 'pending' or + // 'running' until reconciliation marks it 'reconciled_failed' after + // STUCK_BUILD_THRESHOLD_MIN. + try { + const { logError } = await import('../utils/sdkLogger.js'); + logError('xlsx_manual_dispatch_failed', { + render_id: renderRowId, + session_id: sessionId, + error: err?.message || String(err), + }); + } catch { + console.error('[xlsxRenderer] manual dispatch failed (render_id=' + renderRowId + '):', err); + } + } + }); +}); + +// Issue #88: status-polling endpoint paired with the async-202 POST above. +// Auth: same cookieAuthMiddleware as the POST (mounted at line 214). Single- +// tenant authz — any authenticated user may poll. Reads xlsx_renders by +// primary-key UUID; no Art.12 write for reads (matches the codebase's +// convention that audit_log captures resource *modifications*, not reads). +app.get('/api/render-workbook/:renderId/status', async (req, res) => { + const { renderId } = req.params; + + // UUID v4 format check — both standard UUIDs and ULIDs may flow through + // here depending on gen_random_uuid() output. Accept the standard 36-char + // 8-4-4-4-12 hyphenated form; reject obviously bogus IDs cheaply. + if (!/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(renderId)) { + return res.status(400).json({ error: 'invalid_render_id' }); + } + + if (!featureFlags.XLSX_RENDERER) { + return res.status(503).json({ error: 'xlsx_renderer_disabled' }); + } + + const pool = getPool(); + if (!pool) return res.status(503).json({ error: 'database_unavailable' }); + + try { + // SELECT includes the 4 Option-B generated columns so callers can see + // audit_status / sheet_count / warnings_count / node_audit_ran without + // re-querying. Joins sessions to surface session_key for caller context. + const r = await pool.query( + `SELECT + xr.id AS render_id, + xr.render_status AS status, + s.session_key AS session_id, + xr.template_id, + xr.started_at, + xr.completed_at, + xr.artifact_id, + xr.audit_status, + xr.sheet_count, + xr.warnings_count, + xr.node_audit_ran, + xr.error_message, + xr.reconciliation_attempts + FROM xlsx_renders xr + JOIN sessions s ON s.id = xr.session_id + WHERE xr.id = $1`, + [renderId], ); - res.json(result); + if (r.rows.length === 0) { + return res.status(404).json({ error: 'render_not_found' }); + } + return res.json(r.rows[0]); } catch (err) { - // Audit-fix (security LOW from Phase 4 review): don't leak raw err.message. - console.error('[xlsxRenderer] manual render failed:', err); - res.status(500).json({ error: 'render_failed' }); + if (err.code === '42P01') { + return res.status(503).json({ error: 'xlsx_renders_table_missing' }); + } + console.error('[xlsxRenderer] status lookup failed:', err.message); + return res.status(500).json({ error: 'status_lookup_failed' }); } }); diff --git a/super-legal-mcp-refactored/src/utils/sdkMetrics.js b/super-legal-mcp-refactored/src/utils/sdkMetrics.js index 6868b9679..d9aa95984 100644 --- a/super-legal-mcp-refactored/src/utils/sdkMetrics.js +++ b/super-legal-mcp-refactored/src/utils/sdkMetrics.js @@ -393,8 +393,17 @@ export function setXlsxRenderQueueDepth(depth) { } // Phase 7 Issue #4: manual-endpoint call counter for cost-runaway visibility. -// outcome ∈ {accepted, rate_limited}. user_id label hashed via the application -// (use String(req.user?.id) — short numeric IDs, bounded cardinality ≤ ~500). +// outcome ∈ { +// accepted, // pre-Issue#88: quota OK, render kicked off synchronously +// rate_limited, // quota exceeded +// dispatched, // post-Issue#88: 202 returned, render queued via setImmediate +// dispatch_failed, // post-Issue#88: pre-202 failure (DB INSERT failed, session not found, etc.) +// } +// Both 'accepted' and 'dispatched' exist because the metric is also queried +// for proof-of-life (V7 post-deploy verify). 'accepted' will fall to 0 once +// async-202 ships everywhere; 'dispatched' is the new normal. +// user_id label hashed via the application (use String(req.user?.id) — short +// numeric IDs, bounded cardinality ≤ ~500). export const xlsxRenderManualCalls = new client.Counter({ name: 'claude_xlsx_render_manual_calls_total', help: 'Manual /api/render-workbook invocations by user_id and outcome', diff --git a/super-legal-mcp-refactored/src/utils/xlsxRenderer/index.js b/super-legal-mcp-refactored/src/utils/xlsxRenderer/index.js index 64c7e9c19..ffab6e72c 100644 --- a/super-legal-mcp-refactored/src/utils/xlsxRenderer/index.js +++ b/super-legal-mcp-refactored/src/utils/xlsxRenderer/index.js @@ -29,6 +29,7 @@ import { gatherSessionContext, gatherInputs, composeWorkbookSpec, composePhaseSp import { renderMultiTurn } from './multiTurnOrchestrator.js'; import { insertXlsxRenderRow, + transitionRenderToRunning, insertXlsxRenderInputs, updateXlsxRenderRow, persistWorkbookToDisk, @@ -101,7 +102,18 @@ export async function renderForSession(sessionId, options = {}) { } async function _renderForSessionInner(sessionId, options) { - const { templateOverride, sseContext, runAnalysis = defaultRunAnalysis, isReconciliation = false } = options; + const { + templateOverride, + sseContext, + runAnalysis = defaultRunAnalysis, + isReconciliation = false, + // Issue #88: when the async-202 manual endpoint pre-creates the + // xlsx_renders row (at status='pending'), it passes the row id here + // so this renderer transitions it pending→running instead of inserting + // a fresh row. Auto-trigger paths (agentStreamHandler.js SessionEnd) + // omit this option — the renderer falls back to inserting its own row. + preCreatedRenderRowId = null, + } = options; return withSpan('xlsx_render.lifecycle', { 'session.id': sessionId, 'template.id': templateOverride || 'auto', 'is_reconciliation': isReconciliation }, @@ -149,12 +161,27 @@ async function _renderForSessionInner(sessionId, options) { // 4. Gather inputs (code_executions, state files, citations, narrative) const inputs = await gatherInputs(sessionId, template); - // 5. Insert pending xlsx_renders row + xlsx_render_inputs junction rows - renderRow = await insertXlsxRenderRow(sessionId, template.id, template.version); + // 5. Insert pending xlsx_renders row + xlsx_render_inputs junction rows. + // Issue #88: if the manual async-202 endpoint pre-created the row, + // adopt that id; otherwise insert a new row (auto-trigger path). + if (preCreatedRenderRowId) { + renderRow = { id: preCreatedRenderRowId }; + } else { + renderRow = await insertXlsxRenderRow(sessionId, template.id, template.version); + } if (renderRow?.id && inputs.code_execution_ids?.length > 0) { await insertXlsxRenderInputs(renderRow.id, inputs.code_execution_ids); } + // 5b. Transition pending → running. Idempotent — the WHERE clause + // makes it a no-op if the row was already advanced by reconciliation + // or a duplicate caller. Marks the renderer-has-started boundary so + // reconciliation can tell "row pre-created, dispatcher crashed" + // (still 'pending') from "renderer actively working" ('running'). + if (renderRow?.id) { + await transitionRenderToRunning(renderRow.id); + } + // 6-7. Compose spec + invoke sandbox. Phase 9: branch on phaseSplit. // Multi-turn templates (full-deal, lbo, tax-memo, valuation-only) run // 3 sequential 5-min containers; small templates (session-models) diff --git a/super-legal-mcp-refactored/src/utils/xlsxRenderer/persist.js b/super-legal-mcp-refactored/src/utils/xlsxRenderer/persist.js index 24adb0bdb..258c6be55 100644 --- a/super-legal-mcp-refactored/src/utils/xlsxRenderer/persist.js +++ b/super-legal-mcp-refactored/src/utils/xlsxRenderer/persist.js @@ -70,6 +70,17 @@ export function parseAuditResults(result) { /** * Insert pending xlsx_renders row. Returns { id } or null if table missing * (Phase 2A.2 hasn't run yet — non-fatal). + * + * Initial status is `'pending'` — the row exists but the renderer hasn't + * started doing work yet. `_renderForSessionInner` calls + * `transitionRenderToRunning(id)` at the start of the actual render to + * advance to `'running'`. This separation supports the async-202 endpoint: + * the manual POST handler inserts the row at `'pending'`, returns 202 + * immediately, then the fire-and-forget renderer transitions to `'running'` + * when it actually starts. The auto-trigger path (agentStreamHandler) gets + * the same two-step transition for free — reconciliation can now tell + * "row pre-created, dispatcher crashed" (`'pending'`) from "renderer + * actively working" (`'running'`). */ export async function insertXlsxRenderRow(sessionId, templateId, templateVersion) { const pool = getPool(); @@ -78,7 +89,7 @@ export async function insertXlsxRenderRow(sessionId, templateId, templateVersion try { const result = await pool.query( `INSERT INTO xlsx_renders (session_id, template_id, template_version, render_status, started_at) - SELECT s.id, $2, $3, 'running', NOW() + SELECT s.id, $2, $3, 'pending', NOW() FROM sessions s WHERE s.session_key = $1 RETURNING id`, [sessionId, templateId, templateVersion] @@ -87,13 +98,45 @@ export async function insertXlsxRenderRow(sessionId, templateId, templateVersion } catch (err) { if (err.code === '42P01') { // Table doesn't exist — Phase 2A.2 schema not yet deployed. Non-fatal. - logError('xlsx_renders_table_missing', { sessionId, hint: 'Phase 2A.2 migration 016 not applied yet' }); + logError('xlsx_renders_table_missing', { sessionId, hint: 'Phase 2A.2 migration 017 not applied yet' }); return { id: null }; } throw err; } } +/** + * Transition a pre-created xlsx_renders row from `'pending'` to `'running'`. + * Idempotent — the WHERE clause makes it a no-op if the row was already + * advanced (by reconciliation or a duplicate call). Returns whether the row + * was actually transitioned, so callers can log a divergence if expected. + * + * Used by `_renderForSessionInner` at the start of every render to mark the + * renderer-has-started boundary. Non-fatal if table or row is missing. + */ +export async function transitionRenderToRunning(renderId) { + if (!renderId) return { transitioned: false }; + const pool = getPool(); + if (!pool) return { transitioned: false }; + + try { + const result = await pool.query( + `UPDATE xlsx_renders + SET render_status = 'running' + WHERE id = $1 AND render_status = 'pending' + RETURNING id`, + [renderId], + ); + return { transitioned: result.rowCount > 0 }; + } catch (err) { + if (err.code === '42P01') { + logError('xlsx_renders_transition_table_missing', { renderId }); + return { transitioned: false }; + } + throw err; + } +} + /** * Insert junction rows linking this render to its consumed code_executions. * Non-fatal if junction table missing. diff --git a/super-legal-mcp-refactored/test/sdk/xlsx-renderer-integration.test.js b/super-legal-mcp-refactored/test/sdk/xlsx-renderer-integration.test.js index f2e1c74d3..19052a293 100644 --- a/super-legal-mcp-refactored/test/sdk/xlsx-renderer-integration.test.js +++ b/super-legal-mcp-refactored/test/sdk/xlsx-renderer-integration.test.js @@ -1390,6 +1390,220 @@ async function testT26_GeneratedColumns() { await seeded.cleanup(); } +async function testT27_AsyncEndpointPendingInsert() { + console.log('\n─── T27: Issue #88 — insertXlsxRenderRow inserts at status=\'pending\' ───\n'); + const { dbAvailable, seedTestSession } = await import('./_xlsx-test-helpers.js'); + if (!dbAvailable()) { + skip('T27 async-202 pending insert', 'no TEST_DATABASE_URL set'); + return; + } + + const sessionKey = '2026-05-15-9600088'; + const seeded = await seedTestSession(sessionKey, { deal_type: 'M&A' }); + + // Call insertXlsxRenderRow directly (the building block the async-202 POST + // handler uses pre-202). Assert the row lands at 'pending', not 'running'. + const { insertXlsxRenderRow, transitionRenderToRunning } = + await import('../../src/utils/xlsxRenderer/persist.js'); + + const inserted = await insertXlsxRenderRow(sessionKey, 'session-models', null); + assert(inserted?.id, `T27: insertXlsxRenderRow returns id (got ${inserted?.id})`); + + const row = await seeded.pool.query( + `SELECT render_status, template_id FROM xlsx_renders WHERE id = $1`, + [inserted.id], + ); + assert( + row.rows[0]?.render_status === 'pending', + `T27 (#88): row inserted at 'pending' (got '${row.rows[0]?.render_status}')`, + ); + assert( + row.rows[0]?.template_id === 'session-models', + `T27: template_id preserved (got '${row.rows[0]?.template_id}')`, + ); + + // transitionRenderToRunning flips 'pending' → 'running' exactly once. + const t1 = await transitionRenderToRunning(inserted.id); + assert(t1.transitioned === true, `T27 (#88): first transition flips pending→running (got transitioned=${t1.transitioned})`); + + const after = await seeded.pool.query( + `SELECT render_status FROM xlsx_renders WHERE id = $1`, + [inserted.id], + ); + assert( + after.rows[0]?.render_status === 'running', + `T27 (#88): row at 'running' post-transition (got '${after.rows[0]?.render_status}')`, + ); + + // Idempotent — calling transitionRenderToRunning again is a no-op (WHERE + // clause filters on render_status='pending'). Returns transitioned=false. + const t2 = await transitionRenderToRunning(inserted.id); + assert( + t2.transitioned === false, + `T27 (#88): second transition is no-op (got transitioned=${t2.transitioned})`, + ); + + await seeded.cleanup(); +} + +async function testT28_StatusEndpointQueryShape() { + console.log('\n─── T28: Issue #88 — GET status endpoint query shape ───\n'); + const { dbAvailable, seedTestSession } = await import('./_xlsx-test-helpers.js'); + if (!dbAvailable()) { + skip('T28 status endpoint query', 'no TEST_DATABASE_URL set'); + return; + } + + const sessionKey = '2026-05-15-9600089'; + const seeded = await seedTestSession(sessionKey, { deal_type: 'M&A' }); + + // Insert 3 rows in different lifecycle states. + async function ins(status, audit) { + const { rows } = await seeded.pool.query( + `INSERT INTO xlsx_renders (session_id, template_id, render_status, audit_results, started_at, completed_at) + VALUES ($1, 't28', $2, $3::jsonb, NOW() - INTERVAL '1 min', $4) + RETURNING id`, + [seeded.sessionUuid, status, audit ? JSON.stringify(audit) : null, + (status === 'completed' || status === 'failed') ? new Date() : null], + ); + return rows[0].id; + } + const idPending = await ins('pending', null); + const idRunning = await ins('running', null); + const idDone = await ins('completed', { + status: 'PASS', + warnings: [], + node_audit: true, + merge_info: { sheet_count: 7 }, + }); + + // The query is the literal SELECT shipped in the GET /api/render-workbook/:renderId/status + // handler. Assert response shape per state. + const Q = `SELECT + xr.id AS render_id, + xr.render_status AS status, + s.session_key AS session_id, + xr.template_id, + xr.started_at, + xr.completed_at, + xr.artifact_id, + xr.audit_status, + xr.sheet_count, + xr.warnings_count, + xr.node_audit_ran, + xr.error_message, + xr.reconciliation_attempts + FROM xlsx_renders xr + JOIN sessions s ON s.id = xr.session_id + WHERE xr.id = $1`; + + const rowP = (await seeded.pool.query(Q, [idPending])).rows[0]; + assert( + rowP.status === 'pending' && rowP.session_id === sessionKey, + `T28: pending row — status='pending', session_id=session_key (got status=${rowP.status}, sid=${rowP.session_id})`, + ); + assert( + rowP.completed_at === null && rowP.audit_status === null && rowP.sheet_count === null, + `T28: pending row — completed_at/audit_status/sheet_count all NULL`, + ); + + const rowR = (await seeded.pool.query(Q, [idRunning])).rows[0]; + assert( + rowR.status === 'running' && rowR.completed_at === null, + `T28: running row — status='running', completed_at NULL`, + ); + + const rowD = (await seeded.pool.query(Q, [idDone])).rows[0]; + assert( + rowD.status === 'completed' && rowD.audit_status === 'PASS', + `T28: completed row — status='completed', audit_status='PASS' (got status=${rowD.status}, audit=${rowD.audit_status})`, + ); + assert( + rowD.sheet_count === 7 && rowD.warnings_count === 0 && rowD.node_audit_ran === true, + `T28: completed row — generated columns populated (sheet_count=7, warnings_count=0, node_audit_ran=true)`, + ); + assert( + rowD.completed_at !== null, + `T28: completed row — completed_at set`, + ); + + // 404-equivalent — bogus UUID returns 0 rows. + const bogus = await seeded.pool.query(Q, ['00000000-0000-0000-0000-000000000000']); + assert( + bogus.rows.length === 0, + `T28: bogus render_id → 0 rows (handler returns 404)`, + ); + + await seeded.cleanup(); +} + +async function testT29_RenderForSessionAdoptsPreCreatedRow() { + console.log('\n─── T29: Issue #88 — renderForSession adopts preCreatedRenderRowId ───\n'); + const { dbAvailable, seedTestSession, buildMinimalWorkbook } = + await import('./_xlsx-test-helpers.js'); + if (!dbAvailable()) { + skip('T29 preCreatedRenderRowId adoption', 'no TEST_DATABASE_URL set'); + return; + } + + const sessionKey = '2026-05-15-9600090'; + const seeded = await seedTestSession(sessionKey, { deal_type: 'M&A' }); + + // Pre-create the row (simulating the async-202 POST handler's pre-202 + // INSERT). Then call renderForSession with preCreatedRenderRowId. Assert + // (a) NO second row is inserted, (b) the pre-created row transitions + // pending → running → completed, (c) row carries the renderer's results. + const { insertXlsxRenderRow } = await import('../../src/utils/xlsxRenderer/persist.js'); + const inserted = await insertXlsxRenderRow(sessionKey, 'session-models', null); + const preRowId = inserted.id; + assert(preRowId, `T29: pre-created row id (got ${preRowId})`); + + const stub = await buildMinimalWorkbook(); + const { renderForSession } = await import('../../src/utils/xlsxRenderer/index.js'); + const result = await renderForSession(sessionKey, { + templateOverride: 'session-models', + preCreatedRenderRowId: preRowId, + runAnalysis: async () => ({ + success: true, + data: { + b64_xlsx: stub.b64, + xlsx_filename: 't29.xlsx', + audit_results: { status: 'PASS', checks: {} }, + workbook_size_bytes: stub.size, + }, + }), + }); + + assert(result?.success === true, `T29: renderForSession success (got success=${result?.success})`); + + // Assertion (a) — exactly one row exists for this session. + const count = await seeded.pool.query( + `SELECT COUNT(*)::int AS n FROM xlsx_renders WHERE session_id = $1`, + [seeded.sessionUuid], + ); + assert( + count.rows[0].n === 1, + `T29 (#88): preCreatedRenderRowId is adopted, NOT duplicated — exactly 1 row (got ${count.rows[0].n})`, + ); + + // Assertion (b) — the pre-created row is now in terminal 'completed' state. + const final = await seeded.pool.query( + `SELECT id, render_status, audit_results->>'status' AS audit_s + FROM xlsx_renders WHERE id = $1`, + [preRowId], + ); + assert( + final.rows[0]?.render_status === 'completed', + `T29 (#88): pre-created row reached 'completed' (got '${final.rows[0]?.render_status}')`, + ); + assert( + final.rows[0]?.audit_s === 'PASS', + `T29 (#88): pre-created row audit_results='PASS' (got '${final.rows[0]?.audit_s}')`, + ); + + await seeded.cleanup(); +} + async function testT24_CitationSeededSpec() { console.log('\n─── T24: seedTestSession citation opts → matcher-ready spec (Issue 2) ───\n'); const { dbAvailable, seedTestSession, SAMPLE_CITATION_SEED, buildMinimalWorkbook } = @@ -1473,6 +1687,9 @@ async function main() { await testT24_CitationSeededSpec(); await testT25_NamedRangeMigration(); await testT26_GeneratedColumns(); + await testT27_AsyncEndpointPendingInsert(); + await testT28_StatusEndpointQueryShape(); + await testT29_RenderForSessionAdoptsPreCreatedRow(); console.log('\n=== Summary ==='); console.log(`${PASS} Passed: ${passed}`); From c870786638c3f568016238b8d73cefe25f4909a7 Mon Sep 17 00:00:00 2001 From: Number531 <120485065+Number531@users.noreply.github.com> Date: Fri, 15 May 2026 13:35:26 -0400 Subject: [PATCH 2/2] fix(xlsx-renderer): #88 audit-pass gap-fills (CHANGELOG + art17 + 42703) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three findings from the independent code+docs audit of PR #133: 1. [GAP] CHANGELOG.md had NO entry for Issue #88 — plan explicitly listed this in the docs section but was missed. Added a full Unreleased entry documenting the BREAKING response-shape change, new GET status endpoint, state-machine cleanup ('pending' becomes meaningful), new metric outcomes, Idempotency-Key deferral rationale, and test-suite baseline. 2. [INCONSISTENCY] docs/compliance/xlsx-art17-scope.md:63 still referenced "migration 015 not applied" for human-interventions-metadata. Post-merge, that's migration 016 on this branch (was 015 pre-merge with main, where 015 is citation-verdicts). My earlier edit only touched line 17; this fixes the parallel reference at line 63. 3. [GAP] GET /api/render-workbook/:renderId/status caught 42P01 (table missing) but NOT 42703 (column missing). If migration 018 (generated columns) hasn't run while migration 017 has, the SELECT errors with 42703 because the handler explicitly projects 4 generated columns (audit_status, sheet_count, warnings_count, node_audit_ran). Without this catch, callers get an unhelpful 500. Now returns 503 xlsx_renders_schema_incomplete with a hint distinguishing the two missing-migration cases. Soft findings deferred (explicitly OK per the plan): - D.4/D.5: HTTP-level test harness for response-envelope shape. Plan chose function-direct tests for consistency with T13 / the rest of the suite. The T27/T28/T29 DB-state assertions cover behavior; HTTP envelope shape is asserted at the response.json() line and reviewed via the api-reference.md contract. Verification: - node --check src/server/claude-sdk-server.js → OK - Unit suite: 185 / 0 / 2 (unchanged from pre-audit-fix baseline) Co-Authored-By: Claude Opus 4.7 (1M context) --- super-legal-mcp-refactored/CHANGELOG.md | 29 +++++++++++++++++++ .../docs/compliance/xlsx-art17-scope.md | 2 +- .../src/server/claude-sdk-server.js | 11 +++++-- 3 files changed, 39 insertions(+), 3 deletions(-) diff --git a/super-legal-mcp-refactored/CHANGELOG.md b/super-legal-mcp-refactored/CHANGELOG.md index a6eeb68e3..e742773f6 100644 --- a/super-legal-mcp-refactored/CHANGELOG.md +++ b/super-legal-mcp-refactored/CHANGELOG.md @@ -4,6 +4,35 @@ All notable changes to the Super Legal MCP Server are documented in this file. ## [Unreleased] +### ⚠️ Changed (BREAKING) — `POST /api/render-workbook/:sessionId` is now async-202 (Issue #88, PR [#133](https://github.com/Number531/Legal-API/pull/133)) + +The manual XLSX render endpoint previously returned `HTTP 200` with the full sync envelope `{ success, xlsxPath, auditResults, artifactId, durationMs }`, holding the request thread up to `OVERALL_TIMEOUT_MS` (1200s) — which caused client-side timeouts (browser ~5min, undici 30s, CI/CD 60–300s), proxy idle-timeouts (Cloud Run / nginx), and concurrency-cap connection hold. + +**It now returns `HTTP 202 Accepted` immediately** with a fire-and-forget envelope: + +```json +{ "render_id": "", "status": "pending", "session_id": "", + "template_id": "...", "created_at": "", + "status_poll_url": "/api/render-workbook//status", + "sse_url": "/api/stream?sessionId=" } +``` + +The render runs in the background; caller polls the new `GET /api/render-workbook/:renderId/status` for terminal state OR subscribes to the existing session SSE channel `/api/stream?sessionId=…` for live progress (events of type `xlsx_render` with status ∈ `{pending, complete, failed}`). + +**State machine cleanup**: `xlsx_renders.render_status` rows now INSERT at `'pending'` (previously inserted at `'running'`). The renderer transitions `'pending' → 'running'` via the new `transitionRenderToRunning(id)` helper at the start of `_renderForSessionInner` — idempotent, `WHERE … AND render_status='pending'`. Reconciliation predicates (`WHERE render_status IN ('pending','running')`) already accept both states; auto-trigger path (`agentStreamHandler.js` SessionEnd) inherits the new state machine for free. + +**New endpoint** `GET /api/render-workbook/:renderId/status` — same `cookieAuthMiddleware` auth as POST. Returns the full `xlsx_renders` row plus all 4 generated columns (`audit_status`, `sheet_count`, `warnings_count`, `node_audit_ran`). + +**New metric outcomes** on `claude_xlsx_render_manual_calls_total`: `'dispatched'` (202 returned), `'dispatch_failed'` (pre-202 failure — DB INSERT failed, session not found, etc.). Existing `'rate_limited'` retained; `'accepted'` retired (no longer emitted by the new code). + +**Refinement deferred**: `Idempotency-Key` header support is NOT in v1 — zero precedent in the codebase, inconsistent with the "consume existing infrastructure" framing of the refactor. Duplicate POSTs within the rate-limit window WILL create duplicate `xlsx_renders` rows. Clients that retry on transport blips should track their own `render_id` and avoid retrying after a successful 202. (Issue #88 deferred-refinement-3.) + +**Behavior-neutral against current production**: `XLSX_RENDERER=false` in `flags.env` is the prod default; the endpoint returns `503 xlsx_renderer_disabled` regardless of POST/GET shape. The PR ships before the flag flips on. + +**Files**: `src/server/claude-sdk-server.js`, `src/utils/xlsxRenderer/persist.js`, `src/utils/xlsxRenderer/index.js`, `src/utils/sdkMetrics.js`, `test/sdk/xlsx-renderer-integration.test.js` (new T27/T28/T29 — 18 assertions), `docs/api-reference.md` (new "Document Generation — Workbook Rendering" section), 4 docs/pending-updates + 4 .claude/skills updated for observability alignment. + +**Test suite**: 185 pass / 0 fail / 2 skip (was 167 pre-#88). + ### Added — Sonnet-deep vs Haiku-deep A/B experiment (test-only, 2026-05-12, PR forthcoming) Empirical investigation of whether Haiku 4.5 could replace Sonnet 4.6 for `CITATION_DEEP_VERIFICATION=true` mode at ~4.4× cost reduction (measured, not 12× as agent-file comment estimated). Both arms ran with `EXA_WEB_TOOLS=true` for production parity; only the verifier subagent's model varied. diff --git a/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md b/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md index 01b438361..de5c5b8e1 100644 --- a/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md +++ b/super-legal-mcp-refactored/docs/compliance/xlsx-art17-scope.md @@ -60,7 +60,7 @@ longer possible. |---|---| | `xlsx_renders` table missing (flag off, stale deploy) | enumeration_status='failed', DELETE proceeds | | Enumeration query times out | enumeration_status='failed', DELETE proceeds | -| `persistIntervention()` fails (e.g., migration 015 not applied) | DELETE proceeds (audit write swallowed) | +| `persistIntervention()` fails (e.g., migration 016 — human-interventions-metadata, was 015 pre-merge with main — not applied) | DELETE proceeds (audit write swallowed) | | DELETE itself fails | Returns `{ success: false }` — the only true failure | This guarantees that an internal infrastructure problem can never block diff --git a/super-legal-mcp-refactored/src/server/claude-sdk-server.js b/super-legal-mcp-refactored/src/server/claude-sdk-server.js index 8009eea8d..4bb172127 100644 --- a/super-legal-mcp-refactored/src/server/claude-sdk-server.js +++ b/super-legal-mcp-refactored/src/server/claude-sdk-server.js @@ -1333,8 +1333,15 @@ app.get('/api/render-workbook/:renderId/status', async (req, res) => { } return res.json(r.rows[0]); } catch (err) { - if (err.code === '42P01') { - return res.status(503).json({ error: 'xlsx_renders_table_missing' }); + // 42P01 — table missing (migration 017 not applied) + // 42703 — column missing (migration 018 not applied; we SELECT the 4 generated + // columns audit_status / sheet_count / warnings_count / node_audit_ran + // which only exist after migration 018 lands) + if (err.code === '42P01' || err.code === '42703') { + return res.status(503).json({ + error: 'xlsx_renders_schema_incomplete', + hint: err.code === '42P01' ? 'migration 017 not applied' : 'migration 018 (generated columns) not applied', + }); } console.error('[xlsxRenderer] status lookup failed:', err.message); return res.status(500).json({ error: 'status_lookup_failed' });