Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added — Exa A3 base plumbing: additionalQueries forwarding (flag-gated, 2026-05-08)

First implementation step of Avenue A3 from the [Exa April 2026 plan](super-legal-mcp-refactored/docs/pending-updates/Exa-April-2026-updates.md) (§4.3). Adds optional `additionalQueries: string[]` parameter to `BaseWebSearchClient.executeExaSearch()`, forwarded to Exa's `/search` API as a top-level Deep parameter when `EXA_ADDITIONAL_QUERIES` flag is on. Default `false` — zero production behavior change until per-domain adopters opt-in (Layer 3 follow-up PRs).

Exa's `/search` with `type: 'deep'` already auto-generates 2–5 query variations server-side on every call. A3 enables the **override** — substituting Exa's generic auto-expansion with caller-supplied, domain-tuned variations. The cost/latency win comes from collapsing client-side fan-out (3 sequential Deep calls → 1 Deep call); the quality/determinism win comes from encoding legal-domain knowledge Exa's generic LLM wouldn't produce.

- **Flag** `EXA_ADDITIONAL_QUERIES` (default `false`) in `featureFlags.js`; rollback is single env-var flip.
- **Validator** `_validateAdditionalQueries(queries)` — drops non-strings/empties, dedupes after trim, throws on >5 unique entries (Exa hard cap).
- **D9 metric** `claude_exa_additional_queries_count` Prometheus Histogram (buckets `[1..5]`, label `domain`) for Layer 3 adopter rollout tracking.
- **Live verification** Test 6 added to `exa-live-verification.mjs` — confirms wire format accepted by Exa (200 OK, parseable response). 6/6 live shapes pass (was 5/5).
- **12 new unit tests** + 5/5 baseline + 6/6 live API. Zero regressions vs. baseline.

Layer 3 (per-domain adopter refactors for SEC, CourtListener, FederalRegister + 4 candidate adopters from audit) deferred to follow-up PRs per plan §5.5 P1 checklist. CourtListener requires opinion-ID extractor refactor first (hard prerequisite, separate PR).

Includes hygiene fix: PR [#102](https://github.com/Number531/Legal-API/pull/102) (commit `e25a14b6`, 2026-05-07) corrected stale W5-004 test assertion in `citation-websearch-verifier.test.js` — pointed at relocated `agentStreamHandler.js:288` (Phase 2B refactor) instead of `claude-sdk-server.js`. Caught by §5.7 baseline-capture protocol during this branch's worktree provisioning. 61/61 verifier tests now pass (was 60/61).

Project-level changelog entry: [`super-legal-mcp-refactored/CHANGELOG.md`](super-legal-mcp-refactored/CHANGELOG.md) v7.1.0.

### Added — Operator skills v7.0.2 expansion (`.claude/skills/`, 2026-05-08)

Ten work items from the v7.0.2 plan extend the operator-skills suite with three Tier-1 automations, two Tier-2 skills, four extensions to existing skills, plus one shared utility. Doc/tooling-only — no application code, no schema, no flag changes.
Expand Down
68 changes: 68 additions & 0 deletions super-legal-mcp-refactored/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,74 @@

All notable changes to the Super Legal MCP Server are documented in this file.

## [7.1.0] - 2026-05-08 — Exa A3 base plumbing: additionalQueries forwarding (flag-gated)

First implementation step of Avenue A3 from the [Exa April 2026 plan](docs/pending-updates/Exa-April-2026-updates.md) (§4.3). Adds optional `additionalQueries` parameter to `BaseWebSearchClient.executeExaSearch()`, forwarded to Exa's `/search` API as a top-level Deep parameter. Feature-flagged behind `EXA_ADDITIONAL_QUERIES=false` (default) — zero production behavior change until per-domain adopters opt-in (Layer 3 follow-up PRs).

### Why

Exa's `/search` with `type: 'deep'` already auto-generates 2–5 query variations server-side on every call (per Exa's [Deep launch announcement](https://exa.ai/docs/changelog/new-deep-search-type), 2025-11-20). A3 enables the **override** — substituting Exa's generic auto-expansion with caller-supplied, domain-tuned variations. The cost/latency win comes from collapsing client-side fan-out (3 sequential Deep calls → 1 Deep call with bundled variations); the quality/determinism win comes from encoding legal-domain knowledge that Exa's generic LLM wouldn't produce. Three layers shape the value: (1) Exa auto-expansion (already happening), (2) A3 plumbing (this release — inert by default), (3) per-domain adopters (deferred to follow-up PRs per §5.5 P1 checklist).

### Added

- **`EXA_ADDITIONAL_QUERIES`** feature flag in `src/config/featureFlags.js`, default `false`. Rollback: single env-var flip.
- **`additionalQueries: string[]` option** on `BaseWebSearchClient.executeExaSearch()`. Validated and forwarded to Exa as a top-level request body field when (a) flag is on, (b) `requestBody.type` ∈ `{deep, deep-lite, deep-reasoning}`, (c) validated array is non-empty.
- **`_validateAdditionalQueries(queries)` helper** — pure (no this-state). Returns `[]` for non-array input. Drops non-string entries, drops empty/whitespace-only strings, deduplicates after trim. Throws on >5 unique non-empty entries with explicit `"exceeds Exa API cap"` message.
- **D9 metric** `claude_exa_additional_queries_count` (Prometheus Histogram, buckets `[1..5]`, label `domain`) registered in `src/utils/sdkMetrics.js`. Emitted from `BaseWebSearchClient.executeExaSearch` whenever forwarding fires.
- **Test 6 in `test/sdk/exa-live-verification.mjs`** — extends live-verification harness with `additionalQueries:[3 variants]`. Confirms wire format accepted by Exa (200 OK, parseable response). 6/6 live shapes now pass (was 5/5).
- **12 new unit tests** in `test/sdk/exa-content-strategy.test.js` — 7 validator unit tests + 5 forwarding integration tests.

### Hard constraints (per Exa spec, enforced in code)

- Maximum 5 entries per request — validator throws on 6th.
- Top-level parameter (NOT under `contents`).
- Only valid on Deep variants — defensive gate via `DEEP_VARIANTS.includes(requestBody.type)`.

### Hygiene

- Fixed stale W5-004 test in `test/sdk/citation-websearch-verifier.test.js` (PR [#102](https://github.com/Number531/Legal-API/pull/102), commit `e25a14b6`, 2026-05-07). Test asserted on `claude-sdk-server.js`, but Phase 2B refactor (commit `9d319f52`) relocated the injection to `agentStreamHandler.js:288`. Caught by §5.7 baseline-capture protocol. 61/61 verifier tests now pass (was 60/61 on `main`).

### Files modified

| File | Change | LoC |
|---|---|---:|
| `src/config/featureFlags.js` | Add `EXA_ADDITIONAL_QUERIES` flag | +9 |
| `src/api-clients/BaseWebSearchClient.js` | Validator, destructure, conditional forward, metric emission | +52 |
| `src/utils/sdkMetrics.js` | Register Histogram + export emitter | +30 |
| `test/sdk/exa-content-strategy.test.js` | 12 new tests | +174 |
| `test/sdk/exa-live-verification.mjs` | Test 6 — live API additionalQueries shape | +23 |

**Total:** 5 files, +288 lines. Zero deletions on production code paths.

### Testing

- **Unit:** 130/130 across 4 Exa test suites. +12 new, zero regressions.
- **Live API:** 6/6 request shapes accepted.
- **End-to-end with flag forced ON:** verified via temporary script — full pipeline works. 2 of 3 result URLs unique vs. control single-query, demonstrating server-side parallelization broadens results.
- **§5.7 zero-degradation gate:** baseline 118/118 unit + 5/5 live → post-work 130/130 unit + 6/6 live. `zero_degradation: true`.

### Deferred to follow-up PRs (Layer 3 adoption)

- **CourtListener opinion-ID extractor refactor** — hard prerequisite, separate PR.
- **Per-domain adopter refactors** for `SECWebSearchClient`, `CourtListenerWebSearchClient`, `FederalRegisterWebSearchClient` — one PR per adopter, low-risk-first.
- **Four additional natural adopters** identified by audit (§10.1.5 of plan): ClinicalTrials, CongressGov, EPA, RegulationsGov.
- **Per-environment ramp:** `EXA_ADDITIONAL_QUERIES=true` flag flip per-domain in staging → 7-day bake → production.

### Risk profile

- **Default OFF** → zero production impact until adopter PRs land.
- **Additive contract** → flag-off path byte-identical to today.
- **Blast radius** → `BaseWebSearchClient` only. All 32 subclasses inherit transparently; none modified.
- **Rollback** → single env-var flip.

### References

- Plan: [`docs/pending-updates/Exa-April-2026-updates.md`](docs/pending-updates/Exa-April-2026-updates.md) §4.3, §5.5 P1, §5.5.5 (D9), §5.5.6, §5.7
- Exa API: [Deep search reference](https://docs.exa.ai/reference/search), [Deep launch announcement](https://exa.ai/docs/changelog/new-deep-search-type)
- Hygiene PR: [#102](https://github.com/Number531/Legal-API/pull/102)

---

## [7.0.3] - 2026-05-08 — Operator skills v7.0.2 plan: 10 work items

Ten work items extending the operator-skills suite. Doc/tooling-only — no application code, no schema, no flag changes. Closes the v7.0.2 plan in [`/Users/ej/.claude/plans/floating-cooking-flute.md`](../.claude/plans/floating-cooking-flute.md).
Expand Down
52 changes: 52 additions & 0 deletions super-legal-mcp-refactored/src/api-clients/BaseWebSearchClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@
import { SearchQualityMixin } from './SearchQualityMixin.js';
import { ContentStrategy } from './ContentStrategy.js';
import { extractFromSummary, fallbackToTextParsing, sanitizeData } from './schemas/SchemaValidator.js';
import { featureFlags } from '../config/featureFlags.js';
import { recordExaAdditionalQueriesCount } from '../utils/sdkMetrics.js';

export class BaseWebSearchClient extends SearchQualityMixin {
constructor(rateLimiter, exaApiKey, contentStrategy = null) {
Expand Down Expand Up @@ -114,6 +116,34 @@ export class BaseWebSearchClient extends SearchQualityMixin {
return new Promise(resolve => setTimeout(resolve, ms));
}

/**
* Validate and sanitize an `additionalQueries` array per Exa API spec.
* Drops non-string entries, drops empty/whitespace strings, deduplicates after trim.
* Throws if more than 5 unique non-empty entries remain (Exa enforces a 5-cap).
*
* Pure helper — no side effects, no this-state. Safe to call without flag context.
*
* @param {string[]|null|undefined} queries - Caller-supplied query variations
* @returns {string[]} Validated array (possibly empty)
* @throws {Error} If more than 5 unique non-empty queries are provided
*/
_validateAdditionalQueries(queries) {
if (!Array.isArray(queries)) return [];
const cleaned = [...new Set(
queries
.filter(q => typeof q === 'string')
.map(q => q.trim())
.filter(q => q.length > 0)
)];
const MAX_ADDITIONAL_QUERIES = 5;
if (cleaned.length > MAX_ADDITIONAL_QUERIES) {
throw new Error(
`additionalQueries exceeds Exa API cap: ${cleaned.length} unique non-empty entries provided, max ${MAX_ADDITIONAL_QUERIES}.`
);
}
return cleaned;
}

/**
* Standardized Exa search execution with ContentStrategy
* @param {string} query - Search query
Expand Down Expand Up @@ -146,6 +176,7 @@ export class BaseWebSearchClient extends SearchQualityMixin {
endPublishedDate,
category,
effort = 'base',
additionalQueries, // A3 — server-side Deep query parallelization (Exa April 2026 plan)
_retryCount = 0 // Internal retry tracking
} = options;

Expand Down Expand Up @@ -193,6 +224,27 @@ export class BaseWebSearchClient extends SearchQualityMixin {
requestBody.effort = effort;
}

// A3 (Exa April 2026 plan): forward `additionalQueries` only when:
// 1. EXA_ADDITIONAL_QUERIES flag is on (additive contract — flag-off behavior is identical to today)
// 2. requestBody.type ∈ Deep variants (Exa silently ignores on non-deep, but we gate defensively)
// 3. Validator returns non-empty array (drops empties, dedupes, throws on >5 per Exa cap)
// requestBody.type is currently always 'deep' (line ~176), but the type check is preserved
// so future callers passing 'deep-lite' or 'deep-reasoning' (or non-deep) are handled correctly.
const DEEP_VARIANTS = ['deep', 'deep-lite', 'deep-reasoning'];
if (
featureFlags.EXA_ADDITIONAL_QUERIES &&
additionalQueries !== undefined &&
DEEP_VARIANTS.includes(requestBody.type)
) {
const validated = this._validateAdditionalQueries(additionalQueries);
if (validated.length > 0) {
requestBody.additionalQueries = validated;
// D9 (Exa April 2026 plan §5.5.5): observe variation count for adoption tracking.
// Domain label defaults to 'unknown' when caller didn't pass it; non-blocking.
recordExaAdditionalQueriesCount(validated.length, domain || 'unknown');
}
}

// Exa API timeout: 60s per-request (doesn't block 10-min iterative research)
const EXA_TIMEOUT_MS = 60000;
const controller = new AbortController();
Expand Down
9 changes: 9 additions & 0 deletions super-legal-mcp-refactored/src/config/featureFlags.js
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,15 @@ export const featureFlags = {
// When false (default): WebFetch + WebSearch preserved, zero behavior change
// Rollback: EXA_WEB_TOOLS=false
EXA_WEB_TOOLS: envBool(process.env.EXA_WEB_TOOLS, false),
// Exa April 2026 plan A3 — server-side Deep query parallelization via additionalQueries.
// When true: BaseWebSearchClient.executeExaSearch() forwards an `additionalQueries:[...]`
// option as a top-level Exa API parameter, valid only on Deep variants
// (type ∈ {deep, deep-lite, deep-reasoning}). Exa parallelizes the variations
// server-side, deduplicates, and ranks. Hard cap from Exa spec: max 5 entries.
// When false (default): caller-supplied additionalQueries are silently dropped
// at request build — zero behavior change vs. today.
// Rollback: EXA_ADDITIONAL_QUERIES=false.
EXA_ADDITIONAL_QUERIES: envBool(process.env.EXA_ADDITIONAL_QUERIES, false),
// Citation chat router — session-scoped RAG Q&A over embedded report chunks
// Requires EMBEDDING_PERSISTENCE=true (uses embedQuery + searchSimilar)
// Uses Messages API streaming (not Agent SDK) — no hooks, no subagents
Expand Down
30 changes: 30 additions & 0 deletions super-legal-mcp-refactored/src/utils/sdkMetrics.js
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,20 @@ const gateCheckResults = new client.Counter({
labelNames: ['agent_type', 'status']
});

// Exa April 2026 plan A3 — additionalQueries forwarding observability.
// Histogram of caller-supplied `additionalQueries` variation counts forwarded to
// Exa Deep search. Only observed when EXA_ADDITIONAL_QUERIES flag is on AND a
// caller passes a non-empty validated array AND requestBody.type ∈ Deep variants.
// Buckets match Exa's hard cap (max 5 entries per request).
// Used to track Layer 3 adopter rollout: zero observations means no adopter is
// passing the param yet (base plumbing inert by design — see §4.3 of plan).
const exaAdditionalQueriesCount = new client.Histogram({
name: 'claude_exa_additional_queries_count',
help: 'Variation count per Exa /search call when A3 additionalQueries forwarding fires (Exa April 2026 plan §4.3)',
labelNames: ['domain'],
buckets: [1, 2, 3, 4, 5]
});

// Wave 4.5: KG build lifecycle metrics
const kgBuildTotal = new client.Counter({
name: 'claude_kg_build_total',
Expand Down Expand Up @@ -468,6 +482,22 @@ export function recordCircuitBreakerTrip(domain = 'unknown') {
circuitBreakerTrips.inc({ domain });
}

/**
* Record the count of caller-supplied `additionalQueries` variations forwarded
* to Exa Deep search (Exa April 2026 plan A3, D9 compliance dimension).
*
* Called from BaseWebSearchClient.executeExaSearch() inside the A3 forwarding
* block — only when EXA_ADDITIONAL_QUERIES flag is on AND the validated array
* is non-empty AND requestBody.type ∈ Deep variants.
*
* @param {number} count - Number of unique non-empty variations (1–5 per Exa cap)
* @param {string} domain - Caller's domain label (e.g. 'sec', 'courtlistener'),
* 'unknown' if not provided
*/
export function recordExaAdditionalQueriesCount(count, domain = 'unknown') {
exaAdditionalQueriesCount.observe({ domain }, count);
}

export function recordError(code, path = 'unknown') {
errorCounter.inc({ code, path });
}
Expand Down
Loading