70 changes: 70 additions & 0 deletions super-legal-mcp-refactored/CHANGELOG.md
@@ -2,6 +2,76 @@

All notable changes to the Super Legal MCP Server are documented in this file.

## [7.2.0] - 2026-05-08 — Exa A3 Phase A: orchestrator-authored additionalQueries via exa_web_search (flag-gated)

Implements Phase A of the orchestrator-authored Layer 3 architecture for Avenue A3. Plumbs the `additionalQueries` parameter through the catch-all `exa_web_search` MCP tool's direct-fetch path. Validator extracted to a shared module for reuse across BaseWebSearchClient (v7.1.0 path) and exa_web_search (this release).

**Architectural decision (per plan §4.3, revised 2026-05-08):** Layer 3 adopts orchestrator-authored variations. The subagent crafts variations at tool-call time with full user-context awareness; the platform plumbs them through the MCP surface to Exa. Exa documents this approach as recommended: *"provide your own query variations using the `additionalQueries` parameter for even better results"* ([docs.exa.ai/changelog/new-deep-search-type](https://docs.exa.ai/changelog/new-deep-search-type)). This release covers the catch-all `exa_web_search` tool only; per-domain tools, skill template updates, and subagent prompt iteration are deferred to follow-up PRs.

### Added

- **`src/utils/exaQueryValidator.js`** (NEW, ~50 LoC) — shared validator extracted from `BaseWebSearchClient._validateAdditionalQueries`. Returns `[]` for non-array input, drops non-string and empty entries, dedupes after trim, and throws when more than 5 unique entries remain. Reused by both BaseWebSearchClient and exa_web_search.
- **`exa_web_search` tool inputSchema** — adds optional `additionalQueries: array<string>, maxItems: 5` field. Description teaches the orchestrator to supply 2–3 domain-tuned variations targeting distinct legal axes (NOT paraphrases). Per Exa documentation, 2–3 is the recommended count.
- **`exa_web_search` tool implementation** — destructures `args.additionalQueries`, validates it via the shared helper, and forwards it at the top level of the Exa request body when (a) the `EXA_ADDITIONAL_QUERIES` flag is on, (b) the request type is a Deep variant, and (c) the validated array is non-empty. Emits D9 metric `claude_exa_additional_queries_count` with `domain: 'exa_web_search'` label.
- **7 new unit tests** in `test/sdk/exa-web-search-additional-queries.test.js`: flag-OFF silent drop, flag-ON forwarding, omitted-by-caller behavior, empty-string sanitization, cap-violation throw before fetch, dedup-then-cap acceptance, hybrid metadata always present.
- **Live verification Test 7** — verifies the exa_web_search direct-fetch body shape against live Exa API. 7/7 live shapes pass (was 6/6).
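
The validator behavior listed above can be illustrated with a small standalone sketch. It carries a local copy of the rules (mirroring `validateAdditionalQueries` from this PR) so it runs on its own; the input strings and the variable names `sanitized`, `dedupThenCap`, and `capError` are illustrative assumptions, not part of the codebase.

```javascript
// Local copy of the validation rules, mirroring src/utils/exaQueryValidator.js
// from this PR so the sketch is self-contained.
const MAX_ADDITIONAL_QUERIES = 5;

function validateAdditionalQueries(queries) {
  if (!Array.isArray(queries)) return [];
  const cleaned = [...new Set(
    queries
      .filter((q) => typeof q === 'string') // drop non-string entries
      .map((q) => q.trim())                 // trim before comparing
      .filter((q) => q.length > 0)          // drop empty/whitespace-only entries
  )];
  if (cleaned.length > MAX_ADDITIONAL_QUERIES) {
    throw new Error(
      `additionalQueries exceeds cap: ${cleaned.length} unique entries, max ${MAX_ADDITIONAL_QUERIES}.`
    );
  }
  return cleaned;
}

// Illustrative inputs (hypothetical): a padded duplicate, an empty string,
// and a non-string entry collapse to a single query.
const sanitized = validateAdditionalQueries([
  ' HSR Act premerger notification ',
  '',
  42,
  'HSR Act premerger notification',
]);

// Six raw entries but only five unique after trim: accepted (dedup-then-cap).
const dedupThenCap = validateAdditionalQueries(['a', 'b', 'c', 'd', 'e', 'e ']);

// Six unique entries: the validator throws, so no request would be sent.
let capError = null;
try {
  validateAdditionalQueries(['a', 'b', 'c', 'd', 'e', 'f']);
} catch (err) {
  capError = err;
}

console.log(sanitized, dedupThenCap.length, capError !== null);
```

Deduplicating before applying the cap means a caller who repeats a variation is not penalized; only the count of distinct, non-empty queries matters.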

### Changed

- **`BaseWebSearchClient._validateAdditionalQueries`** — now a thin wrapper delegating to the shared `validateAdditionalQueries` from `exaQueryValidator.js`. Behavior identical to v7.1.0; method-form preserved for backward compat with PR #106 unit tests.

### Hard constraints (per Exa spec, enforced)

- Max 5 entries (validator throws on 6th, before fetch).
- Top-level parameter (NOT under `contents`).
- Only forwarded when the flag is on AND the validated array is non-empty AND the request type is a Deep variant.
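
A minimal sketch of how these three constraints combine when the request body is built. `buildExaBody` and its parameter names are hypothetical (the PR inlines this logic in the `exa_web_search` tool), and the sketch assumes `additionalQueries` has already been validated.

```javascript
// Deep variants as listed in this PR's toolImplementations.js hunk.
const DEEP_VARIANTS = ['deep', 'deep-lite', 'deep-reasoning'];

// Hypothetical helper for illustration only; assumes `additionalQueries`
// has already passed through the shared validator.
function buildExaBody({ query, requestType, additionalQueries, flagEnabled }) {
  const shouldForward =
    flagEnabled === true &&
    DEEP_VARIANTS.includes(requestType) &&
    Array.isArray(additionalQueries) &&
    additionalQueries.length > 0;

  return {
    query,
    type: requestType,
    contents: { text: false },
    // Top-level key, never nested under `contents`; omitted entirely otherwise.
    ...(shouldForward ? { additionalQueries } : {}),
  };
}

const variations = ['HSR Act premerger notification 2024'];

const flagOff = buildExaBody({
  query: 'merger enforcement', requestType: 'deep',
  additionalQueries: variations, flagEnabled: false,
});
const flagOn = buildExaBody({
  query: 'merger enforcement', requestType: 'deep',
  additionalQueries: variations, flagEnabled: true,
});
const notDeep = buildExaBody({
  query: 'merger enforcement', requestType: 'auto',
  additionalQueries: variations, flagEnabled: true,
});

console.log('additionalQueries' in flagOff, 'additionalQueries' in flagOn, 'additionalQueries' in notDeep);
```

Using a conditional spread keeps the key absent rather than set to an empty array, which is what lets the flag-off body stay identical to the previous release.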

### Track A finding (audit reversal)

The plan §5.5 P1 checklist's "CourtListener opinion-ID extractor refactor" (Track A) is **NOT REQUIRED**. Inspection of `CourtListenerWebSearchClient.js:230–242` confirms the citation-relationship extraction is text-section-based (regex match on `/Cited by/i` vs `/Cites to/i`), not array-position-based. The relationship type is inferred from the document's section header, which is invariant to result-array order under Exa's server-side dedup. The earlier audit's concern was over-cautious. No refactor needed.
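
To make the order-invariance argument concrete, here is a hedged sketch of section-header-based extraction. It is not the code in `CourtListenerWebSearchClient.js`; the function name, the result shape, and the case names are placeholders, and only the two header patterns come from the finding above.

```javascript
// Hypothetical sketch: infer the relationship type from the section header
// that precedes each line of a single document's text.
function extractCitationRelationships(documentText) {
  const relationships = { cited_by: [], cites_to: [] };
  let currentSection = null;

  for (const rawLine of documentText.split('\n')) {
    const line = rawLine.trim();
    if (line.length === 0) continue;
    // Simplification: any line matching a header pattern is treated as a header.
    if (/Cited by/i.test(line)) { currentSection = 'cited_by'; continue; }
    if (/Cites to/i.test(line)) { currentSection = 'cites_to'; continue; }
    if (currentSection !== null) relationships[currentSection].push(line);
  }
  return relationships;
}

// Placeholder documents; the case names are invented for illustration.
const results = [
  { text: 'Cited by\nSmith v. Jones\nCites to\nDoe v. Roe' },
  { text: 'Cites to\nAcme Corp. v. Beta LLC' },
];

// Extract in the given order, then extract from a reversed array and
// restore the order: per-document output is the same either way.
const inOrder = results.map((r) => extractCitationRelationships(r.text));
const reversedThenRestored = [...results]
  .reverse()
  .map((r) => extractCitationRelationships(r.text))
  .reverse();

console.log(JSON.stringify(inOrder) === JSON.stringify(reversedThenRestored));
```

Because each document is classified from its own text, reordering or deduplicating the result array changes which documents are seen but not how any one of them is labeled.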

### Deferred to follow-up PRs

- Per-domain tool plumbing (search_sec_filings, search_court_opinions, search_federal_register, search_clinical_trials, search_congressional_records, etc.) — 5–10 additional tools, ~1–2 days
- `BaseHybridClient.js:177–180` `websearchArgs` extension for additionalQueries (couples with per-domain plumbing)
- Subagent prompt updates teaching axis-generation pattern (1–2 days + iteration)
- `api-integration` and `subagent-scaffold` skill template updates so future clients inherit A3 support automatically (~0.5 day)
- `EXA_ADDITIONAL_QUERIES_AB_SAMPLE` flag for staging-bake quality validation per plan §4.3 Validation A/B Protocol

### Files modified

| File | Change | LoC |
|---|---|---:|
| `src/utils/exaQueryValidator.js` | NEW shared validator module | +54 |
| `src/api-clients/BaseWebSearchClient.js` | Delegate `_validateAdditionalQueries` to shared helper | +6 / -17 |
| `src/tools/toolDefinitions.js` | Add `additionalQueries` to `exa_web_search` inputSchema | +6 |
| `src/tools/toolImplementations.js` | Import validator+metric; forward in body construction | +30 / -2 (2 imports, 28 forwarding) |
| `test/sdk/exa-web-search-additional-queries.test.js` | NEW — 7 tests | +160 |
| `test/sdk/exa-live-verification.mjs` | Add Test 7 | +22 |

**Total:** 6 files, ~255 net additions.

### Testing

- **Unit:** 137/137 across 5 Exa test suites (40 + 5 + 24 + 61 + 7 new). Zero regressions vs. baseline 130/130.
- **Live API:** 7/7 request shapes accepted.
- **§5.7 zero-degradation gate:** baseline 130/130 unit + 6/6 live → post-work 137/137 unit + 7/7 live. `zero_degradation: true`.

### Risk profile

- Default OFF → zero production impact until flag flipped
- Additive contract — flag-off path byte-identical to v7.1.0
- Blast radius — `exa_web_search` MCP tool only; per-domain tools NOT yet covered
- Rollback — single env-var flip

### References

- Plan: [`docs/pending-updates/Exa-April-2026-updates.md`](docs/pending-updates/Exa-April-2026-updates.md) §4.3 (Layer 3 Architecture — Orchestrator-Authored), §6 OQ-7 (resolved)
- Exa API: [Deep launch announcement](https://docs.exa.ai/changelog/new-deep-search-type), [Evaluation guide (2-3 variations recommended)](https://exa.ai/docs/reference/evaluating-exa-search)
- Predecessor: PR #106 (v7.1.0 BaseWebSearchClient base plumbing)

---

## [7.1.0] - 2026-05-08 — Exa A3 base plumbing: additionalQueries forwarding (flag-gated)

First implementation step of Avenue A3 from the [Exa April 2026 plan](docs/pending-updates/Exa-April-2026-updates.md) (§4.3). Adds an optional `additionalQueries` parameter to `BaseWebSearchClient.executeExaSearch()`, forwarded to Exa's `/search` API as a top-level Deep parameter. Feature-flagged behind `EXA_ADDITIONAL_QUERIES=false` (default), so there is zero production behavior change until per-domain adopters opt in (Layer 3 follow-up PRs).
23 changes: 6 additions & 17 deletions super-legal-mcp-refactored/src/api-clients/BaseWebSearchClient.js
@@ -9,6 +9,7 @@ import { ContentStrategy } from './ContentStrategy.js';
import { extractFromSummary, fallbackToTextParsing, sanitizeData } from './schemas/SchemaValidator.js';
import { featureFlags } from '../config/featureFlags.js';
import { recordExaAdditionalQueriesCount } from '../utils/sdkMetrics.js';
import { validateAdditionalQueries } from '../utils/exaQueryValidator.js';

export class BaseWebSearchClient extends SearchQualityMixin {
constructor(rateLimiter, exaApiKey, contentStrategy = null) {
@@ -118,30 +119,18 @@ export class BaseWebSearchClient extends SearchQualityMixin {

   /**
    * Validate and sanitize an `additionalQueries` array per Exa API spec.
-   * Drops non-string entries, drops empty/whitespace strings, deduplicates after trim.
-   * Throws if more than 5 unique non-empty entries remain (Exa enforces a 5-cap).
    *
-   * Pure helper — no side effects, no this-state. Safe to call without flag context.
+   * Thin wrapper around `validateAdditionalQueries` from `../utils/exaQueryValidator.js`
+   * (extracted 2026-05-08 for reuse across BaseWebSearchClient + the catch-all
+   * `exa_web_search` MCP tool direct-fetch path). Method-form preserved for
+   * backward compatibility with PR #106 unit tests.
    *
    * @param {string[]|null|undefined} queries - Caller-supplied query variations
    * @returns {string[]} Validated array (possibly empty)
    * @throws {Error} If more than 5 unique non-empty queries are provided
    */
   _validateAdditionalQueries(queries) {
-    if (!Array.isArray(queries)) return [];
-    const cleaned = [...new Set(
-      queries
-        .filter(q => typeof q === 'string')
-        .map(q => q.trim())
-        .filter(q => q.length > 0)
-    )];
-    const MAX_ADDITIONAL_QUERIES = 5;
-    if (cleaned.length > MAX_ADDITIONAL_QUERIES) {
-      throw new Error(
-        `additionalQueries exceeds Exa API cap: ${cleaned.length} unique non-empty entries provided, max ${MAX_ADDITIONAL_QUERIES}.`
-      );
-    }
-    return cleaned;
+    return validateAdditionalQueries(queries);
   }

/**
6 changes: 6 additions & 0 deletions super-legal-mcp-refactored/src/tools/toolDefinitions.js
@@ -3422,6 +3422,12 @@ export const exaSearchTools = featureFlags.EXA_WEB_TOOLS ? [
        type: "array",
        items: { type: "string" },
        description: "Restrict results to these domains (e.g., ['sec.gov', 'ftc.gov'])."
      },
      additionalQueries: {
        type: "array",
        items: { type: "string", minLength: 1 },
        maxItems: 5,
        description: "OPTIONAL — Caller-supplied query variations for Exa Deep parallelization. When provided AND the EXA_ADDITIONAL_QUERIES feature flag is enabled, REPLACES Exa's server-side auto-expansion with these variations. Per Exa documentation (https://docs.exa.ai/changelog/new-deep-search-type), 2-3 variations is the recommended count for best Deep search results — supply 2-3 domain-tuned variations targeting DISTINCT axes (jurisdiction, doctrine, regulatory section, time window, etc.), NOT paraphrases of the primary query. If you cannot identify 2 distinct axes, omit this parameter and let Exa auto-expand."
      }
    },
    required: ["query"]
32 changes: 30 additions & 2 deletions super-legal-mcp-refactored/src/tools/toolImplementations.js
@@ -12,6 +12,8 @@ import { runPythonAnalysis, isCodeExecutionBridgeEnabled } from './codeExecution
import { getStore } from '../server/requestContext.js';
import { featureFlags } from '../config/featureFlags.js';
import { createRawSourceService } from '../utils/rawSource/index.js';
import { validateAdditionalQueries } from '../utils/exaQueryValidator.js';
import { recordExaAdditionalQueriesCount } from '../utils/sdkMetrics.js';

// Wave 1 (#3, Correction 1.3): lazy singleton for raw-source archive.
// Instantiated on first tool call with RAW_SOURCE_ARCHIVE=true.
@@ -967,9 +969,28 @@ export function createToolImplementations(clients, conversationBridge = null, or
       return { error: 'Query parameter is required and must be a non-empty string', results: [], _hybrid_metadata: { source: 'exa_search', result_count: 0, confidence: 0, timestamp: new Date().toISOString() } };
     }

+    // A3 (Exa April 2026 plan §4.3): forward caller-supplied `additionalQueries`
+    // to Exa as a top-level Deep parameter when:
+    //   1. EXA_ADDITIONAL_QUERIES flag is enabled (additive contract)
+    //   2. type ∈ Deep variants (always 'deep' here, but preserved for safety)
+    //   3. Validator returns non-empty sanitized array (drops empties, dedupes,
+    //      throws on >5 per Exa cap — caller misuse surfaces immediately)
+    // Per Exa: caller-supplied variations REPLACE Exa's auto-expansion. Recommended
+    // count is 2-3 per https://docs.exa.ai/changelog/new-deep-search-type.
+    const DEEP_VARIANTS = ['deep', 'deep-lite', 'deep-reasoning'];
+    const requestType = 'deep';
+    let validatedAdditionalQueries = [];
+    if (
+      featureFlags.EXA_ADDITIONAL_QUERIES &&
+      args.additionalQueries !== undefined &&
+      DEEP_VARIANTS.includes(requestType)
+    ) {
+      validatedAdditionalQueries = validateAdditionalQueries(args.additionalQueries);
+    }
+
     const body = {
       query: args.query,
-      type: 'deep',
+      type: requestType,
       numResults: args.num_results || 10,
       contents: {
         text: false,
@@ -980,9 +1001,16 @@ export function createToolImplementations(clients, conversationBridge = null, or
       ...(args.category ? { category: args.category } : {}),
       ...(args.start_published_date ? { startPublishedDate: args.start_published_date } : {}),
       ...(args.end_published_date ? { endPublishedDate: args.end_published_date } : {}),
-      ...(args.allowed_domains?.length ? { includeDomains: args.allowed_domains } : {})
+      ...(args.allowed_domains?.length ? { includeDomains: args.allowed_domains } : {}),
+      ...(validatedAdditionalQueries.length > 0 ? { additionalQueries: validatedAdditionalQueries } : {})
     };

+    // D9 metric (§5.5.5): observe variation count when forwarding fires.
+    // Domain label = 'exa_web_search' since this is the catch-all path.
+    if (validatedAdditionalQueries.length > 0) {
+      recordExaAdditionalQueriesCount(validatedAdditionalQueries.length, 'exa_web_search');
+    }
+
     const controller = new AbortController();
     const timeoutId = setTimeout(() => controller.abort(), 60000);

54 changes: 54 additions & 0 deletions super-legal-mcp-refactored/src/utils/exaQueryValidator.js
@@ -0,0 +1,54 @@
/**
 * exaQueryValidator.js — shared validator for Exa `additionalQueries` parameter.
 *
 * Extracted from BaseWebSearchClient._validateAdditionalQueries (PR #106) so
 * both BaseWebSearchClient (per-domain WebSearchClient path) AND the catch-all
 * `exa_web_search` MCP tool (direct-fetch path) can use the same validation
 * logic without duplication.
 *
 * Per Exa documentation (https://docs.exa.ai/changelog/new-deep-search-type):
 *   - Maximum 5 entries per request (Exa-py SDK cap; TS SDK allows 10)
 *   - Recommended count: 2–3 for best Deep search quality
 *   - Caller-supplied variations REPLACE Exa's auto-expansion (Layer 1)
 */

const MAX_ADDITIONAL_QUERIES = 5;

/**
 * Validate and sanitize a caller-supplied `additionalQueries` array.
 *
 * Pure helper — no side effects, no this-state. Safe to call without context.
 *
 * Behavior:
 *   - Returns [] for null, undefined, non-array input (silently — caller misuse
 *     by type, not by value, is treated as "no variations supplied").
 *   - Drops non-string entries.
 *   - Trims whitespace and drops empty/whitespace-only entries.
 *   - Deduplicates by trimmed value (case-sensitive).
 *   - Throws Error if more than 5 unique non-empty entries remain — caller
 *     misuse by quantity is loud, not silent, so the orchestrator can correct.
 *
 * @param {string[]|null|undefined} queries - Caller-supplied query variations
 * @returns {string[]} Validated array (possibly empty)
 * @throws {Error} If more than 5 unique non-empty queries are provided
 */
export function validateAdditionalQueries(queries) {
  if (!Array.isArray(queries)) return [];

  const cleaned = [...new Set(
    queries
      .filter(q => typeof q === 'string')
      .map(q => q.trim())
      .filter(q => q.length > 0)
  )];

  if (cleaned.length > MAX_ADDITIONAL_QUERIES) {
    throw new Error(
      `additionalQueries exceeds Exa API cap: ${cleaned.length} unique non-empty entries provided, max ${MAX_ADDITIONAL_QUERIES}.`
    );
  }

  return cleaned;
}

export { MAX_ADDITIONAL_QUERIES };
22 changes: 22 additions & 0 deletions super-legal-mcp-refactored/test/sdk/exa-live-verification.mjs
@@ -168,6 +168,28 @@ await testExaRequest('additionalQueries (A3)', {
  }
});

// Test 7: A3 Phase A — exa_web_search direct-fetch shape with additionalQueries
//   - Mirrors the request body that the exa_web_search MCP tool builds when
//     caller passes additionalQueries (orchestrator-authored path)
//   - Recommended count: 2-3 per Exa documentation
//   - Body shape includes highlights+summary (the exa_web_search defaults)
console.log('\n7. A3 Phase A — exa_web_search direct-fetch shape with additionalQueries');
await testExaRequest('exa_web_search additionalQueries (A3 Phase A)', {
  query: 'M&A merger antitrust enforcement 2024',
  type: 'deep',
  numResults: 3,
  additionalQueries: [
    'HSR Act premerger notification 2024',
    'DOJ antitrust merger challenge horizontal'
  ],
  contents: {
    text: false,
    highlights: { maxCharacters: 3000, query: 'M&A merger antitrust' },
    summary: { query: 'M&A merger antitrust enforcement' },
    maxAgeHours: 24
  }
});

// Summary
console.log(`\n=== Results: ${passed} passed, ${failed} failed ===`);
