feat: Add Exa AI search adapter #333

Merged
claude-code-best merged 2 commits into claude-code-best:main from realorange1994:feature/exa-search on Apr 23, 2026
@realorange1994 realorange1994 commented Apr 23, 2026

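  • Add ExaSearchAdapter, which calls the Exa search API over the MCP protocol
  • WebSearchTool supports advanced options such as num_results, livecrawl, search_type, and context_max_characters
  • Default to the Exa adapter when the base URL is not the official Anthropic one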

Summary by CodeRabbit

  • New Features

    • Added Exa search adapter as a new search backend with configurable options: result count, live-crawl behavior (fallback/preferred), search strategy (auto/fast/deep), and context sizing for LLM inputs.
    • Switched default search adapter for non-Anthropic endpoints from Bing to Exa.
  • Tests

    • Added comprehensive tests for Exa adapter parsing, error/abort handling, domain filtering, progress events, and updated adapter-selection expectations.


coderabbitai Bot commented Apr 23, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 684a27c0-2509-45d7-a556-c941cef88a64

📥 Commits

Reviewing files that changed from the base of the PR and between 93bfdab and b966eef.

📒 Files selected for processing (1)
  • packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts

📝 Walkthrough

Adds an Exa search adapter and extends WebSearchTool input schema with optional num_results, livecrawl, search_type, and context_max_characters; adapter factory now recognizes and can instantiate ExaSearchAdapter, which posts to Exa MCP, parses SSE/JSON responses, filters domains, and reports progress.
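
As a rough illustration of the request described above, the following sketch builds a JSON-RPC `tools/call` body carrying the four options. The defaults (8, 'fallback', 'auto', 10000) are taken from the schema descriptions quoted later in this review. The `toolName` parameter and the `buildExaRequest` and `EXA_HEADERS` names are hypothetical, since this page does not show the adapter source or name the MCP tool.

```typescript
// Option names mirror those listed in the walkthrough.
interface ExaOptions {
  numResults?: number;
  livecrawl?: 'fallback' | 'preferred';
  searchType?: 'auto' | 'fast' | 'deep';
  contextMaxCharacters?: number;
}

interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// `toolName` is a hypothetical parameter: the PR page does not name the MCP tool.
function buildExaRequest(
  query: string,
  options: ExaOptions,
  toolName: string,
): JsonRpcRequest {
  return {
    jsonrpc: '2.0',
    id: 1,
    method: 'tools/call',
    params: {
      name: toolName,
      arguments: {
        query,
        // Defaults as stated in the schema descriptions.
        numResults: options.numResults ?? 8,
        livecrawl: options.livecrawl ?? 'fallback',
        // The walkthrough says searchType is sent as `type`.
        type: options.searchType ?? 'auto',
        contextMaxCharacters: options.contextMaxCharacters ?? 10000,
      },
    },
  };
}

// Assumed headers for the POST; the sequence diagram only mentions
// `Accept: text/event-stream`.
const EXA_HEADERS = {
  'Content-Type': 'application/json',
  Accept: 'application/json, text/event-stream',
};
```

The body would then be serialized with `JSON.stringify` and posted to https://mcp.exa.ai/mcp.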

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core WebSearchTool Updates<br>packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts | Input schema extended with optional num_results, livecrawl, search_type, context_max_characters; these map to adapter search options (numResults, livecrawl, searchType, contextMaxCharacters). |
| Adapter Type Definitions<br>packages/builtin-tools/src/tools/WebSearchTool/adapters/types.ts | SearchOptions interface gains optional numResults, livecrawl ('fallback' … |
| Adapter Factory / Selection<br>packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts | Imports and registers ExaSearchAdapter; adds 'exa' adapter key and branch to instantiate/cache it; fallback for non-first-party Anthropic base URLs now selects Exa when applicable. |
| New Exa Adapter Implementation<br>packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts | Adds ExaSearchAdapter with search(query, options): builds JSON-RPC POST to https://mcp.exa.ai/mcp, supports AbortController, parses SSE or direct JSON, extracts results from Title/URL/Content blocks (or markdown links / plain URLs), caps snippets, filters by allowedDomains/blockedDomains, and reports progress via onProgress. |
| Tests<br>packages/builtin-tools/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts, packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts | Adapter factory test updated to expect Exa for certain non-first-party base URL cases; new comprehensive exaAdapter.test.ts exercises SSE/JSON parsing, multiple content formats, abort behavior, domain filtering, progress events, and verifies outbound MCP request structure including numResults, livecrawl, type, and contextMaxCharacters. |
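
The "parses SSE or direct JSON" step mentioned in the summary could look roughly like the sketch below. It is not the adapter's actual code, which is not shown on this page. `parseSseOrJson` is a hypothetical name, and the sketch assumes each SSE `data:` line carries one complete JSON document.

```typescript
// Extract JSON payloads from a response body that is either an SSE stream
// ("data: {...}" lines) or a single JSON document.
function parseSseOrJson(body: string): unknown[] {
  const payloads: unknown[] = [];
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue;
    const data = line.slice('data:'.length).trim();
    if (data === '') continue;
    try {
      payloads.push(JSON.parse(data));
    } catch {
      // Skip data lines that are not valid JSON.
    }
  }
  if (payloads.length > 0) return payloads;

  // Fallback: the whole body may be a plain JSON response.
  try {
    return [JSON.parse(body)];
  } catch {
    return [];
  }
}
```

Trying the SSE path first and falling back to whole-body JSON lets one function handle both response shapes without inspecting the Content-Type header.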

Sequence Diagram(s)

sequenceDiagram
    participant Client as WebSearchTool<br/>(Client)
    participant Adapter as ExaSearchAdapter
    participant MCP as Exa MCP API<br/>mcp.exa.ai
    participant Parser as Response<br/>Parser

    Client->>Adapter: search(query, options)
    Note over Adapter: Derive MCP params<br/>(numResults, livecrawl, type, contextMaxCharacters)
    Adapter->>MCP: POST JSON-RPC tools/call<br/>(Accept: text/event-stream)
    MCP->>Adapter: SSE stream (data: {...}\n...)
    Adapter->>Parser: Parse SSE lines or fallback JSON
    Parser->>Adapter: Extracted results {title, url, snippet}
    Adapter->>Adapter: Filter by allowed/blocked domains
    Adapter->>Client: Resolve Promise<SearchResult[]>
    Adapter->>Client: onProgress events (query_update, search_results_received)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I sniffed the SSE streams in the night,
Sent queries to Exa with nimble delight.
Titles and URLs tumble like hay,
Filters and snippets hop on their way.
A rabbit-approved search, quick and bright!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title is in Chinese and refers to adding an Exa AI search adapter, which accurately describes the main changes in the PR (new ExaSearchAdapter implementation and WebSearchTool enhancements). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts (1)

101-105: ⚠️ Potential issue | 🟡 Minor

Stale comment after fallback change.

The comment still describes the fallback as "Bing", but the factory now falls back to Exa.

-    // Always enabled — the adapter factory selects the appropriate backend
-    // (API server-side search or Bing fallback) based on provider capabilities.
+    // Always enabled — the adapter factory selects the appropriate backend
+    // (API server-side search or Exa fallback) based on provider capabilities.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts` around lines
101 - 105, Update the stale comment inside the isEnabled() method of
WebSearchTool to reflect the current fallback backend: replace "Bing fallback"
with "Exa fallback" (or otherwise mention Exa) and keep the explanation that the
adapter factory selects the appropriate backend (API server-side search or Exa)
based on provider capabilities; ensure the comment text remains concise and
accurate while leaving the method implementation (return true) unchanged.
🧹 Nitpick comments (2)
packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts (1)

9-10: Avoid mocking src/utils/errors — it's a pure module.

src/utils/errors.js exports a plain AbortError class; per the repo guideline, only modules with side effects or third-party network libraries should be mocked. Importing the real AbortError here keeps the test closer to production behavior and removes the fragility around mock.restore() flagged above.

As per coding guidelines: "Mock only modules with side effects (...) - do not mock pure functions or pure data modules".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts`
around lines 9 - 10, Remove the mock of the pure module 'src/utils/errors'
(remove the mock.module('src/utils/errors.js', _abortMock) and
mock.module('src/utils/errors', _abortMock) lines), import the real AbortError
from 'src/utils/errors' in exaAdapter.test.ts, and update any tests that
referenced _abortMock to use the real AbortError class; also delete or adjust
any mock.restore() related to _abortMock since the module is no longer mocked.
This keeps the test using the real AbortError export and avoids fragile
mock/restore handling.
packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts (1)

161-172: Edge case: URL: line without a Title: uses the URL as the title.

When an Exa block omits Title:, the fallback titleMatch?.[1]?.trim() ?? urlMatch[1] uses the raw URL match as the title. The URL pattern [^\s]+ cannot capture whitespace, so skipping trim() is safe here. Minor UX nit: this yields identical title and url fields, which render awkwardly in markdown output. Consider deriving a nicer default (e.g., hostname + path) or at minimum keeping consistency with the plain-URL fallback. Non-blocking.
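
One possible shape for the "hostname + path" default suggested above is sketched here. `fallbackTitle` is a hypothetical helper, not part of the PR, and dropping the query string and trailing slash is an assumption about what reads well.

```typescript
// Build a readable fallback title from a URL: hostname plus path,
// without the scheme, query string, or a trailing slash.
function fallbackTitle(rawUrl: string): string {
  const trimmed = rawUrl.trim();
  try {
    const { hostname, pathname } = new URL(trimmed);
    const path = pathname === '/' ? '' : pathname.replace(/\/$/, '');
    return hostname + path;
  } catch {
    // Not a parseable URL: fall back to the trimmed input.
    return trimmed;
  }
}
```

In the adapter this would replace the bare `urlMatch[1]` fallback, e.g. `titleMatch?.[1]?.trim() ?? fallbackTitle(urlMatch[1])`.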

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts` around
lines 161 - 172, The code pushes a result with titleMatch?.[1]?.trim() ??
urlMatch[1], which can produce an untrimmed/raw URL as the title; update the
fallback so the title is a cleaner form of the URL: use urlMatch[1].trim() or
derive a nicer default (e.g., new URL(urlMatch[1].trim()).hostname + new
URL(...).pathname) before pushing into results; modify the results.push block
replacing the fallback reference to urlMatch[1] with the trimmed/derived title
and keep url set to urlMatch[1].trim(), referencing the existing titleMatch,
urlMatch, contentMatch and results variables.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts`:
- Around line 31-34: The abort listener attached to the incoming signal in the
code (the call to signal.addEventListener('abort', () =>
abortController.abort(), { once: true })) can leak when the outer signal
outlives a single search; create a local cleanup function that removes the
listener via signal.removeEventListener('abort', handler) and call it
immediately on all exit paths and in a finally block around the try/catch so the
handler and the created abortController can be GC'd; reference the existing
abortController variable and the signal parameter, store the listener as a named
handler instead of an inline lambda, attach it with
signal.addEventListener(...), and ensure cleanup() is invoked before each early
return/throw and inside finally.

In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts`:
- Around line 25-30: The change to adapter selection where adapterKey is
computed from envAdapter and isFirstPartyAnthropicBaseUrl now defaults to 'exa'
instead of 'bing' when WEB_SEARCH_ADAPTER is unset, which is a breaking
default-behavior change; update release notes/changelog to explicitly call out
this default fallback change and instruct downstream users to set
WEB_SEARCH_ADAPTER=bing if they want the previous behavior, and also add a short
inline code comment near the adapterKey computation (referencing envAdapter,
adapterKey, and isFirstPartyAnthropicBaseUrl) documenting the intentional
default switch so future readers understand this behavioral change.

In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts`:
- Around line 26-45: The numeric schema fields allow non-integer, non-finite,
zero/negative values; update the zod validators to enforce integer and bounds:
for num_results in WebSearchTool.ts change its schema to require finite integers
and a sensible range (e.g., .int().finite().min(1).max(100)) so callers can't
pass 0, negatives, floats, or NaN; do the same for context_max_characters (e.g.,
.int().finite().min(1).max(100000)) to prevent invalid/huge values being
forwarded to the Exa MCP payload and keep the existing .optional()/.describe()
metadata.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 86cbaa0b-82a3-420e-9a70-a7e9e9f810a1

📥 Commits

Reviewing files that changed from the base of the PR and between 1173a62 and 93bfdab.

📒 Files selected for processing (6)
  • packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts
  • packages/builtin-tools/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts
  • packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts
  • packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts
  • packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts
  • packages/builtin-tools/src/tools/WebSearchTool/adapters/types.ts

Comment on lines +31 to +34
const abortController = new AbortController()
if (signal) {
signal.addEventListener('abort', () => abortController.abort(), { once: true })
}

⚠️ Potential issue | 🟡 Minor

Abort listener is never removed — leaks on the caller's signal.

When signal outlives a single search (e.g., a request-scoped or long-lived AbortController reused across calls), every search() invocation attaches a permanent abort listener and the closures/AbortController they capture can't be GC'd until the outer signal fires. { once: true } only helps if abort actually happens.

Proposed fix
-    const abortController = new AbortController()
-    if (signal) {
-      signal.addEventListener('abort', () => abortController.abort(), { once: true })
-    }
+    const abortController = new AbortController()
+    const onExternalAbort = () => abortController.abort()
+    if (signal) {
+      signal.addEventListener('abort', onExternalAbort, { once: true })
+    }
+    const cleanup = () => signal?.removeEventListener('abort', onExternalAbort)

…and call cleanup() in a finally around the try/catch (and before each early return/throw path after this point).
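
A minimal sketch of the try/finally pattern described above, pulled out into a standalone helper for illustration. `withLinkedAbort` is a hypothetical name. In the PR the logic lives inline in the adapter's search() method, whose full body is not shown here.

```typescript
// Run `task` with an internal AbortController linked to an optional outer
// signal, always detaching the listener afterwards so that a long-lived
// outer signal does not keep stale closures alive.
async function withLinkedAbort<T>(
  signal: AbortSignal | undefined,
  task: (inner: AbortSignal) => Promise<T>,
): Promise<T> {
  const abortController = new AbortController();
  const onExternalAbort = () => abortController.abort();

  if (signal?.aborted) {
    // The outer signal already fired: propagate immediately.
    abortController.abort();
  } else {
    signal?.addEventListener('abort', onExternalAbort, { once: true });
  }

  try {
    return await task(abortController.signal);
  } finally {
    // Runs on success, on throw, and on early return alike.
    signal?.removeEventListener('abort', onExternalAbort);
  }
}
```

Putting the removal in `finally` avoids having to remember a cleanup() call before every early return or throw.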

Comment on lines +25 to +30
 const adapterKey =
-  envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave'
+  envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave' || envAdapter === 'exa'
     ? envAdapter
     : isFirstPartyAnthropicBaseUrl()
       ? 'api'
-      : 'bing'
+      : 'exa'

⚠️ Potential issue | 🟡 Minor

Behavior change: default fallback switches from Bing to Exa.

Existing users on a non-first-party Anthropic base URL with WEB_SEARCH_ADAPTER unset will silently switch backend from Bing to Exa. This is intentional per the PR description, but it is a breaking change in default behavior — please call this out in release notes / changelog so downstream users can set WEB_SEARCH_ADAPTER=bing if they want to preserve the previous behavior.
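
The selection rule under discussion can be restated as a small pure function, which makes the new default easy to see and test. `selectAdapterKey` and `KNOWN_ADAPTERS` are hypothetical names. The `isFirstParty` flag stands in for the result of isFirstPartyAnthropicBaseUrl(), whose body is not shown on this page.

```typescript
type AdapterKey = 'api' | 'bing' | 'brave' | 'exa';

const KNOWN_ADAPTERS: readonly AdapterKey[] = ['api', 'bing', 'brave', 'exa'];

// An explicit, recognized WEB_SEARCH_ADAPTER value wins. Otherwise first-party
// Anthropic base URLs use 'api' and everything else now falls back to 'exa'
// (previously 'bing').
function selectAdapterKey(
  envAdapter: string | undefined,
  isFirstParty: boolean,
): AdapterKey {
  const explicit = KNOWN_ADAPTERS.find((key) => key === envAdapter);
  if (explicit) return explicit;
  return isFirstParty ? 'api' : 'exa';
}
```

Under this rule, a deployment that sets WEB_SEARCH_ADAPTER=bing keeps the previous backend regardless of the base URL.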

Comment on lines +26 to +45
num_results: z
.number()
.optional()
.describe('Number of search results to return (default: 8)'),
livecrawl: z
.enum(['fallback', 'preferred'])
.optional()
.describe(
"Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
),
search_type: z
.enum(['auto', 'fast', 'deep'])
.optional()
.describe(
"Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
),
context_max_characters: z
.number()
.optional()
.describe('Maximum characters for context string optimized for LLMs (default: 10000)'),

⚠️ Potential issue | 🟡 Minor

Add bounds/integer validation for numeric options.

num_results and context_max_characters accept any number, including negatives, zero, and non-integers. These values are forwarded verbatim into the Exa MCP payload and could produce 4xx responses or surprising behavior. Constrain them at the schema layer.

Proposed fix
     num_results: z
-      .number()
+      .int()
+      .min(1)
+      .max(100)
       .optional()
       .describe('Number of search results to return (default: 8)'),
@@
     context_max_characters: z
-      .number()
+      .int()
+      .min(1)
       .optional()
       .describe('Maximum characters for context string optimized for LLMs (default: 10000)'),
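
To make the intended constraint concrete without depending on a schema library, here is a dependency-free stand-in for the check. It is not the zod schema used in the PR. `checkBoundedInt` is a hypothetical helper, and the upper bounds in the usage note (100 and 100000) are the reviewer's example values rather than limits documented by Exa.

```typescript
// Returns an error message, or null when the value is acceptable.
// `undefined` is allowed because both options are optional.
function checkBoundedInt(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
): string | null {
  if (value === undefined) return null;
  // Number.isInteger rejects floats, NaN, and +/-Infinity in one test.
  if (!Number.isInteger(value)) return `${name} must be an integer`;
  if (value < min || value > max) {
    return `${name} must be between ${min} and ${max}`;
  }
  return null;
}
```

A caller might run `checkBoundedInt('num_results', input.num_results, 1, 100)` and `checkBoundedInt('context_max_characters', input.context_max_characters, 1, 100000)` before building the MCP payload.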
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    num_results: z
-      .number()
-      .optional()
-      .describe('Number of search results to return (default: 8)'),
-    livecrawl: z
-      .enum(['fallback', 'preferred'])
-      .optional()
-      .describe(
-        "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
-      ),
-    search_type: z
-      .enum(['auto', 'fast', 'deep'])
-      .optional()
-      .describe(
-        "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
-      ),
-    context_max_characters: z
-      .number()
-      .optional()
-      .describe('Maximum characters for context string optimized for LLMs (default: 10000)'),
+    num_results: z
+      .int()
+      .min(1)
+      .max(100)
+      .optional()
+      .describe('Number of search results to return (default: 8)'),
+    livecrawl: z
+      .enum(['fallback', 'preferred'])
+      .optional()
+      .describe(
+        "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
+      ),
+    search_type: z
+      .enum(['auto', 'fast', 'deep'])
+      .optional()
+      .describe(
+        "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
+      ),
+    context_max_characters: z
+      .int()
+      .min(1)
+      .optional()
+      .describe('Maximum characters for context string optimized for LLMs (default: 10000)'),

@claude-code-best claude-code-best merged commit 7811888 into claude-code-best:main Apr 23, 2026
2 checks passed
dfsfdfse pushed a commit to dfsfdfse/claude-code that referenced this pull request Apr 25, 2026
…xa-search

feat: Add Exa AI search adapter