feat: Add Exa AI search adapter #333
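- Add ExaSearchAdapter, which calls the Exa search API over the MCP protocol
- WebSearchTool supports advanced options such as num_results, livecrawl, search_type, and context_max_characters
- Default to the Exa adapter when the base URL is not the official Anthropic one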
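As a rough illustration of how those options might map onto an MCP request, here is a minimal sketch of building a JSON-RPC `tools/call` envelope. The camelCase argument names follow the sequence diagram in the walkthrough and the defaults follow the schema descriptions quoted in the review; the tool name `web_search_exa`, the `ExaSearchOptions` type, and the `buildExaToolCall` helper are assumptions for illustration, not the PR's actual code.

```typescript
// Hypothetical option shape mirroring the WebSearchTool inputs listed in the PR description.
interface ExaSearchOptions {
  num_results?: number;
  livecrawl?: "fallback" | "preferred";
  search_type?: "auto" | "fast" | "deep";
  context_max_characters?: number;
}

// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Assumption: the remote tool is named "web_search_exa"; adjust to the real tool name.
function buildExaToolCall(
  query: string,
  options: ExaSearchOptions = {},
  id = 1,
): McpToolCall {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "web_search_exa",
      arguments: {
        query,
        // Defaults taken from the schema descriptions quoted in the review.
        numResults: options.num_results ?? 8,
        livecrawl: options.livecrawl ?? "fallback",
        type: options.search_type ?? "auto",
        contextMaxCharacters: options.context_max_characters ?? 10000,
      },
    },
  };
}
```

The resulting object would be serialized with `JSON.stringify` and sent as the POST body.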
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Adds an Exa search adapter and extends the WebSearchTool input schema with optional parameters.
Sequence Diagram(s)
sequenceDiagram
participant Client as WebSearchTool<br/>(Client)
participant Adapter as ExaSearchAdapter
participant MCP as Exa MCP API<br/>mcp.exa.ai
participant Parser as Response<br/>Parser
Client->>Adapter: search(query, options)
Note over Adapter: Derive MCP params<br/>(numResults, livecrawl, type, contextMaxCharacters)
Adapter->>MCP: POST JSON-RPC tools/call<br/>(Accept: text/event-stream)
MCP->>Adapter: SSE stream (data: {...}\n...)
Adapter->>Parser: Parse SSE lines or fallback JSON
Parser->>Adapter: Extracted results {title, url, snippet}
Adapter->>Adapter: Filter by allowed/blocked domains
Adapter->>Client: Resolve Promise<SearchResult[]>
Adapter->>Client: onProgress events (query_update, search_results_received)
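The parsing and filtering steps in the diagram can be sketched roughly as follows. This is a minimal sketch, assuming each SSE `data:` line carries one complete JSON document and that domain rules match a host and its subdomains; `parseSseOrJson` and `filterByDomain` are hypothetical names, and the real adapter may behave differently.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Collect JSON payloads from SSE `data:` lines; if none are found,
// fall back to treating the whole body as a single JSON document.
function parseSseOrJson(body: string): unknown[] {
  const payloads: unknown[] = [];
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue;
    const raw = line.slice("data:".length).trim();
    if (raw === "") continue;
    try {
      payloads.push(JSON.parse(raw));
    } catch {
      // Skip malformed data lines rather than failing the whole search.
    }
  }
  if (payloads.length > 0) return payloads;
  try {
    return [JSON.parse(body)];
  } catch {
    return [];
  }
}

// Assumption: a rule like "example.com" also covers "docs.example.com".
function filterByDomain(
  results: SearchResult[],
  allowed?: string[],
  blocked?: string[],
): SearchResult[] {
  return results.filter((result) => {
    let host: string;
    try {
      host = new URL(result.url).hostname;
    } catch {
      return false; // Drop results whose URL cannot be parsed.
    }
    const matches = (domain: string) =>
      host === domain || host.endsWith("." + domain);
    if (blocked?.some(matches)) return false;
    if (allowed && allowed.length > 0) return allowed.some(matches);
    return true;
  });
}
```

Blocked domains are checked before allowed ones here, so a host that appears in both lists is dropped; that ordering is a design choice of the sketch.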
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts (1)
101-105: ⚠️ Potential issue | 🟡 Minor
Stale comment after fallback change.
The comment still describes the fallback as "Bing", but the factory now falls back to Exa.
- // Always enabled — the adapter factory selects the appropriate backend
- // (API server-side search or Bing fallback) based on provider capabilities.
+ // Always enabled — the adapter factory selects the appropriate backend
+ // (API server-side search or Exa fallback) based on provider capabilities.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts` around lines 101 - 105, Update the stale comment inside the isEnabled() method of WebSearchTool to reflect the current fallback backend: replace "Bing fallback" with "Exa fallback" (or otherwise mention Exa) and keep the explanation that the adapter factory selects the appropriate backend (API server-side search or Exa) based on provider capabilities; ensure the comment text remains concise and accurate while leaving the method implementation (return true) unchanged.
🧹 Nitpick comments (2)
packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts (1)
9-10: Avoid mocking `src/utils/errors`; it's a pure module.
`src/utils/errors.js` exports a plain `AbortError` class; per the repo guideline, only modules with side effects or third-party network libraries should be mocked. Importing the real `AbortError` here keeps the test closer to production behavior and removes the fragility around `mock.restore()` flagged above.
As per coding guidelines: "Mock only modules with side effects (...) - do not mock pure functions or pure data modules".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts` around lines 9 - 10, Remove the mock of the pure module 'src/utils/errors' (remove the mock.module('src/utils/errors.js', _abortMock) and mock.module('src/utils/errors', _abortMock) lines), import the real AbortError from 'src/utils/errors' in exaAdapter.test.ts, and update any tests that referenced _abortMock to use the real AbortError class; also delete or adjust any mock.restore() related to _abortMock since the module is no longer mocked. This keeps the test using the real AbortError export and avoids fragile mock/restore handling.
packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts (1)
161-172: Edge case: a `URL:` line without a `Title:` uses the URL as the title.
When an Exa block omits `Title:`, the fallback `titleMatch?.[1]?.trim() ?? urlMatch[1]` uses the raw URL as the title (the `[^\s]+` capture cannot include trailing whitespace, so skipping the trim is safe here). Minor UX nit: this yields identical `title`/`url` fields, which then render awkwardly in markdown output. Consider deriving a nicer default (e.g., hostname + path) or at minimum keeping consistency with the plain-URL fallback. Non-blocking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts` around lines 161 - 172, The code pushes a result with titleMatch?.[1]?.trim() ?? urlMatch[1], which can produce an untrimmed/raw URL as the title; update the fallback so the title is a cleaner form of the URL: use urlMatch[1].trim() or derive a nicer default (e.g., new URL(urlMatch[1].trim()).hostname + new URL(...).pathname) before pushing into results; modify the results.push block replacing the fallback reference to urlMatch[1] with the trimmed/derived title and keep url set to urlMatch[1].trim(), referencing the existing titleMatch, urlMatch, contentMatch and results variables.
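One possible shape for the "hostname + path" default is sketched below. `deriveTitle` is a hypothetical helper, not code from the PR; it assumes the title and URL captures have already been extracted from the result block.

```typescript
// Pick a display title: prefer the parsed title, otherwise derive
// "hostname + path" from the URL, otherwise fall back to the trimmed URL.
function deriveTitle(rawTitle: string | undefined, rawUrl: string): string {
  const title = rawTitle?.trim();
  if (title) return title;

  const url = rawUrl.trim();
  try {
    const { hostname, pathname } = new URL(url);
    // Avoid a dangling "/" for bare origins such as https://example.com
    return pathname === "/" ? hostname : hostname + pathname;
  } catch {
    // Not a parseable absolute URL; keep the plain-URL fallback.
    return url;
  }
}
```

In the adapter, the call site might look like `title: deriveTitle(titleMatch?.[1], urlMatch[1])`, with `url` still set to the trimmed capture.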
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts`:
- Around line 31-34: The abort listener attached to the incoming signal in the
code (the call to signal.addEventListener('abort', () =>
abortController.abort(), { once: true })) can leak when the outer signal
outlives a single search; create a local cleanup function that removes the
listener via signal.removeEventListener('abort', handler) and call it
immediately on all exit paths and in a finally block around the try/catch so the
handler and the created abortController can be GC'd; reference the existing
abortController variable and the signal parameter, store the listener as a named
handler instead of an inline lambda, attach it with
signal.addEventListener(...), and ensure cleanup() is invoked before each early
return/throw and inside finally.
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts`:
- Around line 25-30: The change to adapter selection where adapterKey is
computed from envAdapter and isFirstPartyAnthropicBaseUrl now defaults to 'exa'
instead of 'bing' when WEB_SEARCH_ADAPTER is unset, which is a breaking
default-behavior change; update release notes/changelog to explicitly call out
this default fallback change and instruct downstream users to set
WEB_SEARCH_ADAPTER=bing if they want the previous behavior, and also add a short
inline code comment near the adapterKey computation (referencing envAdapter,
adapterKey, and isFirstPartyAnthropicBaseUrl) documenting the intentional
default switch so future readers understand this behavioral change.
In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts`:
- Around line 26-45: The numeric schema fields allow non-integer, non-finite,
zero/negative values; update the zod validators to enforce integer and bounds:
for num_results in WebSearchTool.ts change its schema to require finite integers
and a sensible range (e.g., .int().finite().min(1).max(100)) so callers can't
pass 0, negatives, floats, or NaN; do the same for context_max_characters (e.g.,
.int().finite().min(1).max(100000)) to prevent invalid/huge values being
forwarded to the Exa MCP payload and keep the existing .optional()/.describe()
metadata.
---
Outside diff comments:
In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts`:
- Around line 101-105: Update the stale comment inside the isEnabled() method of
WebSearchTool to reflect the current fallback backend: replace "Bing fallback"
with "Exa fallback" (or otherwise mention Exa) and keep the explanation that the
adapter factory selects the appropriate backend (API server-side search or Exa)
based on provider capabilities; ensure the comment text remains concise and
accurate while leaving the method implementation (return true) unchanged.
---
Nitpick comments:
In `@packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts`:
- Around line 9-10: Remove the mock of the pure module 'src/utils/errors'
(remove the mock.module('src/utils/errors.js', _abortMock) and
mock.module('src/utils/errors', _abortMock) lines), import the real AbortError
from 'src/utils/errors' in exaAdapter.test.ts, and update any tests that
referenced _abortMock to use the real AbortError class; also delete or adjust
any mock.restore() related to _abortMock since the module is no longer mocked.
This keeps the test using the real AbortError export and avoids fragile
mock/restore handling.
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts`:
- Around line 161-172: The code pushes a result with titleMatch?.[1]?.trim() ??
urlMatch[1], which can produce an untrimmed/raw URL as the title; update the
fallback so the title is a cleaner form of the URL: use urlMatch[1].trim() or
derive a nicer default (e.g., new URL(urlMatch[1].trim()).hostname + new
URL(...).pathname) before pushing into results; modify the results.push block
replacing the fallback reference to urlMatch[1] with the trimmed/derived title
and keep url set to urlMatch[1].trim(), referencing the existing titleMatch,
urlMatch, contentMatch and results variables.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 86cbaa0b-82a3-420e-9a70-a7e9e9f810a1
📒 Files selected for processing (6)
packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts
packages/builtin-tools/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts
packages/builtin-tools/src/tools/WebSearchTool/__tests__/exaAdapter.test.ts
packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts
packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts
packages/builtin-tools/src/tools/WebSearchTool/adapters/types.ts
const abortController = new AbortController()
if (signal) {
  signal.addEventListener('abort', () => abortController.abort(), { once: true })
}
Abort listener is never removed — leaks on the caller's signal.
When signal outlives a single search (e.g., a request-scoped or long-lived AbortController reused across calls), every search() invocation attaches a permanent abort listener and the closures/AbortController they capture can't be GC'd until the outer signal fires. { once: true } only helps if abort actually happens.
Proposed fix
- const abortController = new AbortController()
- if (signal) {
- signal.addEventListener('abort', () => abortController.abort(), { once: true })
- }
+ const abortController = new AbortController()
+ const onExternalAbort = () => abortController.abort()
+ if (signal) {
+ signal.addEventListener('abort', onExternalAbort, { once: true })
+ }
+ const cleanup = () => signal?.removeEventListener('abort', onExternalAbort)
…and call cleanup() in a finally around the try/catch (and before each early return/throw path after this point).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/exaAdapter.ts` around
lines 31 - 34, The abort listener attached to the incoming signal in the code
(the call to signal.addEventListener('abort', () => abortController.abort(), {
once: true })) can leak when the outer signal outlives a single search; create a
local cleanup function that removes the listener via
signal.removeEventListener('abort', handler) and call it immediately on all exit
paths and in a finally block around the try/catch so the handler and the created
abortController can be GC'd; reference the existing abortController variable and
the signal parameter, store the listener as a named handler instead of an inline
lambda, attach it with signal.addEventListener(...), and ensure cleanup() is
invoked before each early return/throw and inside finally.
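The cleanup pattern described above can be sketched as a small wrapper. This is an illustrative sketch only: `withLinkedAbort` is a hypothetical helper rather than the adapter's actual structure, and the already-aborted check is an extra precaution that the review does not ask for.

```typescript
// Run `run` with an inner AbortSignal that follows the caller's signal,
// and always detach the listener when the work settles.
async function withLinkedAbort<T>(
  signal: AbortSignal | undefined,
  run: (inner: AbortSignal) => Promise<T>,
): Promise<T> {
  const abortController = new AbortController();
  const onExternalAbort = () => abortController.abort();

  if (signal?.aborted) {
    // An already-aborted signal never fires 'abort' again, so propagate now.
    abortController.abort();
  } else {
    signal?.addEventListener("abort", onExternalAbort, { once: true });
  }

  try {
    return await run(abortController.signal);
  } finally {
    // Runs on success, throw, and early return alike, so the handler and
    // the controller it captures can be garbage-collected.
    signal?.removeEventListener("abort", onExternalAbort);
  }
}
```

Putting the removal in `finally` covers every exit path in one place, which avoids having to remember a `cleanup()` call before each early return.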
  const adapterKey =
-   envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave'
+   envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave' || envAdapter === 'exa'
      ? envAdapter
      : isFirstPartyAnthropicBaseUrl()
        ? 'api'
-       : 'bing'
+       : 'exa'
Behavior change: default fallback switches from Bing to Exa.
Existing users on a non-first-party Anthropic base URL with WEB_SEARCH_ADAPTER unset will silently switch backend from Bing to Exa. This is intentional per the PR description, but it is a breaking change in default behavior — please call this out in release notes / changelog so downstream users can set WEB_SEARCH_ADAPTER=bing if they want to preserve the previous behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/builtin-tools/src/tools/WebSearchTool/adapters/index.ts` around
lines 25 - 30, The change to adapter selection where adapterKey is computed from
envAdapter and isFirstPartyAnthropicBaseUrl now defaults to 'exa' instead of
'bing' when WEB_SEARCH_ADAPTER is unset, which is a breaking default-behavior
change; update release notes/changelog to explicitly call out this default
fallback change and instruct downstream users to set WEB_SEARCH_ADAPTER=bing if
they want the previous behavior, and also add a short inline code comment near
the adapterKey computation (referencing envAdapter, adapterKey, and
isFirstPartyAnthropicBaseUrl) documenting the intentional default switch so
future readers understand this behavioral change.
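To make the selection rule and the requested inline comment concrete, here is a minimal sketch of the same decision as a pure function. `selectAdapterKey` and its boolean parameter are hypothetical; in the PR the logic reads `WEB_SEARCH_ADAPTER` and calls `isFirstPartyAnthropicBaseUrl()` directly.

```typescript
type AdapterKey = "api" | "bing" | "brave" | "exa";

const KNOWN_ADAPTERS: readonly AdapterKey[] = ["api", "bing", "brave", "exa"];

// `envAdapter` stands in for the WEB_SEARCH_ADAPTER value;
// `isFirstParty` stands in for the isFirstPartyAnthropicBaseUrl() result.
function selectAdapterKey(
  envAdapter: string | undefined,
  isFirstParty: boolean,
): AdapterKey {
  // An explicit, recognized override always wins.
  const explicit = KNOWN_ADAPTERS.find((key) => key === envAdapter);
  if (explicit) return explicit;

  // Intentional default switch: non-first-party base URLs now fall back to
  // Exa (previously Bing). Set WEB_SEARCH_ADAPTER=bing to keep the old behavior.
  return isFirstParty ? "api" : "exa";
}
```

Unrecognized values are treated the same as an unset variable, which mirrors the ternary in the diff.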
num_results: z
  .number()
  .optional()
  .describe('Number of search results to return (default: 8)'),
livecrawl: z
  .enum(['fallback', 'preferred'])
  .optional()
  .describe(
    "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
  ),
search_type: z
  .enum(['auto', 'fast', 'deep'])
  .optional()
  .describe(
    "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
  ),
context_max_characters: z
  .number()
  .optional()
  .describe('Maximum characters for context string optimized for LLMs (default: 10000)'),
Add bounds/integer validation for numeric options.
num_results and context_max_characters accept any number the schema considers valid, including negatives, zero, and non-integers; these are forwarded verbatim into the Exa MCP payload and could produce 4xx responses or surprising behavior. Constrain them at the schema layer.
Proposed fix
num_results: z
- .number()
+ .int()
+ .min(1)
+ .max(100)
.optional()
.describe('Number of search results to return (default: 8)'),
@@
context_max_characters: z
- .number()
+ .int()
+ .min(1)
.optional()
.describe('Maximum characters for context string optimized for LLMs (default: 10000)'),
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
num_results: z
  .int()
  .min(1)
  .max(100)
  .optional()
  .describe('Number of search results to return (default: 8)'),
livecrawl: z
  .enum(['fallback', 'preferred'])
  .optional()
  .describe(
    "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
  ),
search_type: z
  .enum(['auto', 'fast', 'deep'])
  .optional()
  .describe(
    "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
  ),
context_max_characters: z
  .int()
  .min(1)
  .optional()
  .describe('Maximum characters for context string optimized for LLMs (default: 10000)'),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/builtin-tools/src/tools/WebSearchTool/WebSearchTool.ts` around lines
26 - 45, The numeric schema fields allow non-integer, non-finite, zero/negative
values; update the zod validators to enforce integer and bounds: for num_results
in WebSearchTool.ts change its schema to require finite integers and a sensible
range (e.g., .int().finite().min(1).max(100)) so callers can't pass 0,
negatives, floats, or NaN; do the same for context_max_characters (e.g.,
.int().finite().min(1).max(100000)) to prevent invalid/huge values being
forwarded to the Exa MCP payload and keep the existing .optional()/.describe()
metadata.
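For readers who want to see what such constraints amount to at runtime, here is a dependency-free sketch of the same checks. It is a plain TypeScript stand-in, not Zod and not the PR's code; `checkBoundedInt` is a hypothetical helper, and the ranges 1 to 100 and 1 to 100000 are the reviewer's example values.

```typescript
// Validate an optional integer option against inclusive bounds.
// Returns the value unchanged, or undefined when the option is omitted.
function checkBoundedInt(
  name: string,
  value: unknown,
  min: number,
  max: number,
): number | undefined {
  if (value === undefined) return undefined; // Option is optional.
  // Number.isInteger rejects floats, NaN, and +/-Infinity in one check.
  if (typeof value !== "number" || !Number.isInteger(value)) {
    throw new RangeError(`${name} must be an integer`);
  }
  if (value < min || value > max) {
    throw new RangeError(`${name} must be between ${min} and ${max}`);
  }
  return value;
}

// Hypothetical usage with the reviewer's example ranges.
function validateNumericOptions(input: {
  num_results?: unknown;
  context_max_characters?: unknown;
}): { num_results?: number; context_max_characters?: number } {
  return {
    num_results: checkBoundedInt("num_results", input.num_results, 1, 100),
    context_max_characters: checkBoundedInt(
      "context_max_characters",
      input.context_max_characters,
      1,
      100000,
    ),
  };
}
```

Rejecting bad values before the request is built keeps invalid numbers out of the MCP payload, which is the goal of the schema-level fix.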
…xa-search feat: Add Exa AI search adapter
Summary by CodeRabbit
New Features
Tests