feat(core): retry all provider errors and surface JSON-RPC error details #888

Merged
christso merged 4 commits into main from feat/886-retry-execution-errors
Apr 1, 2026

christso commented Apr 1, 2026

Summary

  • Retry all provider errors: Previously only timeout-like errors were retried; now all provider errors (transient API failures, JSON-RPC rejections, connection drops) are retried up to maxRetries (default 2). This matches how promptfoo and inspect_ai handle retries — both retry broadly, not just timeouts.
  • Fix [object Object] error messages: The ACP SDK rejects promises with plain JSON-RPC error objects ({code, message}) instead of Error instances. buildErrorResult used String(error) on these, producing the unhelpful [object Object]. New extractErrorMessage() helper reads .message and .code from plain objects.
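A minimal sketch of what a helper like extractErrorMessage() could look like, based only on the behavior described in this PR (read .message and .code from plain objects, and fall back to JSON.stringify when there is no string message). The exact signature and branch order are assumptions; the merged implementation may differ.

```typescript
// Hypothetical sketch of extractErrorMessage(); not copied from the PR diff.
function extractErrorMessage(error: unknown): string {
  // Real Error instances already carry a readable message.
  if (error instanceof Error) {
    return error.message;
  }
  // Plain JSON-RPC style rejections look like { code, message }.
  if (typeof error === "object" && error !== null) {
    const { message, code } = error as { message?: unknown; code?: unknown };
    if (typeof message === "string") {
      return code !== undefined ? `${message} (code ${String(code)})` : message;
    }
    // No usable message: serialize the object instead of printing [object Object].
    try {
      return JSON.stringify(error);
    } catch {
      // Circular structures cannot be serialized; fall through to String().
    }
  }
  return String(error);
}
```

With this shape, the rejection from the test plan, {code: -32600, message: "Invalid request"}, is reported as "Invalid request (code -32600)".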

Closes #886

Context

Observed in CI run 23804682863: 4 tests (respond-triages-pr-comments, respond-classification-coverage, retrospect-synthesizes-journal, retrospect-identifies-patterns) failed with ERROR: [object Object] — 0 tokens, 0ms duration. The Copilot CLI returned a JSON-RPC error during ACP initialization (transient API issue), and the error was never retried because isTimeoutLike() returned false for non-Error objects.

Test plan

  • Existing test renamed: "retries provider errors up to maxRetries"
  • New test: "retries non-timeout provider errors up to maxRetries" — verifies non-timeout errors are retried
  • New test: "surfaces JSON-RPC error objects with readable messages" — verifies {code: -32600, message: "Invalid request"} produces "Invalid request (code -32600)" not "[object Object]"
  • All 1303 core tests pass

🤖 Generated with Claude Code

christso and others added 3 commits April 1, 2026 00:01
Previously, the retry loop only retried timeout-like errors; other
provider failures (e.g. transient API errors, JSON-RPC rejections from
ACP SDK) failed immediately with no retry. Now all provider errors are
retried up to maxRetries (default 2).

Also fixes error serialization: the ACP SDK rejects promises with plain
JSON-RPC error objects ({code, message}) instead of Error instances.
buildErrorResult used String(error) on these, producing the unhelpful
"[object Object]" message. The new extractErrorMessage() helper reads
the .message and .code properties from plain objects.

Closes #886

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds 2^attempt * 1000ms delay (1s, 2s, 4s, …) capped at 30s between
retry attempts, matching promptfoo's approach. Prevents hammering
rate-limited APIs with immediate retries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
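
The backoff schedule described in this commit can be sketched as follows. The names backoffDelayMs and withRetries are hypothetical and chosen for illustration; only the formula (2^attempt * 1000ms, capped at 30s), the retry-everything policy, and the default of 2 retries come from the PR text.

```typescript
// Hypothetical helper: delay before the retry that follows a failed `attempt` (0-based).
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(2 ** attempt * baseMs, capMs);
}

// Hypothetical retry wrapper: retries any rejection, not only timeouts.
async function withRetries<T>(
  run: () => Promise<T>,
  maxRetries = 2,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await run();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < maxRetries) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

With maxRetries at its default of 2, a call that keeps failing is attempted three times, waiting 1s and then 2s between attempts; the 30s cap only matters once the attempt index reaches 5.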

cloudflare-workers-and-pages bot commented Apr 1, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 6f5892c
Status: ⚡️ Build in progress...

…xtraction

- sleep() now accepts an optional AbortSignal so cancellation isn't
  blocked during backoff delays
- extractErrorMessage() falls back to JSON.stringify for plain objects
  that lack a string message property, preventing [object Object]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
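
One way a sleep() with an optional AbortSignal might be written is sketched below. Whether the real helper rejects or resolves on abort is not stated in this PR; rejecting with a generic Error is an assumption made here for illustration.

```typescript
// Hypothetical sketch of an abortable sleep(); behavior on abort is an assumption.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    // Do not start a timer if cancellation was already requested.
    if (signal?.aborted) {
      reject(new Error("Aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("Aborted"));
    };
    const timer = setTimeout(() => {
      // Timer finished normally: detach the abort listener before resolving.
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}
```

Passing the run's AbortSignal into the backoff wait means a cancelled run stops immediately instead of sitting out the remaining delay, which can be as long as 30s.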
christso merged commit 3280ce7 into main Apr 1, 2026
3 of 4 checks passed
christso deleted the feat/886-retry-execution-errors branch April 1, 2026 00:40

Development

Successfully merging this pull request may close these issues.

feat: rerun failed evals n times until they pass