
fix: add 429 rate limit retry with exponential backoff #832

Open
xodn348 wants to merge 1 commit into anthropics:main from xodn348:fix/812-429-rate-limit-retry


xodn348 commented Apr 16, 2026

Fixes #812.

This makes transient 429 rate limit failures recoverable for agent sessions:

  • adds rate_limit_max_retries to ClaudeAgentOptions with a default of 3 and 0 as an opt-out
  • retries only subprocess failures that include 429 / rate_limit_error
  • uses retry delay from the error payload when available, otherwise exponential backoff with jitter
  • captures subprocess stderr so ProcessError carries the API error payload needed for retry detection
  • raises RateLimitError after retry exhaustion with the original ProcessError attached
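
A minimal sketch of how the detection and delay rules in the list above could fit together. This is not the PR's code: the helper names, the `retry_after` field name, and the base and cap values are assumptions made for illustration, since the description does not say how the payload encodes the retry delay.

```python
import random
import re
from typing import Optional


def is_rate_limit_failure(stderr: str) -> bool:
    """Return True when captured stderr mentions a 429 or rate_limit_error."""
    return "rate_limit_error" in stderr or re.search(r"\b429\b", stderr) is not None


def retry_delay_from_payload(stderr: str) -> Optional[float]:
    """Extract a server-suggested delay in seconds, if one is present.

    Assumption: the payload carries a numeric "retry_after" field.
    The real field name is not stated in the PR description.
    """
    match = re.search(r'"retry_after"\s*:\s*(\d+(?:\.\d+)?)', stderr)
    return float(match.group(1)) if match else None


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)


def next_delay(stderr: str, attempt: int) -> float:
    """Prefer the delay from the error payload, otherwise fall back to backoff."""
    hinted = retry_delay_from_payload(stderr)
    return hinted if hinted is not None else backoff_delay(attempt)
```

Jitter spreads retries from concurrent sessions so they do not all hit the API again at the same instant. The cap keeps a late attempt from sleeping for an unbounded time.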

I rebased this on current upstream/main and kept the retry path scoped to rate-limit process errors.
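
The scoping described here, together with the exhaustion behavior, might look roughly like the loop below. It is a sketch under stated assumptions, not the SDK's implementation: `ProcessError` and `RateLimitError` are local stand-ins for the SDK classes, `run_with_rate_limit_retry`, `delay_for`, and `sleep` are hypothetical names, and letting the original `ProcessError` propagate unchanged when `max_retries=0` is a guess at what "opt-out" means.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProcessError(Exception):
    """Stand-in for the SDK's ProcessError; only `stderr` is modeled."""

    def __init__(self, message: str, stderr: str = "") -> None:
        super().__init__(message)
        self.stderr = stderr


class RateLimitError(Exception):
    """Stand-in for the error raised once retries are exhausted."""


def run_with_rate_limit_retry(
    run: Callable[[], T],
    max_retries: int = 3,
    delay_for: Callable[[int], float] = lambda attempt: float(2 ** attempt),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `run`, retrying only failures whose stderr signals a rate limit."""
    attempt = 0
    while True:
        try:
            return run()
        except ProcessError as exc:
            rate_limited = "429" in exc.stderr or "rate_limit_error" in exc.stderr
            if not rate_limited or max_retries == 0:
                # Other failures are never retried; 0 disables retry (assumed behavior).
                raise
            if attempt >= max_retries:
                # `from exc` keeps the original ProcessError reachable as __cause__.
                raise RateLimitError(
                    f"rate limited after {max_retries} retries"
                ) from exc
            sleep(delay_for(attempt))
            attempt += 1
```

Passing `sleep` and `delay_for` as parameters lets tests run without real waiting, which matters here because a live 429 from the API was not exercised.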

Testing:

uv run --extra dev pytest -q tests/test_rate_limit_retry.py
uv run --extra dev ruff check src/claude_agent_sdk tests/test_rate_limit_retry.py
uv run --extra dev mypy src/claude_agent_sdk tests/test_rate_limit_retry.py
python3 -m compileall -q src/claude_agent_sdk tests/test_rate_limit_retry.py

Results: 14 targeted tests passed; ruff passed; mypy passed; compileall passed.

Agent sessions can fail after a recoverable 429 from the Claude Code subprocess. This adds bounded retry with exponential backoff, preserves the original ProcessError when retries are exhausted, and captures stderr so retry detection can see the API error payload.

Constraint: Keep retry disabled by setting rate_limit_max_retries=0 and avoid changing caller git config.
Rejected: Treat every subprocess failure as retryable | would mask non-rate-limit process errors.
Confidence: medium
Scope-risk: moderate
Tested: uv run --extra dev pytest -q tests/test_rate_limit_retry.py; uv run --extra dev ruff check src/claude_agent_sdk tests/test_rate_limit_retry.py; uv run --extra dev mypy src/claude_agent_sdk tests/test_rate_limit_retry.py; python3 -m compileall -q src/claude_agent_sdk tests/test_rate_limit_retry.py
Not-tested: Live Claude Code 429 from the API.

xodn348 force-pushed the fix/812-429-rate-limit-retry branch from 2bee4ae to 07aa047 on April 27, 2026 21:26
Development

Successfully merging this pull request may close these issues.

Agent SDK should handle 429 rate limits gracefully instead of crashing
