[Bug]: Eval errors when Claude is overloaded, needs retry #112

@amd-pworfolk

Description

Quick Check ✨

  • I've taken a look at existing issues and discussions
  • I've checked the hardware requirements in the docs
  • This issue relates to GAIA UI (Open-WebUI)

Which version of GAIA are you using?

No response

Details to help us reproduce the issue

Here are the errors in the logfile:

[2025-11-07 18:50:39,482] | INFO | gaia.eval.eval._analyze_summarization_results | eval.py:1390 | Analysis progress: 22/24 summaries completed (91.7%)
[2025-11-07 18:50:39,493] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:91 | Getting completion with usage tracking from Claude
[2025-11-07 18:50:52,138] | INFO | httpx._send_single_request | _client.py:1025 | HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 500 Internal Server Error"
[2025-11-07 18:50:52,138] | ERROR | gaia.eval.claude.get_completion_with_usage | claude.py:119 | Error getting completion with usage: Error code: 500 - {'type': 'error', 'error': {'type': 'api_error', 'message': 'Overloaded'}, 'request_id': None}
[2025-11-07 18:50:52,138] | ERROR | gaia.eval.eval._analyze_summarization_results | eval.py:1326 | Error analyzing summary 22: Error code: 500 - {'type': 'error', 'error': {'type': 'api_error', 'message': 'Overloaded'}, 'request_id': None}
[2025-11-07 18:50:52,147] | INFO | gaia.eval.eval._analyze_summarization_results | eval.py:1390 | Analysis progress: 23/24 summaries completed (95.8%)
[2025-11-07 18:50:52,147] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:91 | Getting completion with usage tracking from Claude
[2025-11-07 18:51:07,169] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:110 | Usage: 2039 input + 688 output = 2727 total tokens
[2025-11-07 18:51:07,169] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:113 | Cost: $0.0061 input + $0.0103 output = $0.0164 total
[2025-11-07 18:51:07,185] | INFO | gaia.eval.eval._analyze_summarization_results | eval.py:1390 | Analysis progress: 24/24 summaries completed (100.0%)
[2025-11-07 18:51:07,185] | WARNING | gaia.eval.eval._analyze_summarization_results | eval.py:1420 | Excluded 1 error entries from quality scoring
[2025-11-07 18:51:07,185] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:91 | Getting completion with usage tracking from Claude
[2025-11-07 18:51:46,754] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:110 | Usage: 24506 input + 1418 output = 25924 total tokens
[2025-11-07 18:51:46,754] | INFO | gaia.eval.claude.get_completion_with_usage | claude.py:113 | Cost: $0.0735 input + $0.0213 output = $0.0948 total
[2025-11-07 18:51:46,754] | INFO | gaia.eval.eval._analyze_summarization_results | eval.py:1779 | Cleaned up intermediate analysis files from: C:\Users\pw\AppData\Local\Temp\gaia_eval\Claude-Sonnet_analysis.intermediate
[2025-11-07 18:51:46,770] | INFO | gaia.eval.eval.generate_enhanced_report | eval.py:1880 | Evaluation data saved to: output\evaluations\Claude-Sonnet.experiment.eval.json

What actually happened?

The analysis of summary 22 fails on a single 500 "Overloaded" response from Claude, and that entry is excluded from the final quality scoring.

What did you expect to happen?

The Claude call should be retried when it fails with this type of transient error, rather than dropping the summary from the results.
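
A minimal sketch of what such a retry could look like, using exponential backoff with a small jitter. Everything named here is hypothetical and not taken from GAIA: `TransientAPIError`, `RETRYABLE_STATUS`, and `call_with_retry` are illustrative stand-ins, and the set of retryable status codes is an assumption. The real `get_completion_with_usage` in `claude.py` is not shown in this issue, so this only illustrates the general pattern it could be wrapped in.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for an API failure such as HTTP 500 'Overloaded'."""

    def __init__(self, status_code, message):
        super().__init__(f"Error code: {status_code} - {message}")
        self.status_code = status_code


# Assumption: these status codes are treated as transient and worth retrying.
RETRYABLE_STATUS = {429, 500, 502, 503, 529}


def call_with_retry(func, max_attempts=4, base_delay=1.0, max_delay=30.0,
                    sleep=time.sleep):
    """Call func(); on a retryable error, back off exponentially and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientAPIError as exc:
            # Give up on non-retryable errors or once the attempts are used up.
            if exc.status_code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped at
            # max_delay, plus up to 10% jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_completion():
        # Simulates two 'Overloaded' responses followed by a success.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientAPIError(500, "Overloaded")
        return "analysis result"

    # sleep is stubbed out so the demo runs instantly.
    print(call_with_retry(flaky_completion, sleep=lambda seconds: None))
```

With a wrapper like this, a single overloaded response would cost a short wait instead of an excluded summary. The Anthropic Python SDK also accepts a `max_retries` option on the client; whether GAIA's client sets or overrides it is not visible from the log above.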

How did you install GAIA?

Git Clone

Which mode are you running?

None

What's your CPU?

None

What about your GPU setup?

None

AMD GPU Driver Version

No response

NPU Driver Version

No response

Lemonade Version (if applicable)

No response

What's your operating system?

No response

Metadata

Labels

bug: Something isn't working
