[Feature] Add tool call accuracy tests for large models to nightly CI


## Motivation

Tool call tests today only cover a small model (Llama-3.2-1B in `test/registered/openai_server/function_call/test_openai_function_calling.py`). Large models like DeepSeek V3.2 have no tool call coverage in CI, so bugs like #17593 and #17551 only get caught when users hit them. This issue tracks adding tool call tests to the nightly 8-GPU suite.

## Scenarios

These should be common across all models that support tool calling:

**Basic**
1. Format check — `tool_calls` is a non-empty list, `function.name` / `function.arguments` present, arguments is valid JSON, `finish_reason` is `"tool_calls"`
2. Field placement — tool call goes in `tool_calls`, not `content` (#17593)
3. Streaming — chunks concatenate to valid JSON, `finish_reason` correct

**tool_choice**

4. `"required"` — always returns tool call
5. `"none"` — never returns tool call
6. Specific function — returns the specified one

**Multi-turn**

7. Tool result follow-up — pass tool result back, model replies based on it
8. Thinking + tool call — after tool result, output in `content` not `reasoning_content` (#17551, DeepSeek specific for now — might be model-internal, will write the test first and see)

**Other**

9. Parallel tool calls — multiple tool calls in one request
10. Strict mode — `strict: true` enforces schema

## CI integration

Add to `test/registered/8-gpu-models/test_deepseek_v32.py`, two variants:

Non-MTP:
```
--tp=8 --dp=8 --enable-dp-attention
--tool-call-parser deepseekv32
--reasoning-parser deepseek-v3
```

MTP (speculative decoding):
```
same as above +
--speculative-algorithm=EAGLE
--speculative-num-steps=3
--speculative-eagle-topk=1
--speculative-num-draft-tokens=4
env: SGLANG_ENABLE_SPEC_V2=1
```

Both in `nightly-8-gpu-common`.

## Plan

Start with DeepSeek V3.2, then extend to GLM / Qwen / others reusing the same scenarios.

## Refs

- Small model tests: `test/registered/openai_server/function_call/test_openai_function_calling.py`
- Parser tests: `test/registered/function_call/test_parallel_tool_calls.py`
- #17593, #17551


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] Add tool call accuracy tests for large models to nightly CI #17933

Motivation

Scenarios

CI integration

Plan

Refs

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature] Add tool call accuracy tests for large models to nightly CI #17933

Description

Motivation

Scenarios

CI integration

Plan

Refs

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions