[Bug] Evaluation metrics logged to training in wandb

1. Did you update? `pip install --upgrade unsloth unsloth_zoo`
Yes

2. `Colab` or `Kaggle` or local / cloud
Colab
3. Number GPUs used, use `nvidia-smi`
1 A100 80GB
4. Which notebook? Please link!
[Code](https://colab.research.google.com/drive/143xYvVhlIaHlNk4tyFKz2_02Iwm_Yl0w?usp=sharing)
5. Which Unsloth version, TRL version, transformers version, PyTorch version?
```
unsloth==2025.12.6
unsloth_zoo==2025.12.5
trl==0.25.0, 0.24.0, 0.23.1 (all tested)
torch==2.8.0
torchao==0.13.0
vllm==0.11.0
flashinfer-python==0.3.1.post1
```
6. Which trainer? `SFTTrainer`, `GRPOTrainer` etc
`GRPOTrainer`

[This basic GRPO script reproduces the issue](https://colab.research.google.com/drive/143xYvVhlIaHlNk4tyFKz2_02Iwm_Yl0w?usp=sharing)

Hi. When using GRPO with wandb, the eval rewards are logged in the train section, mixed in with the train logs, instead of in the eval section. This makes it extremely hard to track evaluation progress.

When I run the script, the eval log is:
```
{'eval_loss': -0.0030808576848357916, .. 'rewards/dummy_reward/mean': 0.5854228019714356,  ...}
```

But the eval reward is logged in the same chart as the train reward:

<img width="1242" height="401" alt="Image" src="https://github.com/user-attachments/assets/febe0852-c9d8-4671-8ce8-3dfba5fd3d53" />

Rewards are not logged in the evaluation pane:

<img width="3739" height="835" alt="Image" src="https://github.com/user-attachments/assets/e18536f2-2159-4c32-9791-56da72128231" />
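
A possible explanation, offered as an assumption rather than a confirmed diagnosis: in the eval log above, `eval_loss` carries the `eval_` prefix but `rewards/dummy_reward/mean` does not. If the wandb logging callback routes metrics by key prefix, an unprefixed key would land in the train section. The sketch below illustrates that idea. `route_log_keys` is a hypothetical stand-in for such routing (not the actual library code), and `add_eval_prefix` is a hypothetical workaround that is not part of Unsloth or TRL.

```python
def route_log_keys(logs):
    """Hypothetical stand-in for prefix-based routing in a logging callback.

    Keys starting with "eval_" go to the "eval/" section; everything else
    is treated as a training metric and goes to "train/".
    """
    routed = {}
    for key, value in logs.items():
        if key.startswith("eval_"):
            routed["eval/" + key[len("eval_"):]] = value
        else:
            routed["train/" + key] = value
    return routed


def add_eval_prefix(logs):
    """Hypothetical workaround: add "eval_" to keys that lack it."""
    return {
        (key if key.startswith("eval_") else "eval_" + key): value
        for key, value in logs.items()
    }


# Values copied from the eval log in this issue (other keys omitted).
eval_logs = {
    "eval_loss": -0.0030808576848357916,
    "rewards/dummy_reward/mean": 0.5854228019714356,
}

# Without the prefix, the reward is routed to the train section.
print(route_log_keys(eval_logs))

# With the prefix added first, both metrics are routed to the eval section.
print(route_log_keys(add_eval_prefix(eval_logs)))
```

If this assumption holds, prefixing the reward keys with `eval_` during evaluation would be enough to move them to the evaluation pane.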

Thanks!

[Bug] Evaluation metrics logged to training in wandb #3744

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Bug] Evaluation metrics logged to training in wandb #3744

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions