[Bug] Evaluation metrics logged to training in wandb #3744

@AJain9199

  1. Did you update? pip install --upgrade unsloth unsloth_zoo
    Yes

  2. Colab or Kaggle or local / cloud
    Colab

  3. Number GPUs used, use nvidia-smi
    1 A100 80GB

  4. Which notebook? Please link!
    Code

  5. Which Unsloth version, TRL version, transformers version, PyTorch version?

unsloth==2025.12.6
unsloth_zoo==2025.12.5
trl==0.25.0, 0.24.0, 0.23.1 (all tested)
torch==2.8.0
torchao==0.13.0
vllm==0.11.0
flashinfer-python==0.3.1.post1
  6. Which trainer? SFTTrainer, GRPOTrainer, etc.
    GRPO

This basic GRPO script reproduces the issue.

Hi. When using GRPO with wandb, the evaluation rewards are logged alongside the training metrics instead of with the evaluation metrics. This makes it extremely hard to track evaluation progress.

When I run the script, the eval log is:

{'eval_loss': -0.0030808576848357916, .. 'rewards/dummy_reward/mean': 0.5854228019714356,  ...}

But in wandb this reward is logged together with the training rewards:

Image

Rewards are not logged in the evaluation pane:

Image
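
One possible explanation, sketched below, is that the reward key in the eval log has no `eval_` prefix. This assumes the wandb integration sorts metrics into panes purely by key prefix, which is how transformers' `rewrite_logs` helper appears to work; it is not confirmed as the cause here. The function name `route_metric_keys` is hypothetical and only mirrors that assumed routing.

```python
# Hypothetical helper: mimics prefix-based routing of metric keys to wandb
# sections. This is an assumption about the logging integration, not a copy
# of any library's actual code.
def route_metric_keys(logs: dict) -> dict:
    routed = {}
    for key, value in logs.items():
        if key.startswith("eval_"):
            # "eval_loss" -> "eval/loss"
            routed["eval/" + key[len("eval_"):]] = value
        elif key.startswith("test_"):
            routed["test/" + key[len("test_"):]] = value
        else:
            # Anything without a recognized prefix falls through to "train/".
            routed["train/" + key] = value
    return routed


# The two keys visible in the eval log above (the elided ones are left out).
eval_logs = {
    "eval_loss": -0.0030808576848357916,
    "rewards/dummy_reward/mean": 0.5854228019714356,
}

routed = route_metric_keys(eval_logs)
print(sorted(routed))
# ['eval/loss', 'train/rewards/dummy_reward/mean']
```

Under that assumption, the reward would need to be emitted as `eval_rewards/dummy_reward/mean` during evaluation to appear in the evaluation pane.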

Thanks!
