How to get the loss for each data sample #6165

@Word2VecT

Description

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.1.dev0
  • Platform: Linux-3.10.0-957.el7.x86_64-x86_64-with-glibc2.17
  • Python version: 3.11.0
  • PyTorch version: 2.4.1+cu121 (GPU)
  • Transformers version: 4.45.2
  • Datasets version: 2.21.0
  • Accelerate version: 0.34.2
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100-SXM4-80GB
  • DeepSpeed version: 0.15.3

Reproduction

torchrun --nnodes=1 --nproc-per-node=8 src/train.py \
--deepspeed examples/deepspeed/ds_z3_config.json \
--stage sft \
--do_train \
--use_fast_tokenizer \
--flash_attn fa2 \
--model_name_or_path /mnt/petrelfs/tangzinan/LLaMA-Factory/models/LLama3.1-8B \
--dataset gsm8k_train \
--template llama3 \
--finetuning_type full \
--output_dir saves/LLama3.1-8B/full/train_2024-11-14-22-43-17 \
--overwrite_cache \
--overwrite_output_dir \
--warmup_ratio 0.03 \
--weight_decay 0. \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 8 \
--ddp_timeout 9000 \
--learning_rate 2e-5 \
--lr_scheduler_type cosine \
--cutoff_len 4096 \
--save_steps 400 \
--logging_steps 1 \
--plot_loss \
--num_train_epochs 1 \
--bf16 \
--report_to wandb

Expected behavior

After SFT fine-tuning finishes, is there a way to run inference over the dataset once and obtain the loss for each individual sample?

Others

No response

Labels

solved: This problem has already been solved
