Reminder
System Info
- llamafactory version: 0.9.1.dev0
- Platform: Linux-3.10.0-957.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.11.0
- PyTorch version: 2.4.1+cu121 (GPU)
- Transformers version: 4.45.2
- Datasets version: 2.21.0
- Accelerate version: 0.34.2
- PEFT version: 0.12.0
- TRL version: 0.9.6
- GPU type: NVIDIA A100-SXM4-80GB
- DeepSpeed version: 0.15.3
Reproduction
torchrun --nnodes=1 --nproc-per-node=8 src/train.py \
    --deepspeed examples/deepspeed/ds_z3_config.json \
    --stage sft \
    --do_train \
    --use_fast_tokenizer \
    --flash_attn fa2 \
    --model_name_or_path /mnt/petrelfs/tangzinan/LLaMA-Factory/models/LLama3.1-8B \
    --dataset gsm8k_train \
    --template llama3 \
    --finetuning_type full \
    --output_dir saves/LLama3.1-8B/full/train_2024-11-14-22-43-17 \
    --overwrite_cache \
    --overwrite_output_dir \
    --warmup_ratio 0.03 \
    --weight_decay 0. \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --ddp_timeout 9000 \
    --learning_rate 2e-5 \
    --lr_scheduler_type cosine \
    --cutoff_len 4096 \
    --save_steps 400 \
    --logging_steps 1 \
    --plot_loss \
    --num_train_epochs 1 \
    --bf16 \
    --report_to wandb
Expected behavior
After SFT fine-tuning completes, is there a way to run inference over the dataset once and obtain the loss for each individual example?
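As far as I know there is no built-in LLaMA-Factory flag for per-example loss (--do_eval only reports an averaged eval_loss), so here is a minimal sketch using plain transformers: load the fine-tuned checkpoint, run one forward pass per sample, and mask the prompt tokens in the labels so only response tokens contribute, matching the SFT objective. The file gsm8k_train.jsonl and the prompt/response keys are hypothetical placeholders for however the data is stored; the prompt string is assumed to already be rendered with the llama3 chat template (e.g. via tokenizer.apply_chat_template).

```python
# Sketch only, not a LLaMA-Factory feature: per-example loss after SFT.
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "saves/LLama3.1-8B/full/train_2024-11-14-22-43-17"  # --output_dir above
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

def per_example_loss(prompt: str, response: str) -> float:
    """Average cross-entropy over the response tokens only (prompt masked with -100)."""
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    input_ids = torch.tensor([prompt_ids + response_ids], device=model.device)
    labels = input_ids.clone()
    labels[:, : len(prompt_ids)] = -100  # ignore prompt tokens, as in SFT training
    with torch.no_grad():
        out = model(input_ids=input_ids, labels=labels)
    return out.loss.item()

# Hypothetical data file: one {"prompt": ..., "response": ...} object per line.
with open("gsm8k_train.jsonl") as f:
    for i, line in enumerate(f):
        ex = json.loads(line)
        print(i, per_example_loss(ex["prompt"], ex["response"]))
```

Running the examples one at a time keeps the per-sample losses separate; if throughput matters, the same masking works batched, but padded positions must also be set to -100 in the labels.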
Others
No response