Reminder
System Info
- llamafactory version: 0.9.2.dev0
- Platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.35
- Python version: 3.10.16
- PyTorch version: 2.6.0+cu126 (GPU)
- Transformers version: 4.49.0.dev0
- Datasets version: 2.21.0
- Accelerate version: 1.0.1
- PEFT version: 0.12.0
- TRL version: 0.9.6
- GPU type: NVIDIA H100 80GB HBM3
- GPU number: 8
- GPU memory: 79.19GB
Reproduction
Simply training with the sharegpt_hyper dataset triggers the error:
[rank0]: File "/home/xxx/LLaMA-Factory/src/llamafactory/data/aligner.py", line 153, in convert_sharegpt
[rank0]: {"role": tag_mapping[message[dataset_attr.role_tag]], "content": message[dataset_attr.content_tag]}
[rank0]: KeyError: 'user'
In the code here, when broken_data = True is set, the loop does not break, so execution continues to the tag_mapping lookup and eventually raises the KeyError.
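To make the failure mode concrete, here is a minimal standalone sketch of the pattern described above. It is not the actual LLaMA-Factory source: `align_messages`, `TAG_MAPPING`, `ACCEPT_TAGS`, the `stop_on_invalid` switch, and the `"from"`/`"value"` keys are all hypothetical names chosen for illustration, and the mapping values are assumed.

```python
# Hypothetical, simplified stand-in for the loop in convert_sharegpt.
# All names and tag values below are assumptions, not the library's code.
TAG_MAPPING = {"human": "user", "gpt": "assistant"}
ACCEPT_TAGS = (("human",), ("gpt",))  # tags accepted on even / odd turns


def align_messages(messages, stop_on_invalid):
    """Map raw role tags to roles; return [] when the sample is invalid."""
    aligned = []
    broken_data = False
    for turn_idx, message in enumerate(messages):
        if message["from"] not in ACCEPT_TAGS[turn_idx % 2]:
            broken_data = True
            if stop_on_invalid:
                break  # skip the lookup below for an unknown tag
        # Without the break, an unknown tag such as "user" reaches this
        # lookup and raises KeyError.
        aligned.append(
            {"role": TAG_MAPPING[message["from"]], "content": message["value"]}
        )
    return [] if broken_data else aligned
```

With `stop_on_invalid=False`, a sample whose tag is `"user"` is flagged as broken but still reaches the lookup, which matches the traceback above. With `stop_on_invalid=True`, the same sample is dropped without an exception. Whether breaking out of the loop is the right fix for the real function is an assumption of this sketch.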
Others
No response