
Support for ultrafeedback_binarized #4132

@Baichenjia

Description

Reminder

  • I have read the README and searched the existing issues.

System Info

transformers version: 4.41.1
Platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.35
Python version: 3.10.12
Huggingface_hub version: 0.23.2
Safetensors version: 0.4.3
Accelerate version: 0.29.3
Accelerate config:

  • compute_environment: LOCAL_MACHINE

  • distributed_type: DEEPSPEED
  • mixed_precision: bf16
  • use_cpu: False
  • debug: True
  • num_processes: 3
  • machine_rank: 0
  • num_machines: 1
  • rdzv_backend: static
  • same_network: True
  • main_training_function: main
  • enable_cpu_affinity: False
  • deepspeed_config: {'gradient_accumulation_steps': 2, 'zero3_init_flag': False, 'zero_stage': 0}
  • downcast_bf16: no
  • tpu_use_cluster: False
  • tpu_use_sudo: False
  • tpu_env: []
PyTorch version (GPU?): 2.2.1+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: True
Using distributed or parallel set-up in script?: False

Reproduction

CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage dpo \
    --pref_loss simpo \
    --simpo_gamma 1.0 \
    --do_train True \
    --model_name_or_path /home/ubuntu/date/llama_ckpts/llama_lx3_ckpts/BAdam_llama3_random_lr1e-6/checkpoint-9600 \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --template default \
    --flash_attn auto \
    --dataset_dir data \
    --dataset ultrafeedback_binarized \
    --split train \
    --cutoff_len 2048 \
    --learning_rate 2e-7 \
    --num_train_epochs 1.0 \
    --max_samples 10000000 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 1 \
    --save_steps 100 \
    --warmup_ratio 0.1 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/LLaMA3-8B/full/train_2024-06-05 \
    --pure_bf16 True \
    --plot_loss True \
    --use_badam True \
    --badam_mode layer \
    --badam_switch_mode random \
    --badam_switch_interval 100 \
    --val_size 0.05 \
    --evaluation_strategy steps \
    --eval_steps 20

Same as #4085.
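As a stopgap until the dataset ships with LLaMA-Factory, one can export HuggingFaceH4/ultrafeedback_binarized into a local pairwise JSON file and point `--dataset_dir data` at it. The script below is only a minimal sketch of that workaround: it assumes the dataset's `train_prefs` split, where `chosen` and `rejected` are chat transcripts whose last message holds the model response; the output path and target field names (`instruction`/`chosen`/`rejected`) are assumptions, not something this issue or the repository confirms.

```python
# Hypothetical workaround: export HuggingFaceH4/ultrafeedback_binarized into a
# local pairwise JSON file for LLaMA-Factory. Target field names are assumed.
import json

from datasets import load_dataset

ds = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

records = []
for row in ds:
    # "chosen" / "rejected" are lists of chat messages; the final message
    # holds the assistant response being compared.
    records.append({
        "instruction": row["prompt"],
        "chosen": row["chosen"][-1]["content"],
        "rejected": row["rejected"][-1]["content"],
    })

with open("data/ultrafeedback_binarized.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```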

Expected behavior

ultrafeedback_binarized is an alignment dataset commonly used in research. Could support for it be added?
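Supporting it would presumably come down to registering the dataset in `data/dataset_info.json`. The entry below is a hedged sketch only, assuming the local file produced by the export script above and LLaMA-Factory's pairwise ("ranking") schema; the exact keys vary across LLaMA-Factory versions, so treat every field name here as an assumption rather than the repository's actual entry.

```json
"ultrafeedback_binarized": {
  "file_name": "ultrafeedback_binarized.json",
  "ranking": true,
  "columns": {
    "prompt": "instruction",
    "chosen": "chosen",
    "rejected": "rejected"
  }
}
```

With such an entry in place, the `--dataset ultrafeedback_binarized` flag from the command above would resolve against the local file.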

Others

No response

Labels

solved (This problem has been already solved)
