GitHub - maxreciprocate/kaggle-lmarena-1st-place: Solution of Kaggle "Multilingual Chatbot Arena" competition

WSDM Cup - Multilingual Chatbot Arena

https://www.kaggle.com/competitions/wsdm-cup-multilingual-chatbot-arena

Requirements

Hardware

8x H100 (we used 32x H100 to speed up teacher training, but this is not required)

Software

Python 3.12
SLURM

Packages

deepspeed==0.15.4
accelerate==1.2.1
transformers==4.46.3
flash-attn==2.7.1.post4
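
Because the pins above are exact, it can help to confirm the active environment matches them before launching a long run. The following is a minimal sketch, not part of the repository: the helper name `check_pins` and its `get_version` parameter are hypothetical, and it relies only on the standard-library `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Packages list above.
PINNED = {
    "deepspeed": "0.15.4",
    "accelerate": "1.2.1",
    "transformers": "4.46.3",
    "flash-attn": "2.7.1.post4",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `check_pins` is a hypothetical helper; `get_version` is injectable so the
    check can be exercised without touching the real environment.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Print one line per package whose installed version differs from the pin.
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at the pinned version.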

Reproduction

Preparation

Download the following datasets and place them in data/ without renaming the files (the remaining datasets are available on Hugging Face):

Training

Structure

All scripts must be run from the repository root directory.

├── data
│   ├── train.parquet
│   └── ...
├── ckpts
│   └── ...
├── stage1
│   ├── README.md
│   └── prepare_pretrain_data.py
├── stage2
│   ├── README.md
│   └── prepare_teacher_data.py
├── stage3
│   ├── README.md
│   ├── collect_labels.py
│   ├── merge_students.py
│   ├── pack_student.py
│   ├── prepare_student_data.py
│   └── prepare_synth_data.py
├── packing
│   └── # https://github.com/tascj/kaggle-lmsys-chatbot-arena/tree/main/human_pref
├── format.py
├── label.py
├── label.sh
├── label.slurm
├── launch.slurm
├── models.py
├── readme.md
├── requirements.txt
├── run.sh
└── train.py
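
Since every script assumes the repository root as the working directory, a quick layout check can catch a wrong directory or a missing data file early. This is a minimal sketch under stated assumptions: `missing_entries` is a hypothetical helper, and `EXPECTED` is a partial list taken from the tree above, not an exhaustive manifest.

```python
from pathlib import Path

# A few entries taken from the Structure tree above (not exhaustive).
EXPECTED = [
    "data/train.parquet",
    "stage1/prepare_pretrain_data.py",
    "stage2/prepare_teacher_data.py",
    "stage3/prepare_student_data.py",
    "train.py",
    "models.py",
    "run.sh",
]


def missing_entries(root=".", expected=EXPECTED):
    """Return the expected relative paths that do not exist under `root`.

    Hypothetical helper: an empty list suggests `root` is the repository
    root and the listed files are in place.
    """
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```

Calling `missing_entries()` from the current working directory returns an empty list when the layout matches; any returned paths indicate either the wrong directory or files that still need to be downloaded or created.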
