DanceSport: A Multi-Modal Video Dataset and Hierarchical Language-Guided Audio-Visual Learning for Long-Term Sports Assessment
This is the repository for our DanceSport dataset and the code for our method, "Hierarchical Multidimensional Language-guided Audio-Visual Learning" (H-MLAVL).
- GPU: RTX 3090
- CUDA: 12.4
- Python: 3.8.19
- PyTorch: 2.4.1+cu124
The features and label files of our DanceSport dataset can be downloaded from here.
The features and label files of the Rhythmic Gymnastics and Fis-V datasets can be downloaded from the GDLT repository.
The features and label files of the FS1000 dataset can be downloaded from the Skating-Mixer repository.
The features and label files of the LOGO dataset can be downloaded from the UIL-AQA repository.
If you wish to extract your own action text labels, please download the ViFi-CLIP pretrained model and place it in:
`weights/k400_clip_complete_finetuned_30_epochs.pth`
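A small sketch for checking that the checkpoint is in the expected place before training; the `expected_ckpt`/`ckpt_ready` helper names are illustrative, not part of the repo's API:

```python
from pathlib import Path

def expected_ckpt(root="."):
    # Expected location of the ViFi-CLIP checkpoint, per the README.
    return Path(root) / "weights" / "k400_clip_complete_finetuned_30_epochs.pth"

def ckpt_ready(root="."):
    # True only if the checkpoint file actually exists at that path.
    return expected_ckpt(root).is_file()
```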
Please fill in or select the arguments enclosed in {} before running the commands below.
- Training
```shell
CUDA_VISIBLE_DEVICES={device ID} python main.py \
  --video-path {path of video features} \
  --audio-path {path of audio features} \
  --train-label-path {path of label file of training set} \
  --test-label-path {path of label file of test set} \
  --model-name {the name used to save model and log} \
  --action-type {Ball/Clubs/Hoop/Ribbon} \
  --lr 1e-2 --epoch {250/400/500/150} \
  --n_decoder 2 --n_query 4 --alpha 1.0 --margin 1.0 \
  --lr-decay cos --decay-rate 0.01 --dropout 0.3
```
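The `--lr-decay cos --decay-rate 0.01` flags suggest a cosine-annealed learning rate. A minimal sketch of one common form of that schedule, assuming the final learning rate is `base_lr * decay_rate`; the repo's exact implementation may differ:

```python
import math

def cosine_lr(epoch, total_epochs, base_lr=1e-2, decay_rate=0.01):
    # Anneal from base_lr down to base_lr * decay_rate along a half cosine.
    final_lr = base_lr * decay_rate
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return final_lr + (base_lr - final_lr) * cos_factor
```

With `--lr 1e-2` and `--decay-rate 0.01`, this starts at 1e-2 and ends at 1e-4.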
- Testing
```shell
CUDA_VISIBLE_DEVICES={device ID} python main.py \
  --video-path {path of video features} \
  --audio-path {path of audio features} \
  --train-label-path {path of label file of training set} \
  --test-label-path {path of label file of test set} \
  --action-type {Ball/Clubs/Hoop/Ribbon} \
  --n_decoder 2 --n_query 4 --dropout 0.3 \
  --test --ckpt {the name of the used checkpoint}
```
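To evaluate all four apparatus types in one pass, a dry-run loop like the following can help; it only prints the commands (substitute the `{...}` placeholders and drop the `echo` to actually run them):

```shell
# Dry run: print one test command per action type.
for ACTION in Ball Clubs Hoop Ribbon; do
  echo "CUDA_VISIBLE_DEVICES=0 python main.py --action-type ${ACTION} --test --ckpt {checkpoint}"
done
```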
This repository builds upon MLAVL (CVPR 2025).
We thank the authors for their contributions to the research community.