Working Repo by Christian
- Proof of concept: get Peter's fine-tuning script working => improves performance on the mini training set
- Prepare code for hyperparameter optimization => integration with Ray Tune (industry standard for hyperparameter optimization) is not seamless
- prepare presentation on hyperparameter optimization
- Set up high-performance computing cluster
- Submit evaluation job for baseline Whisper model
- Prepare code for hyperparameter optimization (from Week 1) => the collator function had to be adapted because Ray Tune datasets create NumPy objects that cannot be processed by the HF tokenizer.pad method
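The collator adaptation above can be sketched as follows. This is a minimal illustration, not the repo's actual code: `to_native` and `collate` are hypothetical names, and `pad_fn` stands in for HF's `tokenizer.pad`, which expects plain Python lists (or tensors) rather than the NumPy objects that Ray Tune datasets produce:

```python
def to_native(value):
    # Ray Tune's dataset handling can hand the collator NumPy arrays
    # (sometimes with dtype=object); HF's tokenizer.pad expects plain
    # Python lists or tensors, so convert before padding.
    if hasattr(value, "tolist"):  # true for NumPy arrays
        return value.tolist()
    return value

def collate(features, pad_fn):
    # Hypothetical collator: normalize every field of every example,
    # then delegate padding to the tokenizer's pad function.
    normalized = [{k: to_native(v) for k, v in f.items()} for f in features]
    return pad_fn(normalized)
```

In the real collator, `pad_fn` would be the processor's `tokenizer.pad` (typically called with `return_tensors="pt"`).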
Next steps:
- Evaluate a baseline model (e.g. WhisperX) on the test set to directly compare performance with the fine-tuned model
- Define hyperparameters to optimize
- Run a first dummy fine-tuning job on the cluster
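For the baseline-vs-fine-tuned comparison in the next steps, word error rate (WER) is the standard ASR metric. A minimal from-scratch sketch is below; in practice a library such as `jiwer` or HF `evaluate` would likely be used instead:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the number
    of reference words (assumes a non-empty reference)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming,
    # keeping only one row of the distance table at a time.
    dist = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev = dist[0]  # diagonal neighbor from the previous row
        dist[0] = i
        for j in range(1, len(hyp) + 1):
            cur = dist[j]
            if ref[i - 1] == hyp[j - 1]:
                dist[j] = prev
            else:
                dist[j] = 1 + min(prev, dist[j], dist[j - 1])
            prev = cur
    return dist[len(hyp)] / len(ref)
```

The baseline and fine-tuned models can then be compared by averaging WER over the test-set transcripts.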