Working Repo by Christian
- Proof of concept: get Peter's fine-tuning script working => improves performance on the mini training set
- Prepare code for hyperparameter optimization => integration with Ray Tune (industry standard for hyperparameter optimization) is not seamless
- prepare presentation on hyperparameter optimization
- Set up high-performance computing cluster
- Submit evaluation job for baseline Whisper model
- Prepare code for hyperparameter optimization (from Week 1) => the collator function had to be adapted because Ray Tune datasets create NumPy objects that cannot be processed by the HF tokenizer.pad method
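The collator adaptation above can be sketched as follows. This is a minimal illustration, not the repo's actual code: `to_native` and `collate` are hypothetical names, and `pad_fn` stands in for HF's `tokenizer.pad`, which expects plain Python lists (or tensors) rather than the NumPy objects that Ray Tune datasets produce:

```python
def to_native(value):
    # Ray Tune's dataset handling can hand the collator NumPy arrays
    # (sometimes with dtype=object); HF's tokenizer.pad expects plain
    # Python lists or tensors, so convert before padding.
    if hasattr(value, "tolist"):  # true for NumPy arrays
        return value.tolist()
    return value

def collate(features, pad_fn):
    # Hypothetical collator: normalize every field of every example,
    # then delegate padding to the tokenizer's pad function.
    normalized = [{k: to_native(v) for k, v in f.items()} for f in features]
    return pad_fn(normalized)
```

In the real collator, `pad_fn` would be the processor's `tokenizer.pad` (typically called with `return_tensors="pt"`).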
Next steps:
- Evaluate a baseline model (e.g. WhisperX) on the test set to directly compare performance with the fine-tuned model
- Define hyperparameters to optimize
- Run a first dummy fine-tuning job on the cluster
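For the baseline-vs-fine-tuned comparison in the next steps, word error rate (WER) is the standard ASR metric. A minimal from-scratch sketch is below; in practice a library such as `jiwer` or HF `evaluate` would likely be used instead:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the number
    of reference words (assumes a non-empty reference)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming,
    # keeping only one row of the distance table at a time.
    dist = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev = dist[0]  # diagonal neighbor from the previous row
        dist[0] = i
        for j in range(1, len(hyp) + 1):
            cur = dist[j]
            if ref[i - 1] == hyp[j - 1]:
                dist[j] = prev
            else:
                dist[j] = 1 + min(prev, dist[j], dist[j - 1])
            prev = cur
    return dist[len(hyp)] / len(ref)
```

The baseline and fine-tuned models can then be compared by averaging WER over the test-set transcripts.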