Golesheed/whisper-jasmin-cgn

A Pilot Study on the JASMIN-CGN Corpus Using the Whisper Model

Description

We test and study the variation in speech recognition of fine-tuned versions of Whisper on children, elderly and non-native Dutch speech from the JASMIN-CGN corpus.

Requirements

Download the JASMIN-CGN corpus before you start. It is available here: https://taalmaterialen.ivdnt.org/download/tstc-jasmin-spraakcorpus/

How to use the files:

  • Remove the silences from the audio data.
  • Split the audio into 30-second chunks.
  • Build a CSV from the ort files and convert it to UTF-8.
  • Divide the data into your datasets (for example, training and test sets).
  • Train the model.
  • Test the model.
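
The data-preparation steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the repository's own scripts: the function names, the CSV column names (`path`, `sentence`), the split fractions and the Latin-1 source encoding are all assumptions, so check them against your copy of the corpus. The 30-second chunk length matches the fixed input window that Whisper works on, at 16 kHz audio. Silence removal is not covered here.

```python
import csv
import random
from pathlib import Path

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz audio
CHUNK_SECONDS = 30     # Whisper processes 30-second windows


def chunk_spans(num_samples, sample_rate=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    """Return (start, end) sample indices that cover the audio in
    consecutive chunks of at most `seconds` seconds."""
    size = sample_rate * seconds
    return [(start, min(start + size, num_samples))
            for start in range(0, num_samples, size)]


def read_as_utf8(path, source_encoding="latin-1"):
    """Read a transcription file and return its text as a Python str.
    The source encoding is an assumption; adjust it to your files."""
    return Path(path).read_text(encoding=source_encoding)


def write_manifest(rows, csv_path):
    """Write (audio_path, transcript) pairs to a UTF-8 encoded CSV.
    The column names are hypothetical."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "sentence"])
        writer.writerows(rows)


def split_rows(rows, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle with a fixed seed and split into train/dev/test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_dev = int(len(rows) * dev_frac)
    test = rows[:n_test]
    dev = rows[n_test:n_test + n_dev]
    train = rows[n_test + n_dev:]
    return train, dev, test
```

For a recording of `n` samples, `chunk_spans(n)` gives the slices to cut from the silence-trimmed signal; each slice is then paired with its transcript and passed to `write_manifest`. Note that `split_rows` splits by row, so chunks from one speaker can land in both the training and test sets. Splitting by speaker ID instead would avoid that overlap.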
