Description
I'm looking for a library to use in my next data augmentation pipeline. My target is a speech-oriented task, so I would like to augment clean speech audio (only the speaker's voice, with no background noise), mostly with background noise, RIRs (room impulse responses), phase shifts and lossy compression.
For now I'm considering offline dataset generation: in a multiprocessing setup I would first load the wavs, augment them, and then proceed with the usual preprocessing (wav -> mel-spectrogram with tf.io) to prepare tfrecords, which the training pipeline then reads with the tf.data API.
I was looking at audiomentations, which looks fine for my use case, but I stumbled upon pedalboard and got really interested, given its focus on performance and high-quality effects. It's stated that this library is used internally by Spotify for ML training with TF, but I can't find any examples of that use case here.
If processing were fast enough, one could easily set up an online tf.data training pipeline with augmentations. Would you consider adding some examples of how to set up such a data pipeline?
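To make the request concrete, here is a rough sketch of the kind of example I have in mind. It is only my guess at how this could be wired up, not something taken from the pedalboard docs. It wraps a `Pedalboard` call in `tf.numpy_function` so it can run inside `Dataset.map`. The helper names (`make_board`, `augment_np`, `build_dataset`), the 16 kHz sample rate and the effect parameters are all placeholders I made up. A real version would add background-noise mixing and RIR convolution.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumption: 16 kHz mono speech


def make_board():
    """Build an example effect chain (hypothetical choice of effects and values)."""
    # Imported here so the rest of the sketch can be read and tested without pedalboard.
    from pedalboard import MP3Compressor, Pedalboard, Reverb

    return Pedalboard([Reverb(room_size=0.3), MP3Compressor(vbr_quality=5.0)])


def augment_np(audio, board, sample_rate=SAMPLE_RATE):
    """Run one mono waveform through `board` and return float32 samples."""
    out = board(np.asarray(audio, dtype=np.float32), sample_rate)
    return np.asarray(out, dtype=np.float32)


def build_dataset(waveforms, board, batch_size=8):
    """Wrap an iterable of 1-D float32 waveforms in an augmenting tf.data pipeline."""
    import tensorflow as tf

    def tf_augment(audio):
        # numpy_function runs the Python callable eagerly inside the graph.
        out = tf.numpy_function(lambda a: augment_np(a, board), [audio], tf.float32)
        out.set_shape([None])  # the static shape is lost across numpy_function
        return out

    ds = tf.data.Dataset.from_generator(
        lambda: iter(waveforms),
        output_signature=tf.TensorSpec(shape=[None], dtype=tf.float32),
    )
    ds = ds.map(tf_augment, num_parallel_calls=tf.data.AUTOTUNE)
    # Pad variable-length clips to the longest clip in each batch.
    return ds.padded_batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Usage would be something like `build_dataset(list_of_waveforms, make_board())`, with the mel-spectrogram step added as a further `map`. What I can't judge is whether `tf.numpy_function` plus `num_parallel_calls` scales well enough for online training, which is why an official example would be helpful.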