nicofarr/SoundNet_Pytorch

SoundNet_Pytorch

SoundNet model in PyTorch

Architecture figure from soundnet

Introduction

This code converts the pretrained TensorFlow SoundNet model into a PyTorch model; it contains no training code for SoundNet. The resulting pretrained PyTorch model is sound8.pth.

Prerequisites

  1. TensorFlow (only needed if sound8.pth does not exist yet)
  2. Python 3.6 with NumPy
  3. PyTorch 0.4+

How to use

  1. If the file sound8.pth has not been generated yet, follow the original instructions: model

  2. If audio preprocessing is required (e.g. the sample rate is not 22,050 Hz), utils.py has a method for converting all files in a given folder.

    To convert a single file: sox input.wav -r 22050 -c 1 output.wav

  3. To extract feature vectors, use:

    audio, sr = load_audio(filepath)
    features = ex.extract_pytorch_feature(audio, './soundnet/sound8.pth')
    print([x.shape for x in features])

    # extract the feature vector of a single layer
    conv = ex.extract_vector(features, idlayer)
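The resampling in step 2 can be batched by wrapping sox. A minimal sketch, assuming sox is on the PATH; the helper names and folder layout here are illustrative, not the actual utils.py API:

```python
import subprocess
from pathlib import Path

def sox_resample_cmd(src, dst, rate=22050, channels=1):
    """Build the sox command that resamples src to rate Hz, mono, into dst."""
    return ["sox", str(src), "-r", str(rate), "-c", str(channels), str(dst)]

def convert_folder(in_dir, out_dir, rate=22050):
    """Resample every .wav in in_dir into out_dir (requires sox installed)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(in_dir).glob("*.wav")):
        subprocess.run(sox_resample_cmd(wav, out / wav.name, rate), check=True)
```

The command mirrors the one-file invocation above (`sox input.wav -r 22050 -c 1 output.wav`), just applied per file.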

High-level features:

  • conv5, idlayer = 4
  • conv7, idlayer = 6
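The layer-to-index mapping above can be kept in a small lookup table; a sketch covering only the two layers documented here (other indices would be assumptions):

```python
# Index into the list returned by extract_pytorch_feature for each
# documented high-level layer (only conv5 and conv7 are listed in this README).
LAYER_TO_IDLAYER = {"conv5": 4, "conv7": 6}

def vector_index(layer_name):
    """Return the idlayer value to pass to extract_vector for a named layer."""
    return LAYER_TO_IDLAYER[layer_name]
```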

The temporal resolution

To find the temporal resolution 1/m of each layer, a slope m and an intercept are fitted, describing the linear relationship between the duration of the input in seconds and the number of output frames (channels) returned by the extract_feature_vector method.
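That fit can be reproduced with NumPy; a minimal sketch assuming you have measured clip durations in seconds and the matching frame counts for one layer (the numbers below are synthetic, not measurements from the model):

```python
import numpy as np

def temporal_resolution(durations_s, n_frames):
    """Fit n_frames ~= m * duration + b and return (1/m, b).

    1/m is the temporal resolution: seconds of audio per output frame.
    """
    m, b = np.polyfit(durations_s, n_frames, deg=1)
    return 1.0 / m, b

# Synthetic example: a layer that emits exactly 4 frames per second.
res, intercept = temporal_resolution([1.0, 2.0, 4.0], [4, 8, 16])
```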

Acknowledgments

The code for the SoundNet TensorFlow model is ported from soundnet_tensorflow. Thanks for their work!

References

  1. Yusuf Aytar, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." Advances in Neural Information Processing Systems. 2016.
