Architecture for audio-only training #2

I want to reproduce your experiment for audio regression, but I cannot find the architecture for audio only. In the rec_dense module the code uses architecture=1, but architecture 1 is only for metadata. Are there no default params for audio-only regression? I would like to know the exact params and architecture of the audio CNN.

The params in the paper are: ReLU activation; filter counts 256, 512, 1024, 1024; max-pool size 4; 0.5 dropout for all layers; flattened output of 4096; no dense layer. My questions are:
1. What is the filter kernel size? In the code I found kernel sizes (4,96), (4,1), (4,1), (1,1) and pool sizes (4,1), (4,1), (1,1), (1,1), but these params cannot give a flattened output of 4096!
2. Is the first kernel size really (4,96)? Can you help explain this?
3. Is there really no dense layer?
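
To make the mismatch in question 1 concrete, here is a rough sketch of the shape arithmetic I am doing. It is not taken from the repository: the helper `flattened_size` is hypothetical, and I am assuming "valid" convolutions with stride 1, non-overlapping max-pooling, and an input patch of 322 frames by 96 frequency bins. The 322 is only an illustrative guess, and the 96 is inferred from the (4,96) kernel.

```python
def flattened_size(input_hw, kernels, pools, filters):
    """Return (flattened length, final (height, width)) of a conv stack.

    Assumes 'valid' padding, stride 1 convolutions, and max-pooling
    whose stride equals its pool size (all assumptions, not confirmed
    by the repository).
    """
    h, w = input_hw
    for (kh, kw), (ph, pw) in zip(kernels, pools):
        # 'valid' convolution shrinks each axis by (kernel - 1)
        h, w = h - kh + 1, w - kw + 1
        # non-overlapping pooling divides each axis, rounding down
        h, w = h // ph, w // pw
    # flatten: channels of the last layer times the remaining grid
    return filters[-1] * h * w, (h, w)


# Kernel and pool sizes as I read them in the code
kernels = [(4, 96), (4, 1), (4, 1), (1, 1)]
pools = [(4, 1), (4, 1), (1, 1), (1, 1)]
# Filter counts from the paper
filters = [256, 512, 1024, 1024]

# Hypothetical input: 322 time frames x 96 frequency bins
size, grid = flattened_size((322, 96), kernels, pools, filters)
print(size, grid)
```

With 1024 filters in the last layer, a flattened size of 4096 would need only 4 positions left on the time axis, which I cannot reach with these kernel and pool sizes under the assumptions above.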
