[Ready to merge] Pruned-transducer-stateless2 recipe for aidatatang_200zh #375
Conversation
README.md (Outdated)
> ### Aidatatang_200zh
>
> We provide one model for this recipe: [Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss][Aidatatang_200zh
> _pruned_transducer_stateless2].

The line-break here leads to a broken link. Please remove the line-break.
README.md (Outdated)
> | fast beam search     | 5.30 | 6.34 |
> | modified beam search | 5.27 | 6.33 |
>
> We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [(https://colab.research.google.com/drive/1wNSnSj3T5oOctbh5IGCa393gKOoQw2GH?usp=sharing)

Suggested change:
- We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [(https://colab.research.google.com/drive/1wNSnSj3T5oOctbh5IGCa393gKOoQw2GH?usp=sharing)
+ We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wNSnSj3T5oOctbh5IGCa393gKOoQw2GH?usp=sharing)
egs/aidatatang_200zh/ASR/README.md (Outdated)
> @@ -0,0 +1,39 @@
> Note: This recipe is trained with the codes from this PR https://github.com/k2-fsa/icefall/pull/355

Suggested change:
- Note: This recipe is trained with the codes from this PR https://github.com/k2-fsa/icefall/pull/355
+ Note: This recipe is trained with the codes from this PR https://github.com/k2-fsa/icefall/pull/375
egs/aidatatang_200zh/ASR/README.md (Outdated)
> @@ -0,0 +1,39 @@
> Note: This recipe is trained with the codes from this PR https://github.com/k2-fsa/icefall/pull/355
> And the SpecAugment codes from this PR https://github.com/lhotse-speech/lhotse/pull/604.

Suggested change:
- And the SpecAugment codes from this PR https://github.com/lhotse-speech/lhotse/pull/604.

It has been merged. No need to mention it.
egs/aidatatang_200zh/ASR/RESULTS.md (Outdated)
> #### 2022-05-16
>
> Using the codes from this PR https://github.com/k2-fsa/icefall/pull/355.

Suggested change:
- Using the codes from this PR https://github.com/k2-fsa/icefall/pull/355.
+ Using the codes from this PR https://github.com/k2-fsa/icefall/pull/375.
> """
> This file computes fbank features of the aishell dataset.

Please update the comment (this file is for the aidatatang_200zh dataset, not aishell).
> sampler = DynamicBucketingSampler(
>     cuts,
>     max_duration=self.args.max_duration,
>     rank=0,

I think Piotr has already suggested removing `rank` and `world_size` here.
> @@ -0,0 +1,955 @@
> # Copyright 2021 Xiaomi Corp. (authors: Fangjun Kuang)

Could you replace it with a symlink?
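For reference, "replace it with a symlink" here means pointing the recipe's copy of a duplicated file at the shared librispeech implementation with a relative link. A minimal illustration (the `demo` scratch directory and file names below are hypothetical, not the actual repository layout):

```shell
# Illustrative only: mimic the icefall egs/ layout in a scratch directory,
# then replace a copied file with a relative symlink to the shared one.
mkdir -p demo/egs/librispeech/ASR/pruned_transducer_stateless2
mkdir -p demo/egs/aidatatang_200zh/ASR/pruned_transducer_stateless2
echo "shared code" > demo/egs/librispeech/ASR/pruned_transducer_stateless2/model.py
# The link target is resolved relative to the directory containing the link.
ln -sf ../../../librispeech/ASR/pruned_transducer_stateless2/model.py \
  demo/egs/aidatatang_200zh/ASR/pruned_transducer_stateless2/model.py
cat demo/egs/aidatatang_200zh/ASR/pruned_transducer_stateless2/model.py  # prints: shared code
```

Using a relative target keeps the link valid when the repository is cloned to a different path.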
Thanks, all of the requested changes are done and tested by running.

If you have a model with attention_dim=512, it may be too large; you could try 256.
> @@ -0,0 +1,103 @@
> # Copyright 2021 Xiaomi Corp. (authors: Fangjun Kuang)

Please replace such files with symlinks.

Done and tested.
I can run an experiment with dim=256 later.
When using dim=256 for the conformer, the WERs (best performance) are worse than those with dim=512.

Did you change any settings other than dim=256, e.g., the number of encoder layers or the feedforward dim?
No, I just changed the encoder model dim from 512 to 256.
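One reason the dim change alone moves the results so much: the self-attention projection weights scale quadratically with the model dimension, so halving dim=512 to 256 cuts those matrices by roughly 4x. A back-of-the-envelope sketch (not the actual icefall model code; the helper below is purely illustrative):

```python
# Illustrative only: parameter count of the per-layer self-attention
# projections as a function of the encoder (model) dimension.
def attn_proj_params(d_model: int) -> int:
    # Q, K, V and output projections: four d_model x d_model weight matrices.
    return 4 * d_model * d_model

print(attn_proj_params(512))                           # 1048576 weights per layer
print(attn_proj_params(512) // attn_proj_params(256))  # 4x fewer after halving the dim
```

This is why a dim=256 model is much smaller, and also why it can underfit relative to dim=512 when nothing else (layers, feedforward dim) is scaled up to compensate.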
I think this PR can be merged.

Thanks!

Could you upload a torchscript model to

Done.
Thanks! I have updated https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition
…00zh (k2-fsa#375)

* add pruned-rnnt2 model for aidatatang_200zh
* do some changes
* change for README.md
* do some changes
This PR is to facilitate code review, and it inherits from PR #355. It aims to be merged into master. The results and comparisons are as follows. Next, I plan to use a conformer-ctc model for aidatatang_200zh to generate a new baseline.
The WERs are

The results with Kaldi:

The results with ESPnet:

Conformer encoder + SpecAugment + Transformer decoder