What would be the main steps for building a real-time decoder on top of EESEN?
I read in the EESEN paper that composing the token, lexicon, and grammar WFSTs into a single decoding graph speeds up decoding a great deal, and I'd like to leverage that in a real-time context: capture an audio stream and output the transcripts progressively.
Is that by any chance in the works?
If not, I could try and give it a shot.
Thank you for this great project.