Decoder-only Architecture for Streaming End-to-end Speech Recognition.
Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe
Published in: CoRR (2024)
Keyphrases
- end to end
- speech recognition
- scalable video
- rate allocation
- language model
- hidden markov models
- rate adaptation
- automatic speech recognition
- real time
- pattern recognition
- speech signal
- cross layer
- speech synthesis
- speech recognizer
- noisy environments
- video streaming
- speech recognition systems
- congestion control
- data streams
- transport layer
- speaker identification
- low complexity
- speaker independent
- speaker adaptation
- speech recognition technology
- application layer
- multiple description coding
- motion estimation
- feature extraction