Monotonic Segmental Attention for Automatic Speech Recognition.
Albert Zeyer, Robin Schmitt, Wei Zhou, Ralf Schlüter, Hermann Ney
Published in: SLT (2022)
Keyphrases
- automatic speech recognition
- hidden Markov models
- speech recognition
- speech signal
- word error rate
- conversational speech
- broadcast news
- speech retrieval
- spoken words
- speech corpus
- spontaneous speech
- noisy environments
- recognition errors
- acoustic features
- word recognition
- focus of attention
- multimodal
- neural network
- handwriting recognition
- gesture recognition
- image acquisition
- visual attention
- speech sounds