Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition.
Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu
Published in: Interspeech (2021)
Keyphrases
- speech recognition
- end to end
- recurrent networks
- multiscale
- recurrent neural networks
- feed forward
- biologically inspired
- hidden markov models
- neural network
- rate allocation
- pattern recognition
- language model
- speech signal
- speech synthesis
- rate distortion
- bit rate
- congestion control
- speech recognizer
- automatic speech recognition
- low complexity
- speech recognition technology
- edge detection
- motion estimation
- frequency domain
- speaker independent
- wavelet transform
- speech recognition systems
- video codec
- artificial neural networks
- multiresolution
- genetic algorithm
- artificial intelligence
- image processing
- computational complexity
- speaker identification
- rate control
- video compression
- multi modal