I3D: Transformer architectures with input-dependent dynamic depth for speech recognition.
Yifan Peng, Jaesong Lee, Shinji Watanabe
Published in: CoRR (2023)
Keyphrases
- speech recognition
- speech synthesis
- language model
- speech processing
- speech signal
- automatic speech recognition
- speech understanding
- pattern recognition
- speaker identification
- speech recognizer
- hidden Markov models
- speech recognition systems
- noisy environments
- speech retrieval
- speech recognition technology
- handwriting recognition
- speech recognition errors
- face recognition
- keyword spotting
- speech recognizers
- speaker dependent
- neural network
- speaker independent
- speaker recognition
- non-stationary
- maximum likelihood
- machine learning