BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition.
Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff
Published in: Odyssey (2020)
Keyphrases
- speech recognition
- higher level
- recognition rate
- pattern recognition
- object recognition
- recognition accuracy
- speaker independent
- spoken language
- programming language
- bit rate
- action recognition
- object description
- language processing
- character recognition
- recognition algorithm
- rate distortion
- high level
- object models
- face recognition
- activity recognition
- low complexity
- language model
- motion estimation
- hidden markov models
- natural language
- feature extraction