Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.
Ryo Masumura, Daiki Okamura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi
Published in: Interspeech (2021)
Keyphrases
- speech recognition
- end to end
- autoregressive
- automatic speech recognition
- language model
- speaker identification
- hidden markov models
- pattern recognition
- speech signal
- speaker dependent
- random fields
- non stationary
- speaker diarization
- speaker recognition
- speech synthesis
- noisy environments
- sar images
- acoustic models
- speaker independent
- speech recognition systems
- speech recognizer
- speaker verification
- natural language
- broadcast news
- markov random field
- video sequences
- high frequency