State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18.
Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak
Published in: INTERSPEECH (2019)
Keyphrases
- speaker recognition
- gaussian mixture model
- speaker verification
- vector quantization
- speaker identification
- video data
- probabilistic neural network
- speech recognition
- speech signal
- video frames
- multimedia
- video content
- broadcast news
- video database
- video sequences
- noisy environments
- video streams
- video retrieval
- feature vectors
- bayesian networks
- multimedia data
- image compression
- speaker diarization
- video clips
- neural network
- mel frequency cepstral coefficients
- audio features
- emotion recognition
- audio visual
- video analysis
- noise reduction
- key frames
- hidden markov models
- feature space