Audio-Visual Multi-Speaker Tracking Based on the GLMB Framework.

Shoufeng Lin, Xinyuan Qian
Published in: INTERSPEECH (2020)
Keyphrases
  • audio visual
  • multi modal
  • temporal context
  • visual information
  • visual data
  • audio visual speech recognition
  • emotion recognition
  • multimedia
  • speaker verification
  • person authentication
  • multi stream