Audio-visual speech activity detection in a two-speaker scenario incorporating depth information from a profile or frontal view.

Spyridon Thermos, Gerasimos Potamianos
Published in: SLT (2016)
Keyphrases
  • audio visual
  • depth information
  • emotion recognition
  • speaker diarization
  • multi modal
  • depth map
  • speaker verification
  • visual information
  • multimedia
  • visual data
  • machine learning
  • image sequences