Audio-visual speech activity detection in a two-speaker scenario incorporating depth information from a profile or frontal view.
Spyridon Thermos
Gerasimos Potamianos
Published in:
SLT (2016)
Keyphrases
audio visual
depth information
emotion recognition
speaker diarization
multi modal
depth map
speaker verification
visual information
multimedia
visual data
machine learning
image sequences