How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild.
Okan Köpüklü
Maja Taseska
Gerhard Rigoll
Published in: CoRR (2021)
Keyphrases
audio visual
multi modal
speaker verification
multimedia
visual information
temporal context
visual data
emotion recognition
multi stream
audio visual speech recognition
three dimensional
spatio temporal
user interface
audio features