PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network.
Qinghua Liu
Meng Ge
Zhizheng Wu
Haizhou Li
Published in: INTERSPEECH (2023)
Keyphrases
audio visual
pose invariant
multi modal
visual information
multimedia
visual data
emotion recognition
data sets
spatio temporal
video sequences
image data
facial expression recognition