Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition.
Guinan Li
Jiajun Deng
Youjun Chen
Mengzhe Geng
Shujie Hu
Zhe Li
Zengrui Jin
Tianzi Wang
Xurong Xie
Helen Meng
Xunying Liu
Published in:
CoRR (2024)
Keyphrases
audio-visual
person authentication
multi-modal
audio features
visual information
speaker verification
feature extraction
visual data
emotion recognition
co-occurrence
pattern recognition
speech recognition
data sets
human activities
spatio-temporal
multi-stream
computer vision