The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge.
Ming Cheng
Haoxu Wang
Ziteng Wang
Qiang Fu
Ming Li
Published in:
ICASSP (2023)
Keyphrases
audio visual
speaker diarization
speaker verification
multi modal
visual information
multi stream
visual data
emotion recognition
multimedia
broadcast news
speech recognition
low level
data sets
visual features
audio features
model selection
principal component analysis
high level
metadata
search engine
machine learning