Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization.
Huan Zhao
Li Zhang
Yue Li
Yannan Wang
Hongji Wang
Wei Rao
Qing Wang
Lei Xie
Published in:
CoRR (2023)
Keyphrases
audio visual
multi modal
pre trained
speaker diarization
visual information
multi stream
multimedia
speaker verification
visual data
training data
emotion recognition
face recognition
supervised learning
probabilistic model
training examples
speech recognition
learning algorithm
multimedia data
audio features