Self-supervised learning for audio-visual speaker diarization.
Yifan Ding
Yong Xu
Shi-Xiong Zhang
Yahuan Cong
Liqiang Wang
Published in:
CoRR (2020)
Keyphrases
audio visual
multi modal
feature selection
data sets
computer vision