Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs.
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamäki
Haizhou Li
Published in:
IEEE ACM Trans. Audio Speech Lang. Process. (2023)
Keyphrases
multi modal
audio visual
multi modality
cross modal
pairwise
high dimensional
image annotation
semantic concepts
mutual information
speech recognition
mean shift
uni modal