Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs.

Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2023)
Keyphrases
  • multi modal
  • audio visual
  • multi modality
  • cross modal
  • pairwise
  • high dimensional
  • image annotation
  • semantic concepts
  • mutual information
  • speech recognition
  • mean shift
  • uni modal