Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs.

Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li
Published in: CoRR (2022)
Keyphrases
  • multi modal
  • audio visual
  • multi modality
  • video search
  • pairwise
  • bit rate
  • image annotation
  • uni modal
  • training set
  • speech recognition
  • image registration
  • semantic concepts
  • speaker verification