Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21.
Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Magdalena Rybicka, Carlos D. Castillo, Jaejin Cho, L. Paola García-Perera, Pedro A. Torres-Carrasquillo, Najim Dehak
Published in: Odyssey (2022)
Keyphrases
- audio visual
- speaker recognition
- cross lingual
- speaker verification
- multi modal
- machine translation
- visual information
- emotion recognition
- text classification
- multimedia
- visual data
- gaussian mixture model
- news articles
- transfer learning
- neural network
- speech recognition
- probabilistic model
- low level
- audio features
- active learning
- search engine