Two-Stage Triplet Loss Training with Curriculum Augmentation for Audio-Visual Retrieval.
Donghuo Zeng, Kazushi Ikeda
Published in: CoRR (2023)
Keyphrases
- audio visual
- audio visual content
- multi modal
- passage retrieval
- visual information
- information retrieval
- person authentication
- temporal context
- relevance feedback
- visual data
- image database
- multi stream
- multimedia
- training set
- information retrieval systems
- retrieval systems
- test collection
- audio visual speech recognition
- nearest neighbor
- image data
- image representation
- feature selection
- spatio temporal
- databases