Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos.
Kranti Kumar Parida, Neeraj Matiyali, Tanaya Guha, Gaurav Sharma
Published in: CoRR (2019)
Keyphrases
- audio visual
- multi modal
- audio visual content
- video summarization
- visual data
- multimodal fusion
- visual information
- multi stream
- sports video
- audio features
- multimedia
- pattern recognition
- image classification
- image retrieval
- feature selection
- temporal context
- relevance feedback
- feature extraction
- video content
- visual features
- text classification
- feature vectors
- person authentication
- audio visual speech recognition
- information retrieval
- human actions
- retrieval systems
- video data
- spatio temporal