Audio-Visual Scene Analysis with Self-Supervised Multisensory Features.
Andrew Owens, Alexei A. Efros
Published in: CoRR (2018)
Keyphrases
- scene analysis
- audio visual
- person authentication
- multi modal
- audio features
- feature extraction
- low level
- multimedia
- multi stream
- temporal segmentation
- audio visual speech recognition
- multimodal fusion
- biometric identification
- visual data
- visual information
- high level
- video scene
- image classification
- image features
- feature vectors