The Audio-Visual BatVision Dataset for Research on Sight and Sound.
Amandine Brunetto, Sascha Hornauer, Stella X. Yu, Fabien Moutarde
Published in: CoRR (2023)
Keyphrases
- audio visual
- multi modal
- sound source
- visual information
- temporal context
- video summarization
- visual data
- person authentication
- emotion recognition
- audio visual speech recognition
- audio features
- multimedia
- multi stream
- data sets
- face recognition
- audio signal
- semantic information
- natural language processing
- image data
- low level
- database