EMID: An Emotional Aligned Dataset in Audio-Visual Modality.
Jialing Zou, Jiahao Mei, Guangze Ye, Tianyu Huai, Qiwei Shen, Daoguo Dong
Published in: CoRR (2023)
Keyphrases
- audio visual
- multi modal
- emotion recognition
- visual information
- video summarization
- visual data
- person authentication
- multi stream
- affective states
- multimedia
- audio visual speech recognition
- temporal context
- audio features
- multimodal fusion
- high dimensional
- human actions
- structured data
- nearest neighbor
- low level
- image retrieval