Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention.
R. Gnana Praveen, Eric Granger, Patrick Cardinal
Published in: CoRR (2022)
Keyphrases
- emotion recognition
- audio visual
- multi modal
- emotional state
- speaker verification
- visual information
- information fusion
- multimedia
- visual data
- data fusion
- multi stream
- human computer interaction
- sentiment analysis
- facial expressions
- domain knowledge
- three dimensional
- computer vision
- low dimensional
- feature vectors
- high dimensional
- data analysis