Leveraging Multi-modal Interactions among the Intermediate Representations of Deep Transformers for Emotion Recognition.
Yang Wu, Zhenyu Zhang, Pai Peng, Yanyan Zhao, Bing Qin
Published in: MuSe @ ACM Multimedia (2022)
Keyphrases
- multi-modal
- emotion recognition
- audio-visual
- intermediate representations
- emotional speech
- intermediate representation
- structured learning
- human-computer interaction
- information fusion
- high-level
- high-dimensional
- image annotation
- three-dimensional
- affective states
- sentiment analysis
- facial images
- neural network
- facial expressions
- metadata
- image classification
- natural language processing