Hierarchical Multimodal Transformer with Localness and Speaker Aware Attention for Emotion Recognition in Conversations.
Xiao Jin, Jianfei Yu, Zixiang Ding, Rui Xia, Xiangsheng Zhou, Yaofeng Tu
Published in: NLPCC (2) (2020)
Keyphrases
- emotion recognition
- audio visual
- speaker verification
- multi modal
- emotional speech
- visual information
- multimedia
- multi stream
- emotion classification
- visual data
- fuzzy logic
- audio features
- speaker recognition
- sentiment analysis
- visual attention
- physiological signals
- computational intelligence
- image database
- affective states
- neural network