Applying Segment-Level Attention on Bi-Modal Transformer Encoder for Audio-Visual Emotion Recognition.
Jia-Hao Hsu
Chung-Hsien Wu
Published in:
IEEE Trans. Affect. Comput. (2023)
Keyphrases
audio visual
emotion recognition
multi modal
speaker verification
emotional speech
visual information
visual data
multimedia
multi stream
machine learning
computer vision
feature space
user interface
affective states