Cross-modal fusion techniques for utterance-level emotion recognition from text and speech.
Jiachen Luo, Huy Phan, Joshua D. Reiss
Published in: CoRR (2023)
Keyphrases
- emotion recognition
- cross modal
- audio visual
- multi modal
- information fusion
- emotional speech
- visual data
- emotion classification
- speech recognition
- human computer interaction
- multimedia retrieval
- facial expressions
- sentiment analysis
- text retrieval
- data fusion
- lexical features
- facial images
- multimedia databases
- information retrieval
- text mining
- text data
- sentence level
- spoken language
- image retrieval
- emotional state
- visual information
- nearest neighbor
- high dimensional
- keywords