Contrastive Regularization for Multimodal Emotion Recognition Using Audio and Text.
Fan Qian, Jiqing Han
Published in: CoRR (2022)
Keyphrases
- multimodal fusion
- audio visual
- emotion recognition
- text graphics
- multimedia
- text to speech synthesis
- cross media retrieval
- cross modal
- multi modal
- high robustness
- text retrieval
- multimodal information
- human language
- free text
- information retrieval
- multimodal interfaces
- text mining
- regularization parameter
- keywords
- spoken documents
- text to speech
- relevance feedback
- semantic information
- multiple modalities
- image restoration
- database
- multimedia documents
- visual features
- story segmentation
- multi stream
- text documents
- multimodal interaction
- broadcast news
- audio features
- sentence level