Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning.
Bogdan Mocanu, Ruxandra Tapu, Titus B. Zaharia
Published in: Image Vis. Comput. (2023)
Keyphrases
- cross modal
- metric learning
- audio video
- multi modal
- multiple features
- distance metric
- multimedia
- semi supervised
- multimedia retrieval
- learning tasks
- pairwise
- distance function
- dimensionality reduction
- multi task
- feature space
- image retrieval
- visual data
- visual recognition
- semi supervised learning
- multimedia data
- multimedia databases
- multi party
- databases
- machine learning algorithms
- visual similarity
- distance measure
- low dimensional
- high dimensional
- data points
- background knowledge