MMSNet: Multi-modal scene recognition using multi-scale encoded features.
Ali Caglayan, Nevrez Imamoglu, Ryosuke Nakamura
Published in: Image Vis. Comput. (2022)
Keyphrases
- multi modal
- multiscale
- multi modality
- multiple modalities
- feature space
- audio visual
- cross modal
- video search
- feature vectors
- low level
- image set
- semantic concepts
- high dimensional
- video sequences
- image sequences
- computer vision
- image processing
- uni modal
- humanoid robot
- key frames
- markov random field
- feature set
- input image
- image features
- feature extraction
- multimedia