Multimodal fusion for audio-image and video action recognition.
Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar
Published in: Neural Comput. Appl. (2024)
Keyphrases
- action recognition
- static images
- human actions
- multimodal fusion
- visual data
- action classification
- bag of features
- multimedia
- image data
- computer vision
- image features
- bag of words
- action detection
- image content
- image classification
- recognizing human actions
- audio visual
- image retrieval
- high robustness
- recognition of human actions
- video frames
- human activities
- video sequences
- relevance feedback
- image representation
- space time
- activity recognition
- body parts
- feature points
- visual information
- visual features
- object detection
- multimodal interfaces
- image collections
- spatio temporal
- key frames
- high level