MAiVAR: Multimodal Audio-Image and Video Action Recognizer.
Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar
Published in: VCIP (2022)
Keyphrases
- visual data
- multimedia
- video files
- multiscale
- image data
- single image
- image features
- image classification
- image content
- image retrieval
- human actions
- input image
- audio video
- image frames
- key frames
- story segmentation
- visual cues
- audio visual
- audio files
- video content
- visual concepts
- digital video
- image segmentation
- visual information
- image representation
- low level
- video clips
- image collections
- image sequences
- video frames
- segmentation method
- news video
- action classification
- high resolution
- multimodal fusion
- computer vision