MAiVAR: Multimodal Audio-Image and Video Action Recognizer.
Muhammad Bilal Shaikh, Douglas Chai, Syed Mohammed Shamsul Islam, Naveed Akhtar
Published in: CoRR (2022)
Keyphrases
- multimedia
- visual data
- image data
- video files
- image segmentation
- image representation
- multiscale
- image retrieval
- input image
- image frames
- segmentation method
- image content
- video sequences
- image features
- low level
- single image
- digital video
- image collections
- audio visual
- high resolution
- video streams
- image processing
- visual cues
- audio video
- audio files
- temporal continuity
- video data
- video content
- human actions
- signal processing
- video shots
- speech recognition
- digital audio
- scene change detection