Audio-visual Saliency for Omnidirectional Videos.
Yuxin Zhu, Xilei Zhu, Huiyu Duan, Jie Li, Kaiwei Zhang, Yucheng Zhu, Li Chen, Xiongkuo Min, Guangtao Zhai
Published in: CoRR (2023)
Keyphrases
- audio visual
- video summarization
- sports video
- multi modal
- visual data
- audio features
- visual information
- video sequences
- multimedia
- multimodal fusion
- multi stream
- temporal context
- video data
- human activities
- audio visual speech recognition
- vision system
- low level
- video frames
- text classification
- key frames
- high dimensional
- image sequences