Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition.
Yukun Zuo, Hantao Yao, Liansheng Zhuang, Changsheng Xu
Published in: CoRR (2024)
Keyphrases
- audio visual
- video summarization
- visual data
- temporal segmentation
- multimedia
- multi modal
- meeting room
- audio features
- audio visual content
- multi stream
- temporal context
- video sequences
- visual information
- video data
- multimodal fusion
- object recognition
- human activities
- audio visual speech recognition
- activity recognition
- multimedia data
- video content
- pattern recognition
- action recognition
- video streams
- space time
- data analysis
- video retrieval
- human actions
- feature extraction
- computer vision