Knowledge Distillation for Efficient Audio-Visual Video Captioning.
Özkan Çayli, Xubo Liu, Volkan Kiliç, Wenwu Wang
Published in: EUSIPCO (2023)
Keyphrases
- audio visual
- video summarization
- multimedia
- multi modal
- visual data
- domain knowledge
- audio features
- video data
- meeting room
- visual information
- multi stream
- person authentication
- temporal context
- video content
- video frames
- video sequences
- audio visual content
- video streams
- audio visual speech recognition
- data sets
- space time
- human body
- hidden markov models
- knowledge base