Video Face Recognition with Audio-Visual Aggregation Network.
Qinbo Li, Qing Wan, Sang-Heon Lee, Yoonsuck Choe
Published in: ICONIP (4) (2021)
Keyphrases
- audio visual
- face recognition
- video summarization
- visual data
- multi modal
- multimedia
- meeting room
- temporal context
- audio visual content
- video data
- audio features
- video content
- multi stream
- visual information
- multimodal fusion
- audio visual speech recognition
- video streams
- person authentication
- data sets
- computer vision
- video frames
- space time
- video sequences
- three dimensional
- video retrieval
- visual features