Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors.
Sindhu B. Hegde, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar
Published in: CoRR (2022)
Keyphrases
- audio visual
- video summarization
- person authentication
- visual data
- multimodal fusion
- multimedia
- meeting room
- multi modal
- audio features
- audio visual content
- visual information
- temporal context
- video data
- video sequences
- multi stream
- video content
- multimedia data
- space time
- audio visual speech recognition
- face images
- high dimensional
- video analysis
- human actions
- video streams
- contextual information
- context aware
- image classification
- xml documents
- keywords
- face recognition