Audio-visual video face hallucination with frequency supervision and cross modality support by speech based lip reading loss.
Shailza Sharma, Abhinav Dhall, Vinay Kumar, Vivek Singh Bawa
Published in: CoRR (2022)
Keyphrases
- audio visual
- visual data
- multi modal
- lip reading
- multimedia
- audio features
- visual information
- emotion recognition
- video data
- speaker verification
- speaker identification
- space time
- video streams
- image sequences
- key frames
- data sets
- video sequences
- human motion
- feature vectors
- multimedia data
- action recognition
- high dimensional
- image features