Multimodal Transformer Distillation for Audio-Visual Synchronization.
Xuanjun Chen
Haibin Wu
Chung-Che Wang
Hung-yi Lee
Jyh-Shing Roger Jang
Published in:
CoRR (2022)
Keyphrases
audio visual
multi stream
multi modal
audio visual speech recognition
visual information
multimodal fusion
visual data
emotion recognition
temporal context
person authentication
multimedia
nearest neighbor
computer vision
hidden markov models
image regions
audio features