Audio-Visual Speech Recognition is Worth $32\times 32\times 8$ Voxels.
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
Published in:
ASRU (2021)
Keyphrases
motion estimation
audio visual speech recognition
multimedia