Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition.
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
Published in:
CoRR (2022)
Keyphrases
audio visual speech recognition
video data
video sequences
multi stream
video content
multimedia
audio visual
video streams
video analysis
space time
low level
high dimensional
video frames
multimedia data
video retrieval
audio signals