Audio-Visual Speech Recognition is Worth 32×32×8 Voxels.
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
Published in:
CoRR (2021)
Keyphrases
audio visual speech recognition
multi stream
audio visual
pattern recognition
information retrieval
high level
hidden markov models
motion estimation
dimensionality reduction
multi modal
visual features
speech recognition
noisy environments