AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR.
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
Published in:
CoRR (2023)
Keyphrases
speech recognition
audio visual
automatic speech recognition
statistical models
data sets
probabilistic model
information retrieval
computer vision
image processing
case study
model selection
computational models
endpoint detection