Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition.
Zi-qiang Zhang
Jie Zhang
Jian-Shu Zhang
Ming-Hui Wu
Xin Fang
Li-Rong Dai
Published in:
CoRR (2022)
Keyphrases
audio visual
audio visual speech recognition
multi stream
multi modal
data sets
domain knowledge
visual information
computer vision
e learning
multimedia
spatio temporal