Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR.
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
Published in:
CoRR (2022)
Keyphrases
cross modal
perceptual information
multi modal
multimedia
learning algorithm
visual recognition
learning tasks
metadata
similarity measure
keywords
image retrieval
supervised learning
contextual information
image understanding
visual data
visual concepts