Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation.
Kun Wei
Bei Li
Hang Lv
Quan Lu
Ning Jiang
Lei Xie
Published in:
CoRR (2023)
Keyphrases
cross modal
visual recognition
perceptual information
multimedia
multi modal
conversational speech
metadata
feature extraction
pattern recognition
object recognition
visual information
visual data
automatic speech recognition