Conversational Speech Recognition by Learning Audio-Textual Cross-Modal Contextual Representation.
Kun Wei
Bei Li
Hang Lv
Quan Lu
Ning Jiang
Lei Xie
Published in:
IEEE ACM Trans. Audio Speech Lang. Process. (2024)
Keyphrases
cross modal
visual recognition
perceptual information
multi modal
pattern recognition
multimedia
conversational speech
action recognition
visual information
multimedia retrieval