Conversational Speech Recognition by Learning Audio-Textual Cross-Modal Contextual Representation.

Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie
Published in: IEEE/ACM Trans. Audio Speech Lang. Process. (2024)
Keyphrases
  • cross modal
  • visual recognition
  • perceptual information
  • multi modal
  • pattern recognition
  • multimedia
  • conversational speech
  • action recognition
  • visual information
  • multimedia retrieval