MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech Recognition.

He Wang, Pengcheng Guo, Pan Zhou, Lei Xie
Published in: ICASSP (2024)
Keyphrases
  • multi layer
  • audio visual speech recognition
  • multi stream
  • neural network
  • audio visual
  • speech recognition
  • information fusion
  • neural nets
  • face recognition
  • multiresolution