MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition.

He Wang, Pengcheng Guo, Pan Zhou, Lei Xie
Published in: CoRR (2024)
Keyphrases
  • multi layer
  • audio visual speech recognition
  • multi stream
  • neural network
  • information fusion
  • neural nets
  • audio visual
  • machine learning
  • computer vision
  • artificial neural networks
  • probabilistic model