Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer.

Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte
Published in: CoRR (2024)
Keyphrases
  • audio visual speech recognition
  • recurrent neural networks
  • multi stream
  • nearest neighbor
  • audio visual
  • multi modal
  • eye movements
  • noisy environments
  • e learning
  • video sequences