Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer.

Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte
Published in: ICASSP (2024)
Keyphrases
  • audio visual speech recognition
  • recurrent neural networks
  • multi stream
  • nearest neighbor
  • audio visual
  • feature selection
  • image sequences
  • speech recognition