Speaker-Targeted Audio-Visual Speech Recognition Using a Hybrid CTC/Attention Model with Interference Loss.

Ryota Tsunoda, Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yoshie Imai
Published in: ICASSP (2022)
Keyphrases
  • audio visual
  • multi modal
  • feature extraction
  • speech recognition
  • speaker verification
  • data sets
  • data analysis
  • temporal context
  • multi stream
  • audio visual speech recognition