End-to-End Multi-Person Audio/Visual Automatic Speech Recognition.
Otavio Braga
Takaki Makino
Olivier Siohan
Hank Liao
Published in:
CoRR (2022)
Keyphrases
end to end
audio visual
automatic speech recognition
speech recognition
multi modal
speech signal
hidden Markov models
multi stream
visual information
visual data
broadcast news
multimedia
emotion recognition
noisy environments
speaker verification
passage retrieval
pattern recognition
audio features
contextual information
context aware
image data
acoustic features
video sequences