End-to-end multi-talker audio-visual ASR using an active speaker attention module.
Richard Rose
Olivier Siohan
Published in:
CoRR (2022)
Keyphrases
audio visual
end to end
multi modal
speaker verification
sound source
visual data
visual information
emotion recognition
automatic speech recognition
multi stream
audio visual speech recognition
congestion control
speech recognition
audio features
multimedia
data processing
sensor networks
data analysis