End-to-End Audio-Visual Neural Speaker Diarization.
Mao-Kui He
Jun Du
Chin-Hui Lee
Published in:
INTERSPEECH (2022)
Keyphrases
end-to-end
audio-visual
speaker diarization
speaker verification
multimodal
speech recognition
visual information
emotion recognition
visual data
multimedia
neural network
audio features
broadcast news
Bayesian information criterion
pattern recognition
sound source
video sequences
speaker identification