Training speaker embedding extractors using multi-speaker audio with unknown speaker boundaries.
Themos Stafylakis
Ladislav Mosner
Oldrich Plchot
Johan Rohdin
Anna Silnova
Lukás Burget
Jan Cernocký
Published in:
INTERSPEECH (2022)
Keyphrases
audio visual
speaker recognition
speaker verification
speaker identification
speech recognition
prosodic features
speaker diarization
multimedia
vector quantization
neural network
visual data
video data
multi modal
audio signals
signal processing
speaker dependent
audio stream
automatic transcription