Compositional Embedding Models for Speaker Identification and Diarization with Simultaneous Speech From 2+ Speakers.

Zeqian Li, Jacob Whitehill

Published in: ICASSP (2021)

Keyphrases

speaker identification
speech recognition
speaker dependent
speech signal
speaker recognition
speech processing
Gaussian mixture model
broadcast news
speaker diarization
noisy environments
speaker independent
hidden Markov models
computer vision
mel-frequency cepstral coefficients
prior knowledge
multiscale
automatic speech recognition
image search
non-stationary
language model
feature vectors
speech recognizer
feature space
feature extraction
machine learning