Compositional Embedding Models for Speaker Identification and Diarization with Simultaneous Speech From 2+ Speakers.
Zeqian Li, Jacob Whitehill
Published in: ICASSP (2021)
Keyphrases
- speaker identification
- speech recognition
- speaker dependent
- speech signal
- speaker recognition
- speech processing
- gaussian mixture model
- broadcast news
- speaker diarization
- noisy environments
- speaker independent
- hidden markov models
- computer vision
- mel frequency cepstral coefficients
- prior knowledge
- multiscale
- automatic speech recognition
- image search
- non stationary
- language model
- feature vectors
- speech recognizer
- feature space
- feature extraction
- machine learning