An Integrated Deep Clustering-Based System for Speaker Count Agnostic Speech Separation.
Jean-Marie Lemercier, Leroy Bartel, David Ditter, Timo Gerkmann
Published in: ITG Conference on Speech Communication (2021)
Keyphrases
- speech recognition
- speaker recognition
- automatic speech recognition
- audio visual
- speaker verification
- speaker identification
- speaker dependent
- prosodic features
- speech signal
- vocal tract
- speech synthesis
- speaker diarization
- automatic speech recognition systems
- acoustic models
- acoustic features
- text to speech
- speaker independent
- synthesized speech
- automatic transcription
- Gaussian mixture model
- speech sounds
- noisy environments
- hidden Markov models
- speaker adaptation
- vector quantization
- broadcast news
- speech recognizer
- audio stream
- language identification
- data sets
- sound source
- language model
- phoneme recognition
- digit recognition
- mel frequency cepstral coefficients
- deep learning
- speech recognition systems
- speech enhancement
- pattern recognition
- multimedia
- visual speech
- language acquisition
- non stationary
- visual information
- probabilistic neural network