Blind Extraction Of Target Speech Source: Three Ways Of Guidance Exploiting Supervised Speaker Embeddings.
Jirí Málek, Jaroslav Cmejla, Zbynek Koldovský
Published in: IWAENC (2022)
Keyphrases
- speech recognition
- speaker recognition
- audio-visual
- automatic speech recognition
- speaker identification
- speaker verification
- speaker dependent
- speech signal
- automatic speech recognition systems
- speaker diarization
- prosodic features
- supervised learning
- vocal tract
- speech synthesis
- semi-supervised
- vector space
- multiple sources
- text-to-speech
- broadcast news
- speaker independent
- feature selection
- information extraction
- learning algorithm
- target object
- manifold learning
- speech sounds
- automatic transcription
- multi-modal
- speaker adaptation
- audio stream
- acoustic models
- low dimensional
- machine learning
- speech recognizer
- pattern recognition
- hidden Markov models
- dimensionality reduction
- Gaussian mixture model
- language model
- vector quantization
- non-stationary