Speaker activity driven neural speech extraction.
Marc Delcroix, Katerina Zmolíková, Tsubasa Ochiai, Keisuke Kinoshita, Tomohiro Nakatani
Published in: CoRR (2021)
Keyphrases
- speech recognition
- speaker recognition
- audio visual
- automatic speech recognition
- speaker verification
- speaker identification
- vocal tract
- speaker diarization
- neural network
- automatic speech recognition systems
- speech signal
- prosodic features
- speaker dependent
- speech recognizer
- speech synthesis
- network architecture
- data driven
- human activities
- multi modal
- information extraction
- bio inspired
- gaussian mixture model
- automatic extraction
- speaker adaptation
- speech sounds
- hidden markov models
- synthesized speech
- speaker independent
- emotion recognition
- activity patterns
- text to speech
- activity theory
- noisy environments
- visual information
- pattern recognition
- spontaneous speech
- neural model
- language model
- feature extraction