Speaker Extraction with Co-Speech Gestures Cue.
Zexu Pan, Xinyuan Qian, Haizhou Li
Published in: CoRR (2022)
Keyphrases
- speech recognition
- automatic speech recognition
- speaker recognition
- audio visual
- hidden markov models
- spoken words
- speaker verification
- speaker identification
- prosodic features
- vocal tract
- speech synthesis
- speaker diarization
- speaker dependent
- speech signal
- automatic extraction
- hand movements
- gesture recognition
- automatic speech recognition systems
- audio stream
- hand gestures
- acoustic features
- phoneme recognition
- synthesized speech
- broadcast news
- text to speech
- visual cues
- speech recognizer
- noisy environments
- speaker independent
- multi modal
- pattern recognition
- neural network
- human robot interaction
- visual information
- vector quantization
- language model
- information extraction
- multimedia