Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism.
Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker
Published in: CoRR (2021)
Keyphrases
- spatial information
- speech recognition
- speaker recognition
- automatic speech recognition
- audio visual
- spatial distribution
- speaker identification
- spatial relationships
- speaker verification
- spatial relations
- temporal information
- local binary pattern
- speech signal
- frequency domain
- speaker diarization
- speaker dependent
- vocal tract
- speech synthesis
- prosodic features
- spatial resolution
- automatic speech recognition systems
- audio stream
- speech sounds
- region connection calculus
- intensity values
- spatial features
- gaussian mixture model
- topological information
- speech recognizer
- broadcast news
- discriminative ability
- multiscale
- color information
- visual words
- automatic transcription
- language model
- image data