Phone and Speaker Spatial Organization in Self-Supervised Speech Representations.
Pablo Riera, Manuela Cerdeiro, Leonardo Pepino, Luciana Ferrer
Published in: ICASSP Workshops (2023)
Keyphrases
- speech recognition
- acoustic models
- audio visual
- speaker recognition
- automatic speech recognition
- speaker verification
- speaker identification
- speech signal
- automatic speech recognition systems
- spatial data
- speaker dependent
- speaker diarization
- spatial and temporal
- speech synthesis
- hidden Markov models
- prosodic features
- spatial information
- broadcast news
- speech recognizer
- vocal tract
- space time
- mobile phone
- spatio temporal
- information systems
- emotional speech
- speaker independent
- audio stream
- speech sounds
- multi modal
- feature extraction
- spatial reasoning
- spatial distribution
- spatial relations
- visual information
- language model
- neural network
- spontaneous speech
- automatic transcription
- spoken term detection
- visual speech
- text-to-speech
- probabilistic neural network
- human machine interaction
- spoken language
- emotion recognition
- noisy environments
- dialogue system
- knowledge management
- image sequences