Phone and speaker spatial organization in self-supervised speech representations.
Pablo Riera, Manuela Cerdeiro, Leonardo Pepino, Luciana Ferrer
Published in: CoRR (2023)
Keyphrases
- speech recognition
- acoustic models
- speaker recognition
- audio-visual
- automatic speech recognition
- speaker identification
- speaker verification
- speaker diarization
- spatial data
- hidden Markov models
- spatial and temporal
- prosodic features
- speech synthesis
- spatial information
- speech recognizer
- information systems
- broadcast news
- automatic speech recognition systems
- speaker dependent
- vocal tract
- spatial distribution
- speech signal
- spatio-temporal
- spatial reasoning
- emotional speech
- probabilistic neural network
- knowledge management
- symbolic representation
- synthesized speech
- phoneme recognition
- feature extraction
- natural language
- speaker independent
- mobile phone
- multi-modal
- Gaussian mixture model
- spatial databases
- spoken document retrieval
- mel-frequency cepstral coefficients
- recognition engine
- acoustic features
- higher level
- spoken language
- speech sounds
- noisy environments
- neural network