Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE.
Marc-Antoine Georges, Jean-Luc Schwartz, Thomas Hueber
Published in: INTERSPEECH (2022)
Keyphrases
- acoustic features
- speech signal
- speaker verification
- automatic speech recognition
- vector quantization
- visual features
- speaker recognition
- speech recognition
- music information retrieval
- image compression
- audio features
- vocal tract
- cross correlation
- knowledge discovery
- audio stream
- mel frequency cepstral coefficients
- noisy environments
- audio visual
- neural network
- speaker identification
- hidden markov models
- image retrieval
- pattern recognition