Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE.
Marc-Antoine Georges, Jean-Luc Schwartz, Thomas Hueber
Published in: CoRR (2022)
Keyphrases
- acoustic features
- speech signal
- automatic speech recognition
- speaker verification
- vector quantization
- visual features
- music information retrieval
- speaker recognition
- speech recognition
- audio stream
- image compression
- vocal tract
- mel frequency cepstral coefficients
- cross correlation
- audio features
- audio visual
- knowledge discovery
- speaker identification
- noisy environments
- hidden Markov models