Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction.
Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi
Published in: CoRR (2020)
Keyphrases
- vector quantization
- text to speech
- speech synthesis
- speaker recognition
- vector quantizer
- image compression
- vector quantized
- finite state vector quantization
- speech recognition
- audio visual
- three dimensional
- multi stream
- reconstructed image
- image coding
- codebook generation
- medical image compression
- synthesized speech
- distortion measure
- image reconstruction
- progressive transmission
- reconstruction process
- spontaneous speech
- prosodic features
- emotion recognition
- noisy environments
- speech signal
- vector quantisation
- high resolution