A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice.
Frederik Bous, Axel Roebel
Published in: Inf. (2022)
Keyphrases
- text to speech
- speech synthesis
- emotion recognition
- voice activity detection
- speech recognition errors
- fundamental frequency
- speech quality
- speech recognition
- audio features
- bit rate
- speech sounds
- audio visual
- speech signal
- acoustic features
- synthesized speech
- mel frequency cepstral coefficients
- noisy environments
- prosodic features
- rate distortion
- automatic speech recognition
- low complexity
- feature selection
- speaker identification
- vocal tract
- spoken language
- motion estimation
- decoding process
- speaker recognition
- music information retrieval
- rate control
- error correction
- endpoint detection
- feature set