WavThruVec: Latent speech representation as intermediate features for neural speech synthesis.
Hubert Siuzdak, Piotr Dura, Pol van Rijn, Nori Jacoby
Published in: CoRR (2022)
Keyphrases
- speech synthesis
- speech recognition
- text-to-speech
- vocal tract
- prosodic features
- representation scheme
- feature representation
- feature vectors
- feature construction
- image features
- temporal structure
- pattern recognition
- speech recognition systems
- feature extraction
- image processing
- extracting features
- neural network
- category labels
- spectral features
- intermediate representation
- visual representation
- network architecture
- co-occurrence
- classification accuracy
- prior knowledge
- feature selection