FastSpeechStyle: Fast, Emotion Controllable, and High-Quality Speech Synthesis.
Van-Thinh Nguyen, Tri-Nhan Do, Hung-Cuong Pham, Tuan Vu Ho, Minh-Khanh Nguyen Ngoc, Dang-Khoa Mac
Published in: Int. J. Asian Lang. Process. (2022)
Keyphrases
- speech synthesis
- high quality
- speech recognition
- text to speech
- vocal tract
- prosodic features
- higher quality
- ground truth
- facial expressions
- speech corpus
- low quality
- high resolution
- text to speech synthesis
- image quality
- emotion recognition
- data sets
- virtual agents
- database
- high fidelity
- depth map
- emotional state
- field of view
- human computer interaction
- yields high quality
- music emotion classification