VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over.
Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li
Published in: CoRR (2021)
Keyphrases
- text to speech
- speech synthesis
- prosodic features
- text to speech synthesis
- multimodal interaction
- word processing
- highly accurate
- high accuracy
- fully automatic
- vocal tract
- visual speech recognition
- semi automatic
- data sets
- high quality
- high precision
- question answering
- language model
- visual speech
- speech recognition errors
- neural network