VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over.

Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li

Published in: CoRR (2021)

Keyphrases

text to speech
speech synthesis
prosodic features
text to speech synthesis
multimodal interaction
word processing
highly accurate
high accuracy
fully automatic
vocal tract
visual speech recognition
semi automatic
data sets
high quality
high precision
question answering
language model
visual speech
speech recognition errors
neural network