Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation.
Songju Lei
Xize Cheng
Mengjiao Lyu
Jianqiao Hu
Jintao Tan
Runlin Liu
Lingyu Xiong
Tao Jin
Xiandong Li
Zhou Zhao
Published in:
ACL (1) (2024)
Keyphrases
speech synthesis
speech recognition
text to speech
prosodic features
vocal tract
visual information
multiscale
low level
real time
feature selection
high level
hidden Markov models
image acquisition
speech signal
visual perception
visual properties