M3TTS: Multi-modal text-to-speech of multi-scale style control for dubbing.
Yan Liu
Li-Fang Wei
Xinyuan Qian
Tian-Hao Zhang
Song-Lu Chen
Xu-Cheng Yin
Published in:
Pattern Recognit. Lett. (2024)
Keyphrases
text to speech
multi modal
multiscale
speech synthesis
prosodic features
programming tool
multi modality
text to speech synthesis
video search
cross modal
word processing
high dimensional
semantic concepts
audio visual
image annotation
writing skills
uni modal
image classification
edge detection
image processing