CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis.
Yi Meng
Xiang Li
Zhiyong Wu
Tingtian Li
Zixun Sun
Xinyu Xiao
Chi Sun
Hui Zhan
Helen Meng
Published in:
CoRR (2023)
Keyphrases
cross modal
text to speech synthesis
multi modal
perceptual information
multimedia retrieval
visual recognition
image retrieval
multimedia databases