CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis.

Published in: CoRR (2023)
