Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model.

Published in: CoRR (2024)

Keyphrases