Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters.

Published in: CoRR (2024)
