Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis.

Karolos Nikitaras, Konstantinos Klapsas, Nikolaos Ellinas, Georgia Maniati, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis
Published in: CoRR (2022)
Keyphrases
  • speech synthesis
  • reinforcement learning
  • learning process
  • speech recognition
  • neural network
  • machine learning
  • higher level
  • social network analysis