Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis.
Konstantinos Klapsas, Karolos Nikitaras, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis
Published in: CoRR (2022)