Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech.
Guangyan Zhang, Thomas Merritt, Manuel Sam Ribeiro, Biel Tura Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba
Published in: CoRR (2023)