PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion.

Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian

Published in: CoRR (2024)

Keyphrases

end-to-end
text-to-speech
emotion recognition
synthesized speech
speech synthesis
prosodic features
congestion control
audio visual
admission control
multipath
ad hoc networks
wireless ad hoc networks
high bandwidth
transport layer
speech recognition
real time
scalable video
content delivery
internet protocol