Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders.
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Köhler
Qing He
Published in:
CoRR (2022)
Keyphrases
contextual information
speech recognition
semantic information
data sets
similarity measure
keywords
sensor networks
dimensionality reduction
context sensitive
source localization