Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis.
Jingbei Li
Yi Meng
Chenyi Li
Zhiyong Wu
Helen Meng
Chao Weng
Dan Su
Published in:
CoRR (2021)
Keyphrases
multi modal
multi modality
cross modal
machine learning
high dimensional
image processing
word processing
image annotation
audio visual
visual recognition
auto annotation
single modality
speech recognition
contextual information
context aware
relevance feedback
multimedia