DNN based multi-speaker speech synthesis with temporal auxiliary speaker ID embedding.

Junmo Lee, Kwang-Sub Song, Kyoung Jin Noh, Tae-Jun Park, Joon-Hyuk Chang

Published in: ICEIC (2019)

Keyphrases

speech synthesis
prosodic features
speech recognition
vocal tract
text to speech
speaker verification
speaker recognition
audio visual
automatic speech recognition
speaker identification
temporal constraints
spatial and temporal
speaker dependent
training process
noisy environments
language model
speaker adaptation
speech signal
temporal reasoning
speaker diarization
information retrieval
speech corpus
temporal data
temporal patterns
vector space
bayesian networks