Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS

Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari

Published in: INTERSPEECH (2022)

Keyphrases

speaker adaptation
speech recognition
speaker dependent
maximum likelihood
automatic speech recognition
prosodic features
human subjects
text-to-speech
speaker identification
speaker independent
language model
image processing
machine learning
speaker verification
speaker recognition
neural network
speech synthesis
computer vision