VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection.

Joanna Hong, Minsu Kim, Yong Man Ro

Published in: CoRR (2022)

Keyphrases

speech synthesis
speech recognition
prosodic features
vocal tract
feature selection
text-to-speech
automatic speech recognition
hidden Markov models
speech signal
speech corpus
real time
video data
data access
remote access
language model
distributed data management
speaker identification
speaker dependent
storage management
noisy environments
pattern recognition
machine learning
speaker verification
speaker adaptation
replicated data
broadcast news
feature set
neural network