Speech Reconstruction With Reminiscent Sound Via Visual Voice Memory.

Joanna Hong, Minsu Kim, Se Jin Park, Yong Man Ro

Published in: IEEE/ACM Trans. Audio Speech Lang. Process. (2021)

Keyphrases

text to speech
fundamental frequency
emotion recognition
speech synthesis
visual features
speech sounds
speech recognition
memory usage
voice activity detection
text to speech synthesis
speech recognition errors
high level
automatic speech recognition systems
memory requirements
visual information
reconstruction method
acoustic features
content based video retrieval
speech quality
low level
three dimensional
compressive sensing
dialogue system
spoken language
speech signal
audio visual
multi modal