SVTS: Scalable Video-to-Speech Synthesis.

Rodrigo Schoburg Carrillo de Mira, Alexandros Haliassos, Stavros Petridis, Björn W. Schuller, Maja Pantic

Published in: INTERSPEECH (2022)

Keyphrases

speech synthesis
scalable video
end-to-end
speech recognition
bitstream
text-to-speech
video transmission over wireless
video streaming
video quality
vocal tract
scalable video coding
joint source and channel coding
base layer
video transmission
video coding
bit rate
multiresolution
neural network
language model
feature vectors
computer vision
information retrieval