E2E-V2SResNet: Deep residual convolutional neural networks for end-to-end video driven speech synthesis.
Nasir Saleem, Jiechao Gao, Muhammad Irfan, Elena Verdú, Javier Parra Fuente
Published in: Image Vis. Comput. (2022)
Keyphrases
- end to end
- speech synthesis
- convolutional neural networks
- scalable video
- speech recognition
- text to speech
- multimedia
- video data
- video sequences
- vocal tract
- wireless ad hoc networks
- video frames
- video content
- video streams
- convolutional network
- high bandwidth
- multipath
- admission control
- congestion control
- internet protocol
- real world
- transport layer
- text localization and recognition
- real time
- rate allocation
- speech signal
- information retrieval
- neural network