Speech inpainting: Context-based speech synthesis guided by video.
Juan F. Montesinos, Daniel Michelsanti, Gloria Haro, Zheng-Hua Tan, Jesper Jensen
Published in: CoRR (2023)
Keyphrases
- speech synthesis
- speech recognition
- text-to-speech
- vocal tract
- video data
- prosodic features
- speech corpus
- video sequences
- video streams
- video retrieval
- video frames
- video content
- video analysis
- information retrieval
- video processing
- real time
- multimedia
- audio video
- hidden Markov models
- automatic speech recognition
- video clips
- multimedia data
- space time
- image restoration
- event detection
- digital video
- key frames
- language model
- machine learning