A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation.
Louis Airale, Dominique Vaufreydaz, Xavier Alameda-Pineda
Published in: CoRR (2023)
Keyphrases
- multiscale
- audio visual
- speech recognition
- scale space
- real time
- edge detection
- dynamic model
- data sets
- emotion recognition
- wavelet transform
- image processing
- natural images
- visual information
- coarse to fine
- multi modal
- speech signal
- automatic speech recognition
- spoken language
- broadcast news
- speech synthesis
- generation process
- facial animation
- text to speech
- hand movements
- recognition engine