Talking face generation driven by time-frequency domain features of speech audio.
Jiye Zhang, Yazhi Liu, Xiong Li, Wei Li, Ying Tang
Published in: Displays (2023)
Keyphrases
- audio visual
- person authentication
- audio features
- cepstral features
- multimodal fusion
- visual speech
- signal processing
- feature set
- spectral features
- domain specific
- feature extraction
- audio stream
- acoustic signals
- hidden Markov models
- recognition engine
- broadcast news
- image features
- emotion recognition
- human faces
- text to speech
- low level
- wavelet transform
- feature space
- keypoints
- speech recognition