A Cantonese Speech-Driven Talking Face Using Translingual Audio-to-Visual Conversion.
Lei Xie, Helen Meng, Zhi-Qiang Liu
Published in: ISCSLP (2006)
Keyphrases
- visual speech
- visual information
- audio visual
- hidden Markov models
- speaker identification
- visual data
- audio stream
- recognition engine
- broadcast news
- text to speech
- audio signals
- speech recognition
- speech signal
- visual features
- content based video retrieval
- emotion recognition
- audio features
- noisy environments
- acoustic features
- cross modal
- human faces
- linear predictive coding
- digital audio
- prosodic features
- multimodal fusion
- cepstral features
- acoustic signals
- multi stream
- feature extraction
- speech processing
- video signals
- recognition algorithm
- low level
- speaker verification
- speech synthesis
- audio video
- audio signal
- music information retrieval
- news video
- audio recordings
- multi modal
- face images
- facial expressions
- multimedia
- speech music discrimination