Keyphrases
- audio-visual
- prosodic features
- speech recognition
- speech synthesis
- multimedia
- spectral features
- multimodal
- real images are presented
- multi-party
- text-to-speech
- multi-stream
- visual information
- spoken language
- spontaneous speech
- automatic speech recognition
- dialogue system
- emotion recognition
- language acquisition
- speaker verification
- low-level
- learning algorithm
- speech processing
- natural language processing
- visual data