Keyphrases
- prosodic features
- multimodal fusion
- audio-visual
- speaker verification
- emotion recognition
- facial expressions
- visual cues
- speech recognition
- speaker diarization
- speaker recognition
- virtual agents
- multimodal
- information systems
- human emotion
- emotional speech
- computer vision
- appearance cues
- text-to-speech synthesis
- human decision-making
- high robustness
- text-to-speech
- speaker identification
- automatic speech recognition
- sentiment analysis
- visual features
- pattern recognition