Multimodal Speech Emotion Recognition Using Audio and Text.
Seunghyun Yoon, Seokhyun Byun, Kyomin Jung
Published in: SLT (2018)
Keyphrases
- text to speech synthesis
- audio visual
- emotion recognition
- text to speech
- multimodal fusion
- multimodal interfaces
- multimodal interaction
- multi modal
- multi stream
- emotional speech
- speech synthesis
- audio stream
- prosodic features
- spoken documents
- visual information
- text graphics
- multi lingual
- multimedia
- broadcast news
- information retrieval
- audio features
- facial expressions
- spontaneous speech
- speaker identification
- multimodal information
- lexical features
- digital audio
- text input
- affect sensing
- human computer interaction
- speech recognition
- affect detection
- visual data
- story segmentation
- speaker verification
- emotion classification
- human language
- text recognition
- automatic transcription
- audio recordings
- audio video
- text data
- emotional state
- text mining
- affective states
- high robustness
- audio signals
- cepstral features