A Bimodal Approach for Speech Emotion Recognition using Audio and Text.
Oxana Verkholyak, Anastasia Dvoynikova, Alexey Karpov
Published in: J. Internet Serv. Inf. Secur. (2021)
Keyphrases
- text to speech synthesis
- text to speech
- emotion recognition
- audio visual
- speech synthesis
- spoken documents
- text graphics
- prosodic features
- audio stream
- emotional speech
- human language
- broadcast news
- emotional state
- audio signals
- english text
- multimodal fusion
- speech processing
- human computer interaction
- spontaneous speech
- multimedia
- affect sensing
- multi lingual
- speaker identification
- speech recognition
- text recognition
- information retrieval
- speaker verification
- audio recordings
- text mining
- cross media retrieval
- automatic transcription
- keywords
- multi modal
- audio content
- text input
- content based video retrieval
- language generation
- emotion classification
- facial expressions
- information extraction
- digital audio
- cepstral features
- multimodal interaction