Combining semantic and acoustic features for valence and arousal recognition in speech.
Seliz Gulsen Karadogan, Jan Larsen
Published in: CIP (2012)
Keyphrases
- acoustic features
- environmental sounds
- mel-frequency cepstral coefficients
- speech signal
- automatic speech recognition
- speaker verification
- emotional state
- visual speech
- visual features
- audio features
- emotion recognition
- speech recognition
- music information retrieval
- noisy environments
- object recognition
- natural language
- cross correlation
- pattern recognition
- audio stream
- action recognition
- non-stationary
- cepstral coefficients
- audio visual
- speaker recognition
- feature extraction
- high level
- multi modal
- feature set
- image features
- hidden Markov models
- artificial neural networks