Predicting Valence and Arousal by Aggregating Acoustic Features for Acoustic-Linguistic Information Fusion.
Bagus Tris Atmaja, Yasuhiro Hamada, Masato Akagi
Published in: TENCON (2020)
Keyphrases
- information fusion
- acoustic features
- emotion recognition
- speaker verification
- emotional state
- visual features
- data fusion
- speech signal
- music information retrieval
- audio features
- automatic speech recognition
- mel frequency cepstral coefficients
- environmental sounds
- soft computing
- speaker recognition
- cross correlation
- audio visual
- natural language processing
- low level
- multi sensor information fusion
- mental states
- noisy environments
- artificial intelligence
- multilayer perceptron
- speech recognition
- computational intelligence
- video sequences