Identifying emotion in speech prosody using acoustical cues of harmony.
Takashi X. Fujisawa, Norman D. Cook
Published in: INTERSPEECH (2004)
Keyphrases
- prosodic features
- text to speech
- speech synthesis
- text to speech synthesis
- audio visual
- multimodal fusion
- emotion recognition
- speaker verification
- speech recognition
- emotional state
- emotional speech
- multi stream
- multimodal interaction
- multimodal interfaces
- vocal tract
- facial expressions
- information retrieval
- automatic speech recognition
- spontaneous speech
- synthesized speech
- pattern recognition
- visual information
- audio signal
- prior knowledge
- speech signal