Audio-Visual Prosody: Perception, Detection, and Synthesis of Prominence.
Samer Al Moubayed
Jonas Beskow
Björn Granström
David House
Published in:
COST 2102 Training School (2010)
Keyphrases
audio visual
multi modal
multi stream
temporal segmentation
visual information
emotion recognition
visual data
audio visual speech recognition
speaker verification
multimedia
person authentication
video summarization
temporal context
multimodal fusion
context aware
nearest neighbor
image retrieval
databases