Combining acoustic and visual features to detect laughter in adults' speech.
Hrishikesh Rao, Zhefan Ye, Yin Li, Mark A. Clements, Agata Rozga, James M. Rehg
Published in: AVSP (2015)
Keyphrases
- visual features
- acoustic features
- visual information
- audio visual
- prosodic features
- image classification
- visual content
- image retrieval
- speaker verification
- audio features
- visual data
- image annotation
- image search
- low level
- low level features
- keywords
- content based video retrieval
- visual appearance
- semantic features
- speech recognition
- speech synthesis
- image collections
- text to speech
- bridge the semantic gap
- web images
- automatic speech recognition
- multi modal
- global features
- speech sounds
- visual descriptors
- speech signal
- semantic concepts
- key frames
- low level visual features
- visual properties
- video shots