A New Approach to Integrate Audio and Visual Features of Speech.
Hao Pan, Zhi-Pei Liang, Thomas S. Huang
Published in: IEEE International Conference on Multimedia and Expo (II) (2000)
Keyphrases
- visual features
- audio features
- visual information
- acoustic features
- audio visual
- content based video retrieval
- audio stream
- visual data
- image classification
- speaker identification
- visual content
- image retrieval
- low level
- image search
- keywords
- broadcast news
- bag of features
- image annotation
- emotion recognition
- low level features
- semantic concepts
- semantic features
- visual appearance
- web images
- speech recognition
- multi modal
- visual similarity
- semantic gap
- visual descriptors
- key frames
- bridge the semantic gap
- video shots
- speech signal
- labeled images
- image database
- textual features