Fusion of visual and audio features for person identification in real video.
Dongge Li, Gang Wei, Ishwar K. Sethi, Nevenka Dimitrova
Published in: Storage and Retrieval for Media Databases (2001)
Keyphrases
- audio features
- person identification
- visual features
- low level
- audio visual
- visual information
- feature set
- music information retrieval
- visual data
- face recognition
- gender classification
- video data
- video sequences
- image classification
- gait recognition
- multimedia
- motion capture
- text data
- visual content
- multi modal
- image processing
- video retrieval
- human actions
- key frames
- multimedia data
- personalized recommendation
- human body
- high level
- speaker identification
- speaker recognition