Audio-visual continuous speech recognition using MPEG-4 compliant visual features.
Petar S. Aleksic, Jay J. Williams, Zhilin Wu, Aggelos K. Katsaggelos
Published in: ICIP (1) (2002)
Keyphrases
- audio visual
- visual features
- visual information
- visual descriptors
- visual data
- visual content
- multimedia
- image classification
- multi modal
- low level
- image collections
- image retrieval
- keywords
- image annotation
- low level features
- speaker verification
- audio features
- human actions
- key frames
- feature extraction
- video sequences
- image representation
- search engine