Omnidirectional Audio-Visual Talker Localization Based on Dynamic Fusion of Audio-Visual Features Using Validity and Reliability Criteria.
Yuki Denda, Takanobu Nishiura, Yoichi Yamashita
Published in: IEICE Trans. Inf. Syst. (2008)
Keyphrases
- audio visual
- visual features
- visual information
- visual data
- audio features
- multimodal fusion
- visual content
- image classification
- low level
- image retrieval
- multi modal
- low level features
- emotion recognition
- image annotation
- audio visual speech recognition
- multi stream
- keywords
- image collections
- multimedia
- key frames
- sound source
- eye movements
- speaker verification
- human actions
- speech recognition