Non-audible murmur recognition based on fusion of audio and visual streams.
Panikos Heracleous, Norihiro Hagita
Published in: INTERSPEECH (2010)
Keyphrases
- visual information
- visual data
- visual learning
- visual processing
- visual speech
- visual perception
- recognition rate
- recognition accuracy
- noisy environments
- human face recognition
- multimedia
- cross modal
- single modality
- object recognition
- multi stream
- audio stream
- video files
- visual features
- multi modality
- multimodal fusion
- audio visual
- feature fusion
- automatic transcription
- visual recognition
- gesture recognition
- feature extraction
- recognition algorithm
- pattern recognition
- data streams
- signal processing
- multi modal
- recognition process
- real time
- environmental sounds
- character recognition
- action recognition
- low level
- image retrieval
- image fusion
- automatic target recognition
- speaker identification
- fusion method