A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization.
Kazuhiro Otsuka, Shoko Araki, Kentaro Ishizuka, Masakiyo Fujimoto, Martin Heinrich, Junji Yamato
Published in: ICMI (2008)
Keyphrases
- pose tracking
- speaker diarization
- sensor fusion
- real time
- motion capture
- face images
- pose estimation
- speech recognition
- human faces
- multimedia
- robust tracking
- facial expressions
- multi modal
- uncalibrated cameras
- facial features
- facial images
- lighting conditions
- speaker identification
- pattern recognition
- image sequences