Real-time semi-blind speech extraction with speaker direction tracking on Kinect.
Yuji Onuma, Noriyoshi Kamado, Hiroshi Saruwatari, Kiyohiro Shikano
Published in: APSIPA (2012)
Keyphrases
- real time
- speech recognition
- depth cameras
- real time tracking
- audio visual
- speaker recognition
- automatic speech recognition
- moving target
- vocal tract
- face detection and tracking
- speaker dependent
- speaker verification
- speech synthesis
- depth data
- prosodic features
- automatic speech recognition systems
- microsoft kinect
- depth information
- speech signal
- speaker identification
- particle filtering
- speaker diarization
- speech recognizer
- detecting and tracking multiple
- beating heart
- video rate
- motion tracking
- camera tracking
- surveillance videos
- depth images
- depth map
- super resolution
- infrared camera
- vision system
- image sequences