Audio-visual speech recognition incorporating facial depth information captured by the Kinect.
Georgios Galatas, Gerasimos Potamianos, Fillia Makedon
Published in: EUSIPCO (2012)
Keyphrases
- depth information
- audio visual speech recognition
- depth map
- multi stream
- depth cameras
- depth images
- stereo vision
- depth data
- audio visual
- microsoft kinect
- kinect sensor
- rgb d camera
- facial images
- face recognition
- facial expressions
- emotion recognition
- rgbd images
- human faces
- pose estimation
- facial features
- multi view
- disparity map
- stereo matching
- depth estimation
- time of flight
- semi supervised
- visual speech
- hidden markov models
- high resolution
- high quality
- image sequences
- three dimensional