Robot-directed speech detection using Multimodal Semantic Confidence based on speech, image, and motion.
Xiang Zuo, Naoto Iwahashi, Ryo Taguchi, Shigeki Matsuda, Komei Sugiura, Kotaro Funakoshi, Mikio Nakano, Natsuki Oka
Published in: ICASSP (2010)
Keyphrases
- image sequences
- feature points
- image motion
- object motion
- audio visual
- optical flow
- visual data
- input image
- computer vision
- motion estimation
- multimodal interfaces
- image data
- position and orientation
- speech recognition
- multi stream
- single image
- image retrieval
- moving objects
- multiscale
- broadcast news
- edge detection
- multi modal
- speech signal
- object detection
- image features
- image content
- image classification
- mobile robot
- text to speech
- humanoid robot
- vision system
- automatic speech recognition
- noisy environments
- normalized correlation
- confidence scores
- static images
- human computer interaction
- image segmentation