Steps Towards More Natural Human-Machine Interaction via Audio-Visual Word Prominence Detection.
Martin Heckmann
Published in: MA3HMI@INTERSPEECH (2014)
Keyphrases
- human-machine interaction
- visual words
- bag of words
- image classification
- gesture recognition
- spontaneous speech
- visual phrases
- visual vocabulary
- image representation
- bag of visual words
- semantic context
- discriminative power
- object categories
- recognition scheme
- keypoints
- multimedia
- focus of attention
- object detection
- image features
- spatial information
- bag of features
- scene classification
- text classification
- co-occurrence
- object retrieval
- machine learning
- object detectors
- action recognition
- feature space