Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images.
Sina Zarrieß, Soledad López Gambino, David Schlangen
Published in: INLG (2017)
Keyphrases
- real world
- image analysis
- ground truth
- objects represented
- multiple objects
- image data
- three dimensional
- image database
- image features
- spatial relationships
- image classification
- geometric constraints
- deformable objects
- individual objects
- image regions
- image pixels
- input image
- keypoints
- complex scenes
- viewing conditions
- sample images
- bounding box
- viewing angle
- lighting conditions
- image registration
- image retrieval
- complex background
- object models
- rigid body
- visual data
- real world scenes
- background clutter
- illumination conditions
- detecting objects
- data sets
- target object
- multiple images
- image collections
- test images
- image matching
- spatial information
- speech recognition
- object recognition
- image segmentation
- relative position
- multiscale
- partial occlusion
- edge detection
- image segments
- image sequences