Improving Vision-and-Language Navigation with Image-Text Pairs from the Web.
Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
Published in: ECCV (6) (2020)
Keyphrases
- web images
- text information
- image data
- input image
- image features
- single image
- image content
- multiscale
- image analysis
- image classification
- web documents
- database
- low level
- high resolution
- image retrieval
- image representation
- image segmentation
- region of interest
- image pixels
- information retrieval
- web applications
- website
- language generation
- low level image processing
- information space
- image processing
- similarity measure
- natural language
- pairwise
- vision system
- image regions
- object recognition
- test images
- image annotation
- feature points
- textual information
- computer vision
- complex background
- textual descriptions
- search engine
- relevance feedback
- programming language