Improving Vision-and-Language Navigation with Image-Text Pairs from the Web.
Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
Published in: CoRR (2020)
Keyphrases
- web images
- image retrieval
- image data
- single image
- input image
- web documents
- text information
- visual perception
- multiscale
- image features
- image classification
- image content
- website
- image representation
- segmentation method
- keywords
- low level
- low level image processing
- database
- test images
- image collections
- semantic information
- computational linguistics
- information retrieval
- image segmentation
- programming language
- feature points
- similarity measure
- language generation
- web pages
- image processing
- textual data
- pixel values
- image annotation
- text retrieval
- image analysis
- high resolution
- edge detection
- segmentation algorithm
- vision system
- region of interest
- pairwise
- image regions
- image pixels
- textual information
- web applications
- metadata
- web scale
- web navigation
- computer vision