ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.

Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee

Published in: NeurIPS (2019)

Keyphrases

real time
computer vision
vision system
neural network
language learning
visually guided
semantic representations
service robots
visual perception
language processing
higher level
programming language
image processing
object oriented
database
mobile robot
natural language
robotic systems
context dependent
symbolic representation
high level
website
target language
english language
knowledge base
genetic algorithm