ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.
Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee
Published in: NeurIPS (2019)
Keyphrases
- real time
- computer vision
- vision system
- neural network
- language learning
- visually guided
- semantic representations
- service robots
- visual perception
- language processing
- higher level
- programming language
- image processing
- object oriented
- database
- mobile robot
- natural language
- robotic systems
- context dependent
- symbolic representation
- high level
- website
- target language
- english language
- knowledge base
- genetic algorithm