History Aware Multimodal Transformer for Vision-and-Language Navigation.
Shizhe Chen, Pierre-Louis Guhur, Cordelia Schmid, Ivan Laptev
Published in: NeurIPS (2021)
Keyphrases
- real time
- programming language
- vision system
- multi modal
- image processing
- computer vision
- natural language
- language learning
- fuzzy logic
- fault diagnosis
- obstacle avoidance
- web navigation
- multimodal interfaces
- multimodal interaction
- navigation systems
- target language
- language processing
- indoor environments
- natural language processing
- software engineering
- neural network