A Cross-Modal Object-Aware Transformer for Vision-and-Language Navigation.
Han Ni
Jia Chen
DaYong Zhu
Dianxi Shi
Published in:
ICTAI (2022)
Keyphrases
cross modal
multi modal
natural language
3D objects
spatial relationships
visual recognition
multimedia retrieval
visual similarity
computer vision
multimedia databases
low level
low dimensional