Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information.
Jialu Li, Hao Tan, Mohit Bansal
Published in: NAACL-HLT (2021)
Keyphrases
- cross-modal
- syntactic information
- multi-modal
- question answering
- semantic role labeling
- multimedia retrieval
- semantic information
- computer vision
- visual recognition
- part of speech
- multimedia databases
- semantic roles
- image retrieval
- natural language
- conceptual graphs
- image classification
- parse tree
- object recognition
- visual data
- co-occurrence
- relevance feedback
- low-level