Multimodal attention networks for low-level vision-and-language navigation.

Published in: Comput. Vis. Image Underst. (2021)

Keyphrases