Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers.
Stella Frank
Emanuele Bugliarello
Desmond Elliott
Published in:
CoRR (2021)
Keyphrases
cross-modal
computer vision
multimodal
natural language
high-level