Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision.
Andrew Shin
Masato Ishii
Takuya Narihira
Published in:
Int. J. Comput. Vis. (2022)
Keyphrases
cross modal
multi modal
computer vision
image retrieval
perceptual information
multimedia databases
visual data
multimedia retrieval
knowledge base
high level
text classification
visual recognition