Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision.

Published in: Int. J. Comput. Vis. (2022)

Keyphrases