Multi-Modal Structure-Embedding Graph Transformer for Visual Commonsense Reasoning.

Jian Zhu, Hanli Wang, Bin He
Published in: IEEE Trans. Multim. (2024)
Keyphrases