Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation.

Published in: ACL/IJCNLP (Findings) (2021)

Keyphrases