Login / Signup

Align vision-language semantics by multi-task learning for multi-modal summarization.

Chenhao CuiXinnian LiangShuangzhi WuZhoujun Li
Published in: Neural Comput. Appl. (2024)
Keyphrases