Modeling Paragraph-Level Vision-Language Semantic Alignment for Multi-Modal Summarization.
Xinnian Liang
Chenhao Cui
Shuangzhi Wu
Jiali Zeng
Yufan Jiang
Zhoujun Li
Published in:
CoRR (2022)
Keyphrases
multi modal
semantic concepts
natural language
multi modality
audio visual
cross modal
higher level
video search
computer vision
semantic web
document level
multimedia
high dimensional