D-MmT: A concise decoder-only multi-modal transformer for abstractive summarization in videos.
Nayu Liu
Xian Sun
Hongfeng Yu
Wenkai Zhang
Guangluan Xu
Published in:
Neurocomputing (2021)
Keyphrases
multi modal
video search
video summarization
audio visual
sports video
semantic concepts
video frames
multi modality
image annotation
error concealment
video content
video database
video sequences
high dimensional
cross modal
video data
humanoid robot