MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos.
Jielin Qiu, Jiacheng Zhu, William Han, Aditesh Kumar, Karthik Mittal, Claire Jin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Bo Li, Ding Zhao, Lijuan Wang
Published in: CoRR (2023)
Keyphrases
- video browsing
- video summarization
- audio visual
- human actions
- video content
- video dataset
- multimedia event detection
- video segments
- video summaries
- sports video
- action recognition
- personal photos
- video data
- multi modal
- web videos
- video sequences
- generation process
- video search
- visual information
- video annotation
- database
- multimodal fusion
- computer vision
- image classification
- multi document summarization
- text summarization
- video analysis
- video retrieval
- benchmark datasets
- event detection
- video frames