A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension.
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai
Published in: CoRR (2023)
Keyphrases
- video retrieval
- cross-modal
- reading comprehension
- multi-modal
- visual content
- multimedia retrieval
- content-based retrieval
- visual similarity
- video data
- semantic gap
- computer-assisted
- retrieval systems
- video clips
- visual data
- multimedia databases
- image retrieval
- video content
- key frames
- multimedia
- co-occurrence
- human actions
- information retrieval