Visual Spatio-temporal Relation-enhanced Network for Cross-modal Text-Video Retrieval.
Ning Han, Jingjing Chen, Guangyi Xiao, Yawen Zeng, Chuhao Shi, Hao Chen
Published in: CoRR (2021)
Keyphrases
- cross modal
- video retrieval
- video search
- multi modal
- spatio temporal
- concept based video retrieval
- visual content
- visual similarity
- video data
- semantic gap
- multimedia retrieval
- visual data
- retrieval systems
- content based retrieval
- text retrieval
- image sequences
- multimedia databases
- key frames
- keywords
- video content
- multimedia documents
- moving objects
- human actions
- feature selection
- multimedia data
- video frames
- space time
- contextual information
- visual features
- high dimensional
- multimedia