Video-Text Pre-training with Learned Regions for Retrieval.
Rui Yan, Mike Zheng Shou, Yixiao Ge, Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang
Published in: AAAI (2023)
Keyphrases
- multimedia search
- news video
- multimedia documents
- video search
- information retrieval
- video collections
- text retrieval
- video indexing
- video sequences
- video content
- multimedia
- semantic content
- content based indexing
- content based video retrieval
- video dataset
- document analysis
- text detection
- video data
- retrieval engine
- video segments
- image database
- video retrieval
- video database
- audio content
- multimedia information
- natural language descriptions
- visual content
- set of training images
- textual descriptions
- cross media
- dynamic textures
- content based retrieval
- structured documents
- video analysis
- unsupervised manner
- video scene
- multimedia data
- video frames
- text collections
- text regions
- multimedia content
- cut detection
- key frames
- document retrieval
- conceptual retrieval
- video streams
- training set
- relevance feedback
- text mining
- query expansion
- test collection
- video objects
- event detection
- video clips
- video shots
- handwritten documents
- retrieval model
- retrieval systems
- image search
- semantic concepts
- input image
- image retrieval
- visual concepts