UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection.
Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, Xiaohu Qie
Published in: CVPR (2022)
Keyphrases
- multi modal
- video search
- cross modal
- cut detection
- video indexing
- semantic concepts
- audio visual
- multi modality
- multiple modalities
- news video
- video data
- video content
- video sequences
- information retrieval
- image database
- video frames
- event detection
- image annotation
- high dimensional
- uni modal
- content based retrieval
- video retrieval
- retrieval systems
- information retrieval systems
- image retrieval
- multimedia
- video analysis
- video database
- video clips
- multimedia documents
- multimedia data
- video streams
- image representation