Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection.
Shuo Yang, Yongqi Wang, Xiaofeng Ji, Xinxiao Wu
Published in: AAAI (2024)
Keyphrases
- multi modal
- concept detectors
- video search
- semantic concepts
- cross modal
- concept detection
- multiple modalities
- visual data
- multi modality
- visual concepts
- audio visual
- video data
- visual information
- video sequences
- visual cues
- single modality
- video content
- spatial and temporal
- video frames
- video analysis
- visual features
- high dimensional
- fusing multiple
- humanoid robot
- multimedia
- video retrieval
- image annotation
- event detection
- video shots
- broadcast news
- multimedia data
- video streams
- keywords