ViLT-CLIP: Video and Language Tuning CLIP with Multimodal Prompt Learning and Scenario-Guided Optimization.
Hao Wang, Fang Liu, Licheng Jiao, Jiahao Wang, Zehua Hao, Shuo Li, Lingling Li, Puhua Chen, Xu Liu
Published in: AAAI (2024)
Keyphrases
- video clips
- learning systems
- language acquisition
- learning algorithm
- real world
- learning process
- learning problems
- optimization algorithm
- neural network
- multimodal
- knowledge acquisition
- real time
- interactive video
- online learning
- active learning
- video data
- prior knowledge
- optimization method
- event detection
- evolutionary algorithm
- reinforcement learning