HawkEye: Training Video-Text LLMs for Grounding Text in Videos.
Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao
Published in: CoRR (2024)
Keyphrases
- natural language descriptions
- information retrieval
- video collections
- video data
- text mining
- video search
- video database
- video sequences
- human activities
- video segments
- video content
- video representation
- video retrieval
- news video
- text documents
- dynamic scenes
- video editing
- video shots
- youtube videos
- moving camera
- video clips
- video frames
- space time
- training set
- keywords
- multimedia documents
- video surveillance
- video summarization
- key frames
- video dataset
- high definition
- multimedia