VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding.
Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu
Published in: CoRR (2023)
Keyphrases
- video content
- multimedia
- video sequences
- video data
- real time
- video frames
- video streams
- image sequences
- real time video
- multimedia data
- video surveillance
- spatial temporal
- global alignment
- online video
- event recognition
- deeper understanding
- video shots
- video database
- video analysis
- video clips
- pairwise
- three dimensional
- data sets