VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding.

Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu

Published in: CoRR (2023)

Keyphrases

video content
multimedia
video sequences
video data
real time
video frames
video streams
image sequences
real time video
multimedia data
video surveillance
spatial temporal
global alignment
online video
event recognition
deeper understanding
video shots
video database
video analysis
video clips
pairwise
three dimensional
data sets