PG-Video-LLaVA: Pixel Grounding Large Video-Language Models.
Shehan Munasinghe
Rusiru Thushara
Muhammad Maaz
Hanoona Abdul Rasheed
Salman Khan
Mubarak Shah
Fahad Khan
Published in:
CoRR (2023)
Keyphrases
language model
video sequences
video content
video data
language modeling
multimedia
video retrieval
key frames
video frames
speech recognition
n-gram
retrieval model
machine learning
video shots
web search
relevance feedback
probabilistic model
language modelling