Is a Video worth n×n Images? A Highly Efficient Approach to Transformer-based Video Question Answering.
Chenyang Lyu
Tianbo Ji
Yvette Graham
Jennifer Foster
Published in:
CoRR (2023)
Keyphrases
question answering
highly efficient
video data
visual data
video sequences
natural language processing
information retrieval
image retrieval
information extraction
video content
multimedia
named entities
video frames
real time
open domain question answering
image annotation
wordnet
low cost
optical flow