Text-Conditioned Resampler For Long Form Video Understanding.
Bruno Korbar, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari
Published in: CoRR (2023)
Keyphrases
- video sequences
- natural language descriptions
- video streams
- video frames
- real time
- video data
- information retrieval
- video content
- machine readable form
- computer vision
- text mining
- spatial and temporal
- text retrieval
- database
- video segments
- text detection
- video collections
- text documents
- search engine
- keywords
- video search
- real time video
- closed captions
- image retrieval