Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer.
Guangyi Chen, Xiao Liu, Guangrun Wang, Kun Zhang, Philip H. S. Torr, Xiao-Ping Zhang, Yansong Tang
Published in: ICCV (2023)
Keyphrases
- question answer
- single image
- image data
- image features
- image content
- image representation
- multiscale
- image classification
- image analysis
- input image
- low level
- image retrieval
- information retrieval
- image collections
- image segmentation
- image regions
- video streams
- high resolution
- semantic labels
- text detection
- video sequences
- image sequences
- multimedia
- search engine
- image frames
- visual concepts
- visual data
- video files
- machine learning
- textual descriptions
- multimedia documents
- edge detection