BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization.
Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang
Published in: ICCV (2023)
Keyphrases
- cost effective
- computationally efficient
- high speed
- highly efficient
- programming language
- language learning
- computationally expensive
- supervised learning
- real time
- vision system
- image patches
- online learning
- computer vision
- neural network
- feature vectors
- training examples
- test set
- natural language
- video sequences
- image processing