Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone.
Zi-Yi Dou, Aishwarya Kamath, Zhe Gan, Pengchuan Zhang, Jianfeng Wang, Linjie Li, Zicheng Liu, Ce Liu, Yann LeCun, Nanyun Peng, Jianfeng Gao, Lijuan Wang
Published in: NeurIPS (2022)
Keyphrases
- coarse to fine
- multiscale
- multiresolution
- hierarchical segmentation
- object detection
- image registration
- active shape model
- training set
- dynamic programming
- computer vision
- feature correspondences
- matching scheme
- hierarchical representation
- image pyramid
- optical flow estimation
- image processing
- wavelet transform
- natural images
- object recognition