Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection.
Yuxin Fang, Shusheng Yang, Shijie Wang, Yixiao Ge, Ying Shan, Xinggang Wang
Published in: CoRR (2022)
Keyphrases
- object detection
- computer vision
- image features
- image data
- image classification
- image representation
- visual perception
- scene understanding
- image content
- input image
- image retrieval
- image analysis
- low level image processing
- image segmentation
- image processing
- multiscale
- single image
- image matching
- face detection
- low level
- hough transform
- similarity measure
- image regions
- image pixels
- image set
- region of interest
- keypoints
- segmentation method
- real time
- edge detection
- image structure
- vision system
- image synthesis
- scene recognition
- high resolution