Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Runpei Dong, Zekun Qi, Linfeng Zhang, Junbo Zhang, Jianjian Sun, Zheng Ge, Li Yi, Kaisheng Ma
Published in: CoRR (2022)
Keyphrases
- perceptual information
- cross-modal
- learning process
- image representation
- image retrieval
- image data
- image features
- image segmentation
- multiscale
- learning algorithm
- supervised learning
- multi-modal
- image content
- spatial relations
- low level
- image classification
- object recognition
- co-occurrence
- spatial information
- test images
- visual recognition
- similarity measure