SpaceCLIP: A Vision-Language Pretraining Framework With Spatial Reconstruction On Text.
Bo Zou, Chao Yang, Chengbin Quan, Youjian Zhao
Published in: ACM Multimedia (2023)
Keyphrases
- spatial information
- information retrieval
- computer vision
- spatial constraints
- computational linguistics
- image reconstruction
- main contribution
- vision system
- text mining
- probabilistic model
- programming language
- high resolution
- data model
- language processing
- spatio temporal
- three dimensional
- english text
- spatial knowledge
- human language
- language generation