Zero-shot urban function inference with street view images through prompting a pretrained vision-language model.
Weiming Huang, Jing Wang, Gao Cong
Published in: Int. J. Geogr. Inf. Sci. (2024)
Keyphrases
- language model
- street view
- language modeling
- n gram
- image data
- image features
- probabilistic model
- document retrieval
- image database
- input image
- retrieval model
- text detection
- image collections
- test collection
- image classification
- query expansion
- object recognition
- mixture model
- image content
- image annotation
- multiple images
- aerial images
- video sequences
- image retrieval
- complex background
- web images
- urban areas
- query terms
- feature points
- image registration
- search engine