Revolutionizing Urban Safety Perception Assessments: Integrating Multimodal Large Language Models with Street View Images.
Jiaxin Zhang, Yunqin Li, Tomohiro Fukuda, Bowen Wang
Published in: CoRR (2024)
Keyphrases
- language model
- street view
- n-gram
- image data
- text detection
- input image
- language modeling
- document retrieval
- image classification
- probabilistic model
- image regions
- image database
- image retrieval
- retrieval model
- query expansion
- test collection
- information retrieval
- multimodal
- image collections
- feature points
- vector space model
- region of interest
- query terms
- image understanding
- multiple images
- smoothing methods