AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization.
Shixiong Xu, Chenghao Zhang, Lubin Fan, Gaofeng Meng, Shiming Xiang, Jieping Ye
Published in: CoRR (2024)
Keyphrases
- language model
- image segmentation
- image data
- language modeling
- statistical language models
- image retrieval
- image representation
- document retrieval
- language modelling
- image regions
- image features
- probabilistic model
- image classification
- speech recognition
- n gram
- vector space model
- query expansion
- statistical model
- low level
- similarity measure
- machine learning
- image content
- information retrieval systems
- query specific
- information retrieval