MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment.
Anurag Das, Xinting Hu, Li Jiang, Bernt Schiele
Published in: CoRR (2024)
Keyphrases
- semantic segmentation
- street scenes
- superpixels
- conditional random fields
- label transfer
- weakly supervised
- scene classification
- text mining
- natural language
- object classes
- bounding box
- object categories
- information retrieval
- image understanding
- long range
- pascal voc
- object class
- object recognition
- keywords
- semantic information
- natural images
- input image
- feature vectors
- image segmentation
- computer vision
- machine learning