Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation.
Simone Rossetti, Damiano Zappia, Marta Sanzari, Marco Schaerf, Fiora Pirri
Published in: CoRR (2022)
Keyphrases
- weakly supervised
- semantic segmentation
- superpixels
- object class
- object classes
- topic models
- relation extraction
- conditional random fields
- semi supervised
- object recognition
- multiscale
- pascal voc
- object detection
- computer vision
- named entities
- scene classification
- shape model
- image processing
- bounding box
- pairwise
- input image
- viewpoint