Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation.

Published in: CoRR (2022)

Keyphrases