P2T: Pyramid Pooling Transformer for Scene Understanding.
Yu-Huan Wu, Yun Liu, Xin Zhan, Ming-Ming Cheng
Published in: CoRR (2021)
Keyphrases
- scene understanding
- object detection
- object recognition
- vision system
- 3D scene
- spatial pyramid matching
- scene recognition
- robot navigation
- video surveillance
- image representation
- scene categorization
- bag of words
- multiscale
- scene labeling
- computer vision
- geometric reasoning
- scale space
- indoor scenes
- scene classification
- image classification
- object class
- activity recognition
- graph cuts
- image parsing
- dynamic programming
- viewpoint