TransVOD: End-to-end Video Object Detection with Spatial-Temporal Transformers.
Qianyu Zhou, Xiangtai Li, Lu He, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lizhuang Ma, Dacheng Tao
Published in: CoRR (2022)
Keyphrases
- spatial temporal
- end to end
- object detection
- video shots
- spatial and temporal
- temporal information
- action recognition
- spatio temporal
- computer vision
- scalable video
- temporal correlation
- object recognition
- human actions
- spatial information
- video data
- congestion control
- spatial and temporal information
- transport layer