An Effective Video Transformer With Synchronized Spatiotemporal and Spatial Self-Attention for Action Recognition.
Saghir Alfasly, Charles K. Chui, Qingtang Jiang, Jian Lu, Chen Xu
Published in: IEEE Trans. Neural Networks Learn. Syst. (2024)
Keyphrases
- action recognition
- spatial temporal
- human actions
- action classification
- spatial and temporal
- space time
- spatio temporal
- video dataset
- action detection
- bag of words
- motion features
- recognizing human actions
- activity recognition
- computer vision
- human detection
- recognition of human actions
- static images
- mid level
- video frames
- video sequences
- video data
- human activities
- video shots
- space time interest points
- body parts
- depth sensors
- action primitives
- human pose
- human motion
- motion history images
- video content
- human body
- multimedia