Shrinking Temporal Attention in Transformers for Video Action Recognition.
Bonan Li, Pengfei Xiong, Congying Han, Tiande Guo
Published in: AAAI (2022)
Keyphrases
- action recognition
- spatial temporal
- human actions
- action classification
- motion history images
- video dataset
- action detection
- bag of words
- recognition of human actions
- atomic actions
- human activities
- static images
- temporal information
- recognizing human actions
- activity recognition
- human detection
- computer vision
- motion features
- spatio temporal
- spatial and temporal
- body parts
- space time
- space time interest points
- video sequences
- depth sensors
- video data
- action recognition in videos
- bag of features
- human pose
- mid level
- temporal resolution
- multimedia
- human motion
- view invariant
- video frames
- action primitives
- recognizing actions
- machine learning
- video analysis
- view invariant action recognition
- visual features
- motion trajectories
- key frames
- video content
- video shots