CAST: Cross-Attention in Space and Time for Video Action Recognition.
Dongho Lee, Jongseo Lee, Jinwoo Choi
Published in: NeurIPS (2023)
Keyphrases
- action recognition
- human actions
- action classification
- spatial temporal
- space time
- video dataset
- action detection
- recognizing human actions
- recognition of human actions
- spatio temporal interest points
- motion features
- human detection
- static images
- space time interest points
- bag of words
- video sequences
- activity recognition
- human activities
- body parts
- motion history images
- recognizing actions
- mid level
- spatio temporal
- multimedia
- computer vision
- action primitives
- motion capture data
- video frames
- video streams
- human pose
- spatial and temporal
- bag of features
- video surveillance
- image sequences
- video content
- human motion