Multi-dataset Training of Transformers for Robust Action Recognition.
Junwei Liang, Enwei Zhang, Jun Zhang, Chunhua Shen
Published in: NeurIPS (2022)
Keyphrases
- action recognition
- human actions
- view invariant
- video dataset
- bag of words
- ucf sports
- computer vision
- activity recognition
- human detection
- action recognition in videos
- body parts
- action classification
- static images
- recognition of human actions
- bag of features
- recognizing human actions
- depth sensors
- space time interest points
- spatio temporal
- human activities
- recognizing actions
- training set
- video sequences
- human pose
- action detection
- human motion
- feature descriptors
- three dimensional