Combined CNN Transformer Encoder for Enhanced Fine-grained Human Action Recognition.
Mei Chee Leong, Haosong Zhang, Hui Li Tan, Liyuan Li, Joo Hwee Lim
Published in: CoRR (2022)
Keyphrases
- fine grained
- action recognition
- coarse grained
- static images
- human movements
- human actions
- activity recognition
- bag of words
- human activities
- human detection
- motion capture data
- action classification
- human object interactions
- motion history images
- access control
- body parts
- computer vision
- view invariant
- bag of features
- spatial temporal
- depth sensors
- recognizing human actions
- recognizing actions
- action detection
- recognition of human actions
- action primitives
- human pose
- motion estimation
- video dataset
- depth map
- bit rate
- pairwise