Optimizing ViViT Training: Time and Memory Reduction for Action Recognition.
Shreyank N. Gowda, Anurag Arnab, Jonathan Huang
Published in: CoRR (2023)
Keyphrases
- body parts
- action recognition
- human actions
- bag of words
- human pose
- human detection
- computer vision
- activity recognition
- action classification
- static images
- recognizing human actions
- spatial temporal
- video dataset
- recognition of human actions
- mid level
- action primitives
- depth sensors
- action recognition in videos
- view invariant
- training examples
- action detection
- bag of features
- human activities
- recognizing actions
- viewpoint