CLIP-guided Prototype Modulating for Few-shot Action Recognition.
Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang
Published in: CoRR (2023)
Keyphrases
- action recognition
- human actions
- bag of words
- activity recognition
- computer vision
- spatial temporal
- human detection
- key frames
- video shots
- action classification
- body parts
- video clips
- visual features
- recognizing human actions
- bag of features
- video sequences
- low level features
- view invariant
- depth sensors
- video data
- human pose
- video indexing
- visual words
- human activities
- action primitives
- low level
- recognizing actions
- independent subspace analysis
- static images
- human body
- detection algorithm
- human computer interaction
- image classification
- viewpoint
- keywords