FSformer: Fast-Slow Transformer for video action recognition.
Shibao Li, Zhaoyu Wang, Yixuan Liu, Yunwu Zhang, Jinze Zhu, Xuerong Cui, Jianhang Liu
Published in: Image Vis. Comput. (2023)
Keyphrases
- action recognition
- human actions
- action classification
- spatial temporal
- video dataset
- action detection
- recognition of human actions
- recognizing human actions
- motion features
- human activities
- activity recognition
- spatio temporal interest points
- static images
- bag of words
- space time interest points
- computer vision
- motion history images
- human detection
- mid level
- human pose
- video data
- body parts
- bag of features
- recognizing actions
- video sequences
- view invariant action recognition
- motion capture data
- video content
- multimedia
- video images
- depth sensors
- human motion
- video frames
- space time
- temporal structure
- human activity recognition
- view invariant
- video clips
- event detection
- action recognition in videos
- image sequences