WLiT: Windows and Linear Transformer for Video Action Recognition.
Ruoxi Sun, Tianzhao Zhang, Yong Wan, Fuping Zhang, Jianming Wei
Published in: Sensors (2023)
Keyphrases
- action recognition
- human actions
- action classification
- video dataset
- spatial temporal
- action detection
- static images
- spatio temporal interest points
- recognizing human actions
- human activities
- recognition of human actions
- space time interest points
- bag of words
- motion features
- activity recognition
- mid level
- computer vision
- human detection
- body parts
- motion history images
- view invariant
- bag of features
- multimedia
- video sequences
- video frames
- video content
- video clips
- depth sensors
- video analysis
- human motion
- max margin
- space time
- video images
- action primitives
- recognizing actions
- three dimensional
- action recognition in videos
- machine learning