End-to-end Video-level Representation Learning for Action Recognition.
Jiagang Zhu, Wei Zou, Zheng Zhu, Lin Li
Published in: CoRR (2017)
Keyphrases
- end to end
- action recognition
- human actions
- recognizing human actions
- spatial temporal
- recognition of human actions
- bag of words
- action classification
- video dataset
- spatio temporal interest points
- static images
- view invariant
- computer vision
- human detection
- scalable video
- action recognition in videos
- video streams
- video analysis
- activity recognition
- bag of features
- image representation
- multi view
- image quality
- reinforcement learning