Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition.
Erkut Akdag, Zeqi Zhu, Egor Bondarev, Peter H. N. de With
Published in: CVPR Workshops (2023)
Keyphrases
- action recognition
- human actions
- spatio temporal
- human pose
- recognizing actions
- spatial temporal
- action recognition in videos
- spatio temporal interest points
- view invariant
- recognition of human actions
- human detection
- computer vision
- action classification
- bag of words
- action detection
- pose estimation
- activity recognition
- static images
- body parts
- human activities
- recognizing human actions
- spatial and temporal
- image sequences
- space time
- machine learning
- 3D objects
- bag of features
- depth cameras
- human motion
- video dataset
- max margin
- low level
- human body
- depth sensors
- action primitives
- moving objects
- video sequences
- motion patterns