Zero-Shot Action Recognition with Transformer-based Video Semantic Embedding.
Keval Doshi, Yasin Yilmaz
Published in: CVPR Workshops (2023)
Keyphrases
- action recognition
- human actions
- action classification
- spatial temporal
- video dataset
- action detection
- static images
- recognizing human actions
- motion features
- recognition of human actions
- human activities
- bag of words
- space time interest points
- activity recognition
- human detection
- semantic concepts
- video data
- video sequences
- mid level
- view invariant
- computer vision
- bag of features
- video frames
- video streams
- space time
- multimedia
- human pose
- video retrieval
- video analysis
- body parts
- depth sensors
- motion history images
- spatio temporal
- high level
- object categories
- video content
- visual words
- semantic information
- visual features
- image classification
- object detection
- recognizing actions
- view invariant action recognition