Spatiotemporal visual-semantic embedding network for zero-shot action recognition.
Rongqiao An, Zhenjiang Miao, Qingyu Li, Wanru Xu, Qiang Zhang
Published in: J. Electronic Imaging (2019)
Keyphrases
- action recognition
- human actions
- mid level
- human detection
- bag of words
- action classification
- body parts
- activity recognition
- computer vision
- spatial temporal
- static images
- visual information
- human activities
- recognizing actions
- space time
- semantic information
- low level
- recognizing human actions
- object categories
- video dataset
- action detection
- view invariant
- bag of features
- high level
- visual concepts
- semantic content
- visual features
- recognition of human actions
- view invariant action recognition
- action recognition in videos
- human pose
- visual cues
- spatial and temporal
- spatio temporal
- image retrieval
- object recognition