Speech2Action: Cross-Modal Supervision for Action Recognition.
Arsha Nagrani, Chen Sun, David Ross, Rahul Sukthankar, Cordelia Schmid, Andrew Zisserman
Published in: CVPR (2020)
Keyphrases
- action recognition
- cross modal
- human actions
- action classification
- visual data
- recognizing human actions
- recognizing actions
- multi modal
- recognition of human actions
- view invariant
- action primitives
- action detection
- bag of words
- space time interest points
- activity recognition
- image retrieval
- multimedia retrieval
- multimedia databases
- computer vision
- bag of features
- human activities
- human pose
- visual similarity
- motion history images
- spatio temporal
- human motion
- visual features
- principal component analysis
- video sequences
- high level