Speech2Action: Cross-modal Supervision for Action Recognition.
Arsha Nagrani, Chen Sun, David Ross, Rahul Sukthankar, Cordelia Schmid, Andrew Zisserman
Published in: CoRR (2020)
Keyphrases
- action recognition
- cross modal
- human actions
- recognizing human actions
- visual data
- action classification
- recognizing actions
- multi modal
- action primitives
- view invariant
- action detection
- bag of words
- recognition of human actions
- activity recognition
- multimedia retrieval
- image retrieval
- computer vision
- space time interest points
- human pose
- multimedia databases
- motion history images
- atomic actions
- spatio temporal
- multimedia data
- human activities
- d objects
- high level