SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition.
Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matt M. Botvinick, Alexander Lerchner, Christopher P. Burgess
Published in: NeurIPS (2021)
Keyphrases
- object representations
- view invariant
- human actions
- complex objects
- action recognition
- spatio temporal
- object categorization
- video sequences
- real world objects
- multi view
- space time
- supervised learning
- object classes
- single view
- video data
- computer vision
- human motion
- multiscale
- motion capture data
- object representation
- object models
- visual features
- query language
- viewpoint
- object categories
- image classification
- active learning