TokenLearner: Adaptive Space-Time Tokenization for Videos.
Michael S. Ryoo, A. J. Piergiovanni, Anurag Arnab, Mostafa Dehghani, Anelia Angelova
Published in: NeurIPS (2021)
Keyphrases
- space time
- video sequences
- dynamic scenes
- input video
- video representation
- human actions
- spatial and temporal
- spatio temporal
- temporal domain
- video frames
- multiple view geometry
- super resolution reconstruction
- video content
- dynamic textures
- video data
- video clips
- video database
- moving objects
- image sequences
- motion patterns
- moving camera
- key frames
- frame rate
- named entities
- video camera
- multimedia
- machine learning