TokenLearner: Adaptive Space-Time Tokenization for Videos.

Michael S. Ryoo, A. J. Piergiovanni, Anurag Arnab, Mostafa Dehghani, Anelia Angelova

Published in: NeurIPS (2021)

Keyphrases

space time
video sequences
dynamic scenes
input video
video representation
human actions
spatial and temporal
spatio temporal
temporal domain
video frames
multiple view geometry
super resolution reconstruction
video content
dynamic textures
video data
video clips
video database
moving objects
image sequences
motion patterns
moving camera
key frames
frame rate
named entities
video camera
multimedia
machine learning