Efficient Video Representation Learning via Masked Video Modeling with Motion-centric Token Selection.

Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang

Published in: CoRR (2022)

Keyphrases

video representation
space time
spatio temporal
dynamic textures
video content
video streams
video analysis
key frames
video frames
video database
video data
video sequences
video summarization
spatial and temporal
generative model
prior knowledge
moving objects
image sequences
motion analysis
motion patterns
motion estimation
moving camera
optical flow
video synopsis