It Takes Two: Masked Appearance-Motion Modeling for Self-supervised Video Transformer Pre-training.

Published in: CoRR (2022)
