MaskViT: Masked Visual Pre-Training for Video Prediction.
Agrim Gupta, Stephen Tian, Yunzhi Zhang, Jiajun Wu, Roberto Martín-Martín, Li Fei-Fei
Published in: ICLR (2023)
Keyphrases
- visual cues
- training set
- visual analysis
- video streams
- prediction accuracy
- visual data
- prediction algorithm
- video sequences
- real time
- video search
- video content
- content based video retrieval
- prediction model
- visual information
- supervised learning
- video frames
- visual features
- visual perception
- digital video
- neural network
- video data
- computer vision
- scalable video coding
- multi layer perceptron
- low level
- training process
- video analysis
- prediction error
- video clips
- keywords
- visual concepts
- training data
- multimedia
- radial basis function network
- online learning
- temporal information