BEVT: BERT Pretraining of Video Transformers.

Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan

Published in: CoRR (2021)

Keyphrases

video data
video content
video sequences
video analysis
multimedia
real time video
space time
video processing
audio video
video frames
multimedia data
video segmentation
video clips
video images
real time
event detection
image processing
video streams
video surveillance
image quality
multiresolution
case study
learning algorithm