UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation.
Huaishao Luo
Lei Ji
Botian Shi
Haoyang Huang
Nan Duan
Tianrui Li
Xilin Chen
Ming Zhou
Published in:
CoRR (2020)
Keyphrases
probabilistic model
unified model
high level
mathematical model
video data
computational model
statistical model
theoretical framework
probability distribution
space time
experimental data
video frames
language learning
neural network model
context dependent