VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling.
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Lin
William Yang Wang
Lijuan Wang
Zicheng Liu
Published in:
CoRR (2021)
Keyphrases
end to end
scalable video
congestion control
real time
high bandwidth
multimedia
video sequences
admission control
wireless ad hoc networks
video content
multipath
video data
real world
ad hoc networks
video streams
video frames
visual features
visual information
quality of service
text localization and recognition