A Multi-level Alignment Training Scheme for Video-and-Language Grounding.
Yubo Zhang
Feiyang Niu
Qing Ping
Govind Thattai
Published in:
CoRR (2022)
Keyphrases
language learning
video data
real time
video sequences
natural language
video frames
multimedia
video streams
programming language
real time video
video content
pairwise
training set
space time
spatial and temporal
hidden markov models
multimedia data
video surveillance
video analysis
target language
training data