STVGBert: A Visual-linguistic Transformer based Framework for Spatio-temporal Video Grounding.

Rui Su, Qian Yu, Dong Xu
Published in: ICCV (2021)