Deconfounded Multimodal Learning for Spatio-temporal Video Grounding.

Jiawei Wang, Zhanchang Ma, Da Cao, Yuquan Le, Junbin Xiao, Tat-Seng Chua

Published in: ACM Multimedia (2023)

Keyphrases

spatio-temporal
learning algorithm
learning process
learning systems
spatial and temporal
reinforcement learning
learning community
learning problems
video data
supervised learning
prior knowledge
video sequences
visual features
multimedia
temporal information
learning tasks
computer vision
video content
neural network