OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog.
Adnen Abdessaied
Manuel von Hochmeister
Andreas Bulling
Published in:
LREC/COLING (2024)
Keyphrases
multi modal
semantic concepts
video search
multi modality
video sequences
high dimensional
multimedia
particle filter
video data
audio visual
uni modal
video analysis
image annotation
video content
low dimensional
image retrieval
video retrieval
multimedia data
cross modal
state space