Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models.
Rui Qian
Yeqing Li
Zheng Xu
Ming-Hsuan Yang
Serge J. Belongie
Yin Cui
Published in:
CoRR (2022)
Keyphrases
language model
video classification
pre-trained
spoken term detection
n-gram
training data
speech recognition
information retrieval
probabilistic model
training examples
multimodal
computer vision
video clips
video content
video shots
image classification
broadcast news