VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset.
Sihan Chen
Handong Li
Qunbo Wang
Zijia Zhao
Mingzhen Sun
Xinxin Zhu
Jing Liu
Published in: CoRR (2023)
Keyphrases
mathematical model
high level
probability distribution
formal model
theoretical foundation
multi-modal
management system
machine learning
database
visual information
information retrieval
prior knowledge
probabilistic model
feature selection
signal processing
computer vision
search engine
statistical model