Focus and Align: Learning Tube Tokens for Video-Language Pre-Training.
Yongqing Zhu, Xiangyang Li, Mao Zheng, Jiahao Yang, Zihan Wang, Xiaoqian Guo, Zifeng Chai, Yuchen Yuan, Shuqiang Jiang
Published in: IEEE Trans. Multim. (2023)
Keyphrases
- multimedia
- learning process
- real time
- natural language
- training process
- language learning
- learning algorithm
- language acquisition
- learning systems
- online learning
- unsupervised learning
- background knowledge
- programming language
- serious games
- supervised learning
- object oriented programming
- feedforward neural networks
- structured prediction
- training set
- learning stage
- online training