VLCAP: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning.
Kashu Yamazaki, Sang Truong, Viet-Khoa Vo-Ho, Michael Kidd, Chase Rainwater, Khoa Luu, Ngan Le
Published in: ICIP (2022)
Keyphrases
- learning algorithm
- real time
- learning systems
- learning process
- image processing
- multimedia
- inductive inference
- online learning
- learning tasks
- language acquisition
- learning analytics
- video surveillance
- language learning
- mobile learning
- background knowledge
- active learning
- prior knowledge
- reinforcement learning
- Bayesian networks
- neural network