Compound Tokens: Channel Fusion for Vision-Language Representation Learning.

Maxwell Mbabilla Aladago A. J. Piergiovanni

Published in: CoRR (2022)

Keyphrases

language acquisition
learning algorithm
knowledge acquisition
learning process
online learning
learning systems
real time
vision system
representation language
dynamic bayesian networks
learning tasks
reinforcement learning
multiscale
computer vision
machine learning
active learning
natural language
learning problems
conceptual graphs
bayesian networks