Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective.
Emmanuelle Salin, Badreddine Farah, Stéphane Ayache, Benoît Favre
Published in: AAAI (2022)
Keyphrases
- learning algorithm
- real time
- learning process
- data sets
- object oriented programming
- unsupervised learning
- multi modal
- online learning
- vision system
- programming language
- supervised learning
- computer vision
- prior knowledge
- multimedia
- human computer interaction
- learning systems
- learning tasks
- language learning
- neural network
- learning mechanism
- language acquisition
- multiple representations