DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention.
Fenglin Liu, Xian Wu, Shen Ge, Xuancheng Ren, Wei Fan, Xu Sun, Yuexian Zou
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- learning algorithm
- learning process
- learning systems
- language learning
- higher level
- object oriented programming
- learning scheme
- learning analytics
- active learning
- prior knowledge
- multimedia
- collaborative learning
- knowledge acquisition
- real time
- supervised learning
- mobile learning
- image processing
- feature representations
- multiple representations
- automatically discovering
- structured representations
- external representations