VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts.
Wenhui Wang, Hangbo Bao, Li Dong, Furu Wei
Published in: CoRR (2021)
Keyphrases
- programming language
- multi modal
- computer vision
- language learning
- mixture model
- natural language
- training process
- supervised learning
- real time
- training set
- vision system
- neural network
- training algorithm
- serious games
- human experts
- domain experts
- speech recognition
- test set
- gaussian mixture model
- expectation maximization
- maximum likelihood
- online learning
- data model
- image processing
- training phase
- specification language
- subject matter experts