On the Benefits of Learning to Route in Mixture-of-Experts Models.
Nishanth Dikkala, Nikhil Ghosh, Raghu Meka, Rina Panigrahy, Nikhil Vyas, Xin Wang
Published in: EMNLP (2023)
Keyphrases
- learning algorithm
- learning process
- prior knowledge
- latent variable models
- neural nets
- statistical models
- knowledge acquisition
- accurate models
- structured prediction
- online learning
- unsupervised learning
- learning tasks
- complex systems
- learning community
- learning models
- bayesian framework
- experimental data
- statistical model
- neural network
- supervised learning
- hidden markov models
- active learning
- machine learning