Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT.

James Lee-Thorp, Joshua Ainslie
Published in: CoRR (2022)
Keyphrases
  • artificial intelligence
  • high dimensional
  • lightweight
  • machine learning
  • computer vision
  • bayesian networks
  • expert systems
  • cost effective
  • computationally expensive
  • combining multiple