Login / Signup
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity.
William Fedus
Barret Zoph
Noam Shazeer
Published in:
J. Mach. Learn. Res. (2022)
Keyphrases
</>
high dimensional
recommender systems
real time
case study
multiscale
prior knowledge
model selection
statistical model
parameter values
classification models