Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers.
Rui Liu
Young Jin Kim
Alexandre Muzio
Barzan Mozafari
Hany Hassan Awadalla
Published in:
CoRR (2022)
Keyphrases
real time
data sets
neural network
information retrieval
communication networks
clustering algorithm
case study
prior information
communication systems
communication channels
data dependent
communication patterns