ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention.
Yang Liu
Jiaxiang Liu
Li Chen
Yuxiang Lu
Shikun Feng
Zhida Feng
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
Published in:
CoRR (2022)
Keyphrases
learning tasks
multi task
sparse learning
bayesian networks
multi modal
multi modality