DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion.
Yilong Chen
Linhao Zhang
Junyuan Shang
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
Published in:
CoRR (2024)
Keyphrases
learning algorithm
learning process
adaptive learning
artificial neural networks
supervised learning
face recognition
fault diagnosis
case study
online learning
unsupervised learning
data fusion
learning problems
incremental learning
learning analytics
combining multiple