Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing.
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Zhi-Qin John Xu
Published in:
CoRR (2024)
Keyphrases
probabilistic inference
neural network
Bayesian networks
efficient learning
inference mechanism
databases
information retrieval
image processing
k-means
belief networks
probabilistic reasoning
Bayesian model
inference process