Provably learning a multi-head attention layer.
Sitan Chen
Yuanzhi Li
Published in:
CoRR (2024)
Keyphrases
Bayesian networks
learning algorithm
online learning
real time
learning systems
reinforcement learning
learning process
prior knowledge
machine learning
face recognition
knowledge acquisition
empirical studies
unsupervised learning
artificial intelligence
learning problems
learning scheme
inductive inference