Implicit Bias and Fast Convergence Rates for Self-attention.
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
Published in:
CoRR (2024)
Keyphrases
convergence rate
step size
learning rate
convergence speed
primal dual
global convergence
conjugate gradient
numerical stability
mutation operator
stopping criterion
particle swarm optimization