Implicit Regularization of Gradient Flow on One-Layer Softmax Attention.

Heejune Sheen, Siyu Chen, Tianhao Wang, Harrison H. Zhou
Published in: CoRR (2024)