Gated Linear Attention Transformers with Hardware-Efficient Training

Songlin Yang, Bailin Wang, Yikang Shen, Rameswar Panda, Yoon Kim
Published in: CoRR (2023)