MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training.

Cheng Luo, Jiawei Zhao, Zhuoming Chen, Beidi Chen, Anima Anandkumar
Published in: CoRR (2024)
Keyphrases