Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.

Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George-Cristian Muraru, Albert Gu, Ruba Haroun, Leonard Berrada, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, Arnaud Doucet, David Budden, Yee Whye Teh, Razvan Pascanu, Nando de Freitas, Caglar Gulcehre
Published in: CoRR (2024)
Keyphrases