Hierarchical Transformers Are More Efficient Language Models.

Piotr Nawrot, Szymon Tworkowski, Michal Tyrolski, Lukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski
Published in: NAACL-HLT (Findings) (2022)
Keyphrases