Transformers generalize differently from information stored in context vs in weights.

Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, Dharshan Kumaran, Andrew K. Lampinen, Felix Hill
Published in: CoRR (2022)
Keyphrases