Efficient Large Scale Language Modeling with Mixtures of Experts.
Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giridharan Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeffrey Wang, Luke Zettlemoyer, Mona T. Diab, Zornitsa Kozareva, Veselin Stoyanov
Published in: EMNLP (2022)