Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer.

Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, Dacheng Tao
Published in: CoRR (2023)
Keyphrases