Understanding the Effectiveness of Early Weight Averaging for Training Large Language Models.

Sunny Sanyal, Jean Kaddour, Abhishek Kumar, Sujay Sanghavi
Published in: CoRR (2023)
Keyphrases