Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations.

Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben Allal, Leandro von Werra, Martin Jaggi
Published in: CoRR (2024)