The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale.

Guilherme Penedo, Hynek Kydlícek, Loubna Ben Allal, Anton Lozhkov, Margaret Mitchell, Colin Raffel, Leandro von Werra, Thomas Wolf
Published in: CoRR (2024)
Keyphrases