Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.

Isaac Caswell, Julia Kreutzer, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Rubungo Andre Niyongabo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi
Published in: CoRR (2021)
Keyphrases