UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining.
Hyung Won Chung
Noah Constant
Xavier Garcia
Adam Roberts
Yi Tay
Sharan Narang
Orhan Firat
Published in:
CoRR (2023)
Keyphrases
databases
language specific
monte carlo
programming language
language independent
language learning
data sets
real life
machine learning
digital libraries
probabilistic model
bayesian networks
description logics
n gram
case study
artificial intelligence
small scale
random sampling
information retrieval
multilingual documents