Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining.
Ce Ge
Zhijian Ma
Daoyuan Chen
Yaliang Li
Bolin Ding
Published in:
CoRR (2024)
Keyphrases
language model
probabilistic model
n gram
retrieval model
document clustering
context sensitive
information retrieval
mixture model
document retrieval
language modeling