Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining.

Ce Ge, Zhijian Ma, Daoyuan Chen, Yaliang Li, Bolin Ding
Published in: CoRR (2024)
Keyphrases
  • language model
  • probabilistic model
  • n-gram
  • retrieval model
  • document clustering
  • context sensitive
  • information retrieval
  • mixture model
  • document retrieval
  • language modeling