Oasis: Data Curation and Assessment System for Pretraining of Large Language Models.

Tong Zhou, Yubo Chen, Pengfei Cao, Kang Liu, Jun Zhao, Shengping Liu
Published in: CoRR (2023)
Keyphrases
  • language model
  • language modeling
  • probabilistic model
  • information retrieval
  • context sensitive
  • machine learning
  • error rate
  • n-gram