
Self-Influence Guided Data Reweighting for Language Model Pre-training.

Megh Thakkar, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar
Published in: CoRR (2023)
Keyphrases
  • language model
  • n-gram
  • language modeling
  • information retrieval
  • training data
  • training set
  • probability distribution
  • query expansion
  • mixture model
  • test collection
  • uncertain data
  • ad hoc information retrieval