INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Large Language Models.

H. S. V. N. S. Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh K. Iyer, Balaji Krishnamurthy
Published in: CoRR (2023)
Keyphrases
  • language model
  • information retrieval
  • knowledge discovery
  • training data
  • document retrieval
  • language modeling
  • speech recognition
  • retrieval model