INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Large Language Models.
H. S. V. N. S. Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh K. Iyer, Balaji Krishnamurthy
Published in: CoRR (2023)