IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages.

Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra
Published in: CoRR (2024)
Keyphrases
  • fine tuning
  • fine tuned
  • indian languages
  • viable alternative
  • fine tune
  • language identification
  • document images
  • cross lingual
  • training set