IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages.
Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra
Published in: CoRR (2024)