GeniL: A Multilingual Dataset on Generalizing Language.
Aida Mostafazadeh Davani, Sagar Gubbi, Sunipa Dev, Shachi Dave, Vinodkumar Prabhakaran
Published in: CoRR (2024)
Keyphrases
- language resources
- language specific
- natural language
- database
- programming language
- benchmark datasets
- language processing
- language learning
- data sets
- parallel corpus
- modeling language
- cross language information retrieval
- digital libraries
- synthetic datasets
- training dataset
- language independent
- machine translation
- specification language
- machine translation system
- feature set
- indian languages
- training data
- text generation