Vārta: A Large-Scale Headline-Generation Dataset for Indic Languages.
Rahul Aralikatte, Ziling Cheng, Sumanth Doddapaneni, Jackie Chi Kit Cheung
Published in: CoRR (2023)
Keyphrases
- expressive power
- small scale
- real world
- million images
- real life
- web scale
- database
- databases
- multi lingual
- benchmark datasets
- syntactic and semantic dependencies
- object oriented languages
- language identification
- synthetic datasets
- information retrieval systems
- search space
- training data
- database systems
- genetic algorithm
- data sets