L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages.
Aishwarya Mirashi, Srushti Sonavane, Purva Lingayat, Tejas Padhiyar, Raviraj Joshi
Published in: CoRR (2024)
Keyphrases
- document classification
- short text
- short texts
- topic detection
- text classification
- text documents
- text data
- text mining
- short text classification
- text categorization
- classification algorithm
- news articles
- web documents
- document clustering
- data sets
- data mining
- latent topics
- sentiment classification
- keywords
- cross lingual
- topic models
- labeled data
- expectation maximization
- prior knowledge
- digital libraries
- search engine
- neural network