LSHTC: A Benchmark for Large-Scale Text Classification.
Ioannis Partalas, Aris Kosmopoulos, Nicolas Baskiotis, Thierry Artières, George Paliouras, Éric Gaussier, Ion Androutsopoulos, Massih-Reza Amini, Patrick Gallinari
Published in: CoRR (2015)
Keyphrases
- text classification
- real world
- bag of words
- text mining
- text categorization
- document classification
- small scale
- feature selection
- n gram
- knn
- k nearest neighbor
- data cleaning
- text documents
- multi label
- labeled data
- machine learning
- neural network
- unlabeled data
- databases
- natural language processing
- probabilistic model
- knowledge base