Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language.
Philipp Scharpf, Moritz Schubotz, Abdou Youssef, Felix Hamborg, Norman Meuschke, Bela Gipp
Published in: CoRR (2020)
Keyphrases
- document clustering
- document classification
- unsupervised classification
- text clustering
- pattern recognition
- automatic categorization
- supervised classification
- unsupervised learning
- information retrieval
- document collections
- pre-classified
- clustering method
- clustering algorithm
- feature extraction
- classification accuracy
- supervised learning
- machine learning
- unsupervised clustering
- classification algorithm
- text mining
- support vector machine
- text documents
- high dimensionality
- support vector
- journal articles
- automatic classification
- keywords
- xml documents
- indian languages
- decision trees
- scientific literature
- document categorization
- clustering analysis
- multilingual documents
- programming language
- image classification
- vector space model
- feature space
- text classification
- feature vectors