Text Classification and Document Layout Analysis of Paper Fragments.
Markus Diem, Florian Kleber, Robert Sablatnig
Published in: ICDAR (2011)
Keyphrases
- text classification
- text documents
- document classification
- text classifiers
- term frequency
- topic discovery
- training documents
- text categorization
- automatic text classification
- document categorization
- bag of words
- text mining
- feature selection
- document images
- information retrieval
- training corpus
- knn
- labeled data
- n gram
- tf idf
- text data
- naive bayes
- document clustering
- machine learning
- information retrieval systems
- document collections
- document representation
- semantic features
- data mining
- classify documents
- data cleaning
- databases
- search engine
- keywords
- multi label
- text collections
- vector space model
- language modeling
- sentiment analysis
- document retrieval
- retrieval systems