The Challenges of German Archival Document Categorization on Insufficient Labeled Data.
Fabian Hoppe, Tabea Tietz, Danilo Dessì, Mirjam Sprau, Mehwish Alam, Harald Sack
Published in: WHiSe@ESWC (2020)
Keyphrases
- labeled data
- document categorization
- text classification
- unlabeled data
- text categorization
- semi-supervised learning
- semi-supervised
- active learning
- transfer learning
- text documents
- prior knowledge
- machine learning
- data points
- supervised learning
- training data
- naive bayes
- class labels
- bag of words
- text mining
- feature selection
- document classification
- meta learning
- learning algorithm
- k-nearest neighbor
- text data
- training examples
- unsupervised learning
- background knowledge
- data sets
- multi-label
- language model
- knn
- training set
- real world