The Notary in the Haystack - Countering Class Imbalance in Document Processing with CNNs.
Martin Leipert, Georg Vogeler, Mathias Seuret, Andreas K. Maier, Vincent Christlein
Published in: CoRR (2020)
Keyphrases
- document processing
- class imbalance
- digital libraries
- active learning
- class distribution
- information retrieval
- cost sensitive
- document images
- concept drift
- high dimensionality
- document clustering
- information extraction
- textual documents
- text processing
- feature selection
- document analysis
- minority class
- data mining
- high dimensional
- neural network
- non stationary
- machine learning
- text mining
- multimedia documents
- knowledge discovery