Character decomposition to resolve class imbalance problem in Hangul OCR.
Geonuk Kim, Jaemin Son, Kanghyu Lee, Jaesik Min
Published in: CoRR (2022)
Keyphrases
- class imbalance
- optical character recognition
- class distribution
- active learning
- cost sensitive learning
- cost sensitive
- printed documents
- document images
- character recognition
- imbalanced datasets
- concept drift
- majority class
- software defect prediction
- sampling methods
- small disjuncts
- high dimensionality
- imbalanced data
- feature selection
- training set
- text lines
- test set
- training data
- pairwise