Application of texture-based features for text non-text classification in printed document images with novel feature selection algorithm.
Soulib Ghosh, Khalid Hassan Sheikh, Hussain Ali Khan, Ankur Manna, Showmik Bhowmik, Ram Sarkar
Published in: Soft Comput. (2022)
Keyphrases
- feature selection algorithms
- document images
- feature selection
- text classification
- printed text
- scanned documents
- feature set
- irrelevant features
- selected features
- document processing
- document analysis
- scanned images
- text regions
- optical character recognition
- printed documents
- text lines
- page layout
- feature subset
- text categorization
- text documents
- data sets
- document image analysis
- mathematical formulas
- scanned document images
- n-gram
- text mining
- information retrieval
- decision trees
- neural network
- structural features
- semantic features
- learning models
- classification models
- handwritten documents
- image features
- feature space
- mathematical expressions
- learning algorithm
- data mining