Text/Non-Text Classification of Connected Components in Document Images.
Frank D. Julca-Aguilar, Ana L. L. M. Maia, Nina S. T. Hirata
Published in: SIBGRAPI (2017)
Keyphrases
- document images
- connected components
- text classification
- text lines
- text regions
- document analysis
- printed documents
- binary images
- text detection
- document processing
- text mining
- text documents
- page layout
- scanned document images
- scanned documents
- printed text
- handwriting recognition
- feature selection
- connected component analysis
- text categorization
- historical documents
- language identification
- optical character recognition
- mathematical formulas
- document image analysis
- line extraction
- n-gram
- level set
- document image retrieval
- handwritten documents
- scanned images
- machine learning
- indian languages
- computer vision
- text retrieval
- gray scale
- word spotting
- image processing
- image analysis
- image classification
- semantic features
- information retrieval
- mathematical morphology