Segmentation and Classification of Pages for Digitized Documents of the Public Prosecutor's Office.
Kevin Rivera, Diana Quintanilla, Ángel Espezua
Published in: SIMBig (2022)
Keyphrases
- document classification
- web documents
- pre classified
- making decisions
- document retrieval
- automatic classification
- decision trees
- information retrieval
- text classification
- image classification
- feature vectors
- multiscale
- keywords
- automatic categorization
- pixel classification
- feature extraction
- segmentation algorithm
- document collections
- classification algorithm
- machine learning
- search engine
- textual content
- website
- xml documents
- classification accuracy
- metadata
- document clustering
- web pages
- document representation
- feature selection
- link structure
- text classifiers
- web information
- image segmentation
- medical images
- support vector machine
- word spotting
- topic segmentation
- text documents
- page layout
- class labels