Document Layout Analysis and Classification and Its Application in OCR.
Gaurav Gupta, Shobhit Niranjan, Ankit Shrivastava, R. Mahesh K. Sinha
Published in: EDOC Workshops (2006)
Keyphrases
- document images
- document classification
- decision trees
- document processing
- support vector
- preprocessing
- classification algorithm
- image classification
- classification accuracy
- classification method
- benchmark datasets
- post processing
- text classification
- feature extraction
- automatic classification
- training set
- pattern recognition
- model selection
- page segmentation
- optical character recognition
- structured documents
- printed documents
- automatic categorization
- data sets
- document image retrieval
- language model
- information retrieval systems
- feature set
- multi class
- support vector machine
- keywords
- information retrieval
- machine learning
- neural network