Document Classification and Page Stream Segmentation for Digital Mailroom Applications.
Albert Gordo, Marçal Rusiñol, Dimosthenis Karatzas, Andrew D. Bagdanov
Published in: ICDAR (2013)
Keyphrases
- document classification
- text classification
- text mining
- web documents
- classification algorithm
- text documents
- text categorization
- web document classification
- web pages
- multiscale
- image segmentation
- keywords
- data sets
- topic extraction
- linear classification
- automatic document classification
- bag of words
- knowledge acquisition
- high dimensional
- training set
- image processing
- computer vision
- neural network