Text binarization in color documents.
Efthimios Badekas, Nikos A. Nikolaou, Nikos Papamarkos
Published in: Int. J. Imaging Syst. Technol. (2006)
Keyphrases
- text documents
- document images
- document analysis
- information retrieval
- free text
- digital documents
- printed documents
- document processing
- web documents
- plagiarism detection
- text collections
- document content
- newspaper articles
- textual content
- keywords
- text retrieval
- text analysis
- textual data
- semantic content
- gray scale images
- text information
- text mining
- latent semantic analysis
- text data
- automatic categorization
- color images
- text clustering
- textual information
- printed text
- text corpus
- document collections
- document structure
- document level
- text content
- text extraction
- electronic documents
- key concepts
- document categorization
- multimedia documents
- natural language text
- text corpora
- document representation
- text classification
- sentence level
- document set
- handwritten text
- grayscale images
- document clustering
- text recognition
- information retrieval systems
- journal articles
- document retrieval
- color information
- text categorization
- xml documents
- language model
- character recognition
- metadata
- scientific literature
- semantic information
- handwritten documents
- topic models
- topic segmentation
- image preprocessing
- complex background
- news stories
- text regions
- digital libraries
- relevant documents
- bag of words
- color space
- information extraction
- natural language processing
- retrieval engine
- optical character recognition
- gray scale
- related documents
- gray level