Text and Non-text Separation in Scanned Color-Official Documents.
Amit Vijay Nandedkar, Jayanta Mukherjee, Shamik Sural
Published in: ICVGIP Workshops (2016)
Keyphrases
- information retrieval
- text documents
- free text
- keywords
- text retrieval
- text information
- document analysis
- text analysis
- text collections
- digital documents
- textual data
- text mining
- newspaper articles
- web documents
- textual information
- text content
- key concepts
- document categorization
- textual content
- text data
- scientific documents
- document processing
- multimedia documents
- plagiarism detection
- automatic categorization
- scanned documents
- text graphics
- text clustering
- natural language text
- multiword
- semantic content
- document content
- latent semantic analysis
- document images
- semantic information
- document collections
- text categorization
- information retrieval systems