Extraction of line-word-character segments directly from run-length compressed printed text-documents.
Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri
Published in: NCVPRIPG (2013)
Keyphrases
- text documents
- run length
- run length encoding
- information extraction
- text corpus
- extraction patterns
- keywords
- text mining
- gray level
- text classification
- optical character recognition
- text categorization
- wordnet
- chain code
- co occurrence
- topic models
- n gram
- text lines
- named entities
- bag of words
- natural language text
- texture information
- data structure
- information retrieval
- artificial intelligence
- compression rate
- relation extraction
- object recognition
- similarity measure
- data mining
- k nearest neighbor
- data analysis
- image processing
- knowledge base
- knowledge discovery
- natural language processing