Extraction of Line Word Character Segments Directly from Run Length Compressed Printed Text Documents.
Mohammed Javed, P. Nagabhushan, B. B. Chaudhuri
Published in: CoRR (2014)
Keyphrases
- text documents
- run length
- run length encoding
- information extraction
- text corpus
- extraction patterns
- keywords
- text mining
- text classification
- gray level
- optical character recognition
- wordnet
- text categorization
- topic models
- text lines
- chain code
- co occurrence
- named entities
- n gram
- natural language text
- bag of words
- texture information
- sample size
- data hiding
- machine learning
- artificial intelligence
- document images
- natural language processing
- data structure
- multiscale
- information retrieval