Segmentation of Text-Lines and Words from JPEG Compressed Printed Text Documents Using DCT Coefficients.
Bulla Rajesh, Mohammed Javed, P. Nagabhushan, Watanabe Osamu
Published in: DCC (2020)
Keyphrases
- text documents
- DCT coefficients
- text lines
- compressed domain
- optical character recognition
- text mining
- document images
- discrete cosine transform
- text classification
- information extraction
- text categorization
- keywords
- topic models
- document clustering
- bag of words
- WordNet
- named entities
- connected components
- word segmentation
- transform domain
- level set
- image segmentation
- complex background
- motion vectors
- macroblock
- feature selection
- video analysis
- character recognition
- video data
- image compression
- image classification
- natural language processing
- multiscale
- search engine
- binary images
- image processing
- handwritten documents