Keyphrases
- page segmentation
- document images
- deep learning
- language identification
- natural language
- Indian languages
- unsupervised learning
- comparative evaluation
- document image analysis
- optical character recognition
- machine learning
- weakly supervised
- evaluation methods
- storage and retrieval
- document analysis
- text lines
- machine translation
- Chinese characters
- supervised learning
- semantic similarity
- mental models
- semantic information
- multiscale
- page layout
- word segmentation
- semantic features
- object categories
- low-level features
- information extraction
- feature selection