Script independent detection of bold words in multi font-size documents.
Pedamalli Saikrishna, A. G. Ramakrishnan
Published in: NCVPRIPG (2013)
Keyphrases
- text documents
- Arabic documents
- Indian languages
- keywords
- index terms
- word frequencies
- word spotting
- document representation
- information retrieval
- related words
- document collections
- relevant documents
- text corpus
- multiword
- semantic relationships
- text corpora
- optical character recognition
- word frequency
- information retrieval systems
- XML documents
- natural language text
- text lines
- latent topics
- word pairs
- n-gram
- training documents
- text classification
- character recognition
- word sense disambiguation
- training corpus
- web documents
- word similarity
- metadata