Improving Information Retrieval Performance on OCRed Text in the Absence of Clean Text Ground Truth.
Kripabandhu Ghosh, Anirban Chakraborty, Swapan Kumar Parui, Prasenjit Majumder
Published in: Inf. Process. Manag. (2016)
Keyphrases
- information retrieval
- ground truth
- text mining
- text documents
- text processing
- text retrieval
- computational linguistics
- high quality
- keywords
- natural language generation
- database
- web documents
- text collections
- automatically extracted
- free text
- semantic network
- information extraction
- linguistic analysis
- learning to rank
- string matching
- text information
- document indexing
- textual data
- vector space model
- text data
- relevance feedback
- search engine
- machine learning
- data sets