Topic based language models for OCR correction.
Anurag Bhardwaj, Faisal Farooq, Huaigu Cao, Venu Govindaraju
Published in: AND (2008)
Keyphrases
- language model
- language modeling
- error correction
- n gram
- document level
- probabilistic model
- document retrieval
- language modelling
- optical character recognition
- retrieval model
- expert finding
- speech recognition
- test collection
- query expansion
- information retrieval
- document images
- context sensitive
- statistical language models
- smoothing methods
- pseudo relevance feedback
- ad hoc information retrieval
- document ranking
- topic models
- character recognition
- relevance model
- translation model
- language models for information retrieval
- spoken term detection
- term dependencies
- language independent
- latent dirichlet allocation
- news articles
- query terms
- search engine