Using topic models for OCR correction.
Faisal Farooq, Anurag Bhardwaj, Venu Govindaraju
Published in: Int. J. Document Anal. Recognit. (2009)
Keyphrases
- topic models
- error correction
- optical character recognition
- topic modeling
- latent Dirichlet allocation
- latent variables
- latent topics
- probabilistic model
- text documents
- text mining
- generative model
- co-occurrence
- latent topic models
- Gibbs sampling
- probabilistic topic models
- text corpora
- probabilistic latent semantic analysis
- artificial intelligence
- machine learning
- variational inference
- microblog posts
- databases