An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish.
Quan Duong, Mika Hämäläinen, Simon Hengchen
Published in: CoRR (2020)
Keyphrases
- experimental evaluation
- clustering method
- high precision
- unsupervised learning
- preprocessing
- pairwise
- computational cost
- high accuracy
- significant improvement
- cost function
- objective function
- synthetic data
- prior knowledge
- hidden Markov models
- computational complexity
- post processing
- support vector machine (SVM)
- feature selection
- detection method
- neural network