Improving OCR of historical newspapers and journals published in Finland.
Senka Drobac, Pekka Kauppinen, Krister Lindén
Published in: DATeCH (2019)
Keyphrases
- papers published
- post processing
- optical character recognition
- historical data
- web pages
- neural network
- document images
- digital libraries
- character recognition
- artificial intelligence
- document processing
- preprocessing
- image analysis
- hidden markov models
- decision trees
- search engine
- news articles
- real world
- printed documents
- historical documents
- data sets