Old Content and Modern Tools - Searching Named Entities in a Finnish OCRed Historical Newspaper Collection 1771-1910.
Kimmo Kettunen, Eetu Mäkelä, Teemu Ruokolainen, Juha Kuokkala, Laura Löfberg
Published in: CoRR (2016)
Keyphrases
- named entities
- PDF files
- information extraction
- named entity recognition
- named entity extraction
- text mining
- natural language processing
- co-occurrence
- relation extraction
- question answering
- multimedia
- text documents
- unsupervised learning
- text corpus
- news corpus
- personal names
- document collections
- annotated corpus
- web news
- global context
- person names
- GENIA corpus
- data mining
- conditional random fields
- information retrieval systems
- digital libraries
- metadata