Alleviating Digitization Errors in Named Entity Recognition for Historical Documents.
Emanuela Boros, Ahmed Hamdi, Elvys Linhares Pontes, Luis Adrián Cabrera-Diego, Jose G. Moreno, Nicolas Sidere, Antoine Doucet
Published in: CoNLL (2020)
Keyphrases
- named entity recognition
- historical documents
- named entities
- information extraction
- natural language processing
- handwriting recognition
- maximum entropy
- semi-supervised
- text summarization
- conditional random fields
- document images
- relation extraction
- annotated corpus
- word recognition
- historical manuscripts
- maximum entropy classifier
- question answering
- generative model
- computer vision
- text mining
- feature space
- pattern recognition
- similarity measure
- machine learning