Building Multilingual Corpora for a Complex Named Entity Recognition and Classification Hierarchy using Wikipedia and DBpedia.
Diego Fernando Válio Antunes Alves, Gaurish Thakkar, Gabriel Amaral, Tin Kuculo, Marko Tadic
Published in: Qurator (2021)
Keyphrases
- named entity recognition
- natural language processing
- information extraction
- annotated corpus
- named entities
- classifier ensemble
- pattern recognition
- machine learning
- decision trees
- maximum entropy
- conditional random fields
- relation extraction
- classification accuracy
- semi-supervised
- text summarization
- natural language
- sequence labeling
- artificial intelligence
- text classification
- databases
- feature selection
- classification algorithm
- model selection
- co-occurrence
- supervised learning
- similarity measure
- information retrieval
- active learning
- Bayesian networks