Developing a Multilingual Corpus of Wikipedia Biographies.
Hannah Devinney, Anton Eklund, Igor Ryazanov, Jingwen Cai
Published in: RANLP (2023)
Keyphrases
- Wikipedia articles
- digital libraries
- WordNet
- parallel corpus
- world knowledge
- named entity disambiguation
- knowledge base
- natural language text
- manually annotated
- wide coverage
- semi-automatically
- text corpus
- cross language information retrieval
- document corpus
- language independent
- test set
- Chinese-English
- entity extraction
- text categorization