Tools and Methodologies for Annotating Syntax and Named Entities in the National Corpus of Polish.
Jakub Waszczuk, Katarzyna Glowinska, Agata Savary, Adam Przepiórkowski
Published in: IMCSIT (2010)
Keyphrases
- named entities
- news corpus
- annotated corpus
- person names
- co-occurrence
- linguistic features
- named entity extraction
- text corpus
- named entity recognition
- GENIA corpus
- information extraction
- noun phrases
- text mining
- relation extraction
- named entity disambiguation
- question answering
- natural language processing
- personal names
- news articles
- contextual features
- metadata
- unsupervised learning
- data mining
- semantic classes
- automatic summarization
- natural language
- text documents
- global context
- information retrieval
- machine learning
- knowledge discovery
- automatic annotation
- databases