CHisIEC: An Information Extraction Corpus for Ancient Chinese History.
Xuemei Tang, Zekun Deng, Qi Su, Hao Yang, Jun Wang
Published in: CoRR (2024)
Keyphrases
- information extraction
- open domain
- event extraction
- text summarization
- information extraction systems
- monolingual
- annotated corpus
- manually annotated
- named entity recognition
- relation extraction
- natural language text
- linguistic patterns
- named entities
- natural language processing
- text mining
- precision and recall
- unknown words
- text documents
- question answering
- structured data
- entity extraction
- free text
- machine learning
- web corpora
- treebank
- web documents
- textual data
- web mining
- Chinese-English
- text data
- information retrieval
- machine translation
- semi-structured
- natural language
- extraction patterns
- WordNet
- test set
- conditional random fields
- word sense disambiguation
- cultural heritage
- relational learning
- word sense
- statistical machine translation
- linguistic features
- domain specific
- sentence level
- text processing