The Electronic Corpus of 17th- and 18th-century Polish Texts.
Wlodzimierz Gruszczynski, Dorota Adamiec, Renata Bronikowska, Witold Kieras, Emanuel Modrzejewski, Aleksandra Wieczorek, Marcin Wolinski
Published in: Lang. Resour. Evaluation (2022)
Keyphrases
- natural language text
- training corpus
- world knowledge
- newspaper articles
- english words
- information extraction systems
- electronic documents
- linguistic information
- linguistic patterns
- word sense
- text corpus
- manually annotated
- st century
- scientific papers
- legal texts
- textual features
- open domain
- information extraction
- news corpus
- spanish language
- supervised machine learning
- linguistic features
- automatic extraction
- text documents
- test set
- natural language
- syntactic structures
- multiword
- sentence level
- design automation
- writing style
- natural language generation
- knowledge base