A Corpus for Automatic Readability Assessment and Text Simplification of German.
Alessia Battisti, Dominik Pfütze, Andreas Säuberli, Marek Kostrzewa, Sarah Ebling
Published in: LREC (2020)
Keyphrases
- broad coverage
- automatic text
- supervised machine learning
- open domain
- plain text
- text data
- information retrieval
- named entity disambiguation
- text corpus
- text corpora
- semi-automatically
- lexical features
- world knowledge
- anaphora resolution
- database
- multiword
- recognizing textual entailment
- text retrieval
- hand-crafted
- text mining
- topic segmentation
- multiresolution
- English words
- scientific papers
- keywords
- writing style
- temporal expressions
- textual features
- newspaper articles
- semi-automatic
- training corpus
- natural language text
- information extraction systems
- search engine
- text collections