Complex Word Identification: Challenges in Data Annotation and System Performance.
Marcos Zampieri, Shervin Malmasi, Gustavo Paetzold, Lucia Specia
Published in: NLP-TEA@IJCNLP (2017)
Keyphrases
- data sets
- complex data
- data analysis
- high quality
- raw data
- data sources
- end users
- synthetic data
- data processing
- high level
- statistical analysis
- data collection
- database
- data quality
- original data
- missing data
- computer systems
- probabilistic model
- keywords
- training data
- probability distribution
- sensor data
- lessons learned
- prior knowledge
- information retrieval