Detection of text quality flaws as a one-class classification problem.
Maik Anderka, Benno Stein, Nedim Lipka
Published in: CIKM (2011)
Keyphrases
- high quality
- false positives
- object detection
- detection accuracy
- text detection
- novelty detection
- text documents
- detection method
- automatically extracted
- automatic detection
- detection algorithm
- database
- supervised machine learning
- quality measures
- multimedia
- information retrieval
- complex background
- sentence level
- document analysis
- text recognition
- textual data
- higher quality
- false alarms
- data quality
- detection rate
- web documents
- anomaly detection