Misalignment Detection for Web-Scraped Corpora: A Supervised Regression Approach.
Arne Defauw, Sara Szoc, Anna Bardadym, Joris Brabers, Frederic Everaert, Roko Mijic, Kim Scholte, Tom Vanallemeersch, Koen Van Winckel, Joachim Van den Bogaert
Published in: Informatics (2019)
Keyphrases
- website
- regression model
- false positives
- automatic detection
- object detection
- semantic web
- web applications
- web mining
- learning algorithm
- web pages
- support vector regression
- detection accuracy
- anomaly detection
- detection algorithm
- web users
- training data
- regression analysis
- specific domains
- user generated content
- detection method
- supervised learning
- semi supervised
- web documents
- information sources
- false alarms
- least squares
- regression method
- regression algorithm
- support vector