Cleaneval: a Competition for Cleaning Web Pages.
Marco Baroni, Francis Chantree, Adam Kilgarriff, Serge Sharoff
Published in: LREC (2008)
Keyphrases
- web data
- web pages
- web mining
- data extraction
- web content
- website
- deep web
- search engine
- web documents
- web information extraction
- link structure
- web search
- international competition
- web server
- web search engines
- web users
- web page classification
- web communities
- helping users
- neural network
- hierarchical structure
- data objects
- keywords
- web browser
- feature selection
- social networks
- information retrieval
- data cleaning
- web data extraction
- data mining