Évaluation intrinsèque et extrinsèque du nettoyage de pages Web (Intrinsic and extrinsic evaluation of Web page cleaning).
Gaël Lejeune, Romain Brixtel, Charlotte Lecluze
Published in: TALN (2015)
Keyphrases
- website
- web pages
- web users
- web documents
- web information
- search engine
- web applications
- web crawlers
- anchor text
- web graph
- page content
- hyperlink structure
- web objects
- web crawling
- internet archive
- semantic web
- dynamically generated
- log files
- web usage mining
- navigational behavior
- page contents
- web search
- information sources
- topic distillation
- content features
- dynamic content
- web mining
- link structure
- web data
- test collection
- link analysis
- web server
- web sources
- home page
- google search
- web content
- web technologies
- linked data