Application of Localized Similarity for Web Documents.
Peter Rebersek, Mateja Verlic
Published in: EMNLP (2013)
Keyphrases
- web documents
- web pages
- web search engines
- content similarity
- distance function
- similarity measure
- semi-structured
- information extraction
- document classification
- focused crawling
- machine learning
- natural language processing
- keywords
- database
- semantic similarity
- vector space model
- textual information
- link structure
- html documents