An effective approach for semantic-based clustering and topic-based ranking of web documents.
Rajendra Kumar Roul
Published in: Int. J. Data Sci. Anal. (2018)
Keyphrases
- web documents
- content similarity
- topic specific
- focused crawling
- clustering algorithm
- information extraction
- semi-structured
- web pages
- document classification
- web search engines
- web data
- clustering method
- html documents
- document representation
- ranking algorithm
- social annotations
- vector space model
- k-means
- link analysis
- textual information
- semantic association
- web content
- keywords
- document clustering
- co-occurrence
- link structure
- structured documents
- data mining
- related web pages
- returned by a search engine
- anchor text
- learning to rank
- text documents
- topic models
- information retrieval systems
- natural language processing