A New Centroid-based Approach for Genre Categorization of Web Pages.
Chaker Jebari
Published in: J. Lang. Technol. Comput. Linguistics (2009)
Keyphrases
- web pages
- web documents
- genre classification
- web page classification
- search engine
- website
- text categorization
- web server
- web search
- web search engines
- web content mining
- link analysis
- web content
- k means
- keywords
- textual content
- machine learning
- data sets
- hierarchical structure
- feature selection
- automatic classification
- web logs
- image categorization
- plain text
- data mining