An Exploratory Study on the Impact of Temporal Features on the Classification and Clustering of Future-Related Web Documents.
Ricardo Campos, Gaël Dias, Alípio Jorge
Published in: EPIA (2011)
Keyphrases
- web documents
- document classification
- content similarity
- web pages
- semi-structured
- information extraction
- web search engines
- clustering algorithm
- text classification
- clustering method
- classification algorithm
- keywords
- automatic classification
- focused crawling
- html documents
- link structure
- related web pages
- vector space model
- k-means
- feature selection
- document representation
- document clustering
- domain specific
- feature space
- machine learning