Hierarchical Web Page Classification Based on a Topic Model and Neighboring Pages Integration
Wongkot Sriurai, Phayung Meesad, Choochart Haruechaiyasak
Published in: CoRR (2010)
Keyphrases
- topic models
- web page classification
- web pages
- Pitman-Yor process
- anchor text
- topic modeling
- web mining
- latent Dirichlet allocation
- text classification
- text mining
- text documents
- feature selection
- co-occurrence
- automatic classification
- probabilistic model
- generative model
- latent topics
- search engine
- web users
- web documents
- web search
- keywords
- web search engines
- text categorization
- data analysis
- machine learning
- knowledge discovery