Multilayer classification of web pages using random forest and semi-supervised latent Dirichlet allocation.
Karim Sayadi, Quang Vu Bui, Marc Bui
Published in: I4CS (2015)
Keyphrases
- random forest
- latent Dirichlet allocation
- decision trees
- semi-supervised
- topic models
- feature set
- web pages
- support vector
- classification accuracy
- supervised learning
- generative model
- pattern recognition
- semi-supervised learning
- decision tree learning algorithms
- topic modeling
- text classification
- ensemble classifier
- unsupervised learning
- feature vectors
- multi-label
- image classification
- benchmark datasets
- machine learning methods
- maximum likelihood
- feature space
- ensemble methods
- text mining
- support vector machine (SVM)
- base classifiers
- pairwise
- unlabeled data
- co-occurrence
- data mining