TokenTLB: A Token-Based Page Classification Approach.
Albert Esteve, Alberto Ros, Antonio Robles, María Engracia Gómez, José Duato
Published in: ICS (2016)
Keyphrases
- automatic classification
- classification accuracy
- pattern recognition
- machine learning
- web pages
- website
- support vector machine (SVM)
- neural network
- decision trees
- feature vectors
- classification models
- classification method
- decision rules
- classification algorithm
- benchmark datasets
- training samples
- training set
- preprocessing
- benchmark data sets
- classification systems
- classification process
- cross validation
- unsupervised learning
- image classification
- hidden Markov models
- feature extraction
- social networks
- search engine