A Comprehensive Study of Techniques for URL-Based Web Page Language Classification.
Eda Baykan, Monika Henzinger, Ingmar Weber
Published in: ACM Trans. Web (2013)
Keyphrases
- web pages
- website
- web page classification
- classification accuracy
- classification method
- image classification
- pattern recognition
- automatic classification
- feature extraction
- keywords
- language learning
- model selection
- text classification
- classification rules
- pattern classification
- support vector machine (SVM)
- search engine
- natural language
- feature selection
- decision rules
- classification process
- machine learning methods
- decision trees
- face recognition
- page segmentation
- data sets
- spam detection
- anchor text
- classification scheme
- classification models
- programming language
- support vector machine
- classification algorithm
- web documents
- natural language processing
- web search