Exploring linguistic features for web spam detection: a preliminary study.
Jakub Piskorski, Marcin Sydow, Dawid Weiss
Published in: AIRWeb (2008)
Keyphrases
- linguistic features
- web spam detection
- spam detection
- structural features
- named entities
- text classification
- semantic features
- web spam
- search engine
- part of speech
- named entity recognition
- linguistic knowledge
- news stories
- feature set
- sentence level
- natural language processing
- translation model
- text mining
- information extraction
- training data