Content Based Spam Text Classification: An Empirical Comparison between English and Chinese.
Liumei Zhang, Jianfeng Ma, Yichuan Wang
Published in: INCoS (2013)
Keyphrases
- text classification
- spam filtering
- cross lingual
- mono lingual
- chinese language
- word segmentation
- english text
- dependency parser
- foreign language
- english chinese
- text categorization
- event extraction
- bag of words
- training corpus
- spam filters
- anti spam
- language learning
- chinese text
- english language
- image retrieval
- unknown words
- chinese english
- chinese characters
- spam detection
- chinese web
- text mining
- dependency parsing
- feature selection
- text documents
- cross language
- sentiment analysis
- machine learning
- native speakers
- natural language
- labeled data
- machine translation
- language independent
- n gram
- word level
- query translation
- data cleaning
- semantic features
- text classifiers
- cross language information retrieval
- multi label
- email spam
- answer questions
- sina weibo
- english words
- machine translation system
- feature set
- language modeling
- computer assisted language learning