Using Part-of-Speech N-grams for Sensitive-Text Classification.
Graham McDonald, Craig Macdonald, Iadh Ounis
Published in: ICTIR (2015)
Keyphrases
- n gram
- part of speech
- text classification
- pos tagging
- bag of words
- language independent
- text documents
- text categorization
- feature selection
- labeled data
- text mining
- language modeling
- language model
- machine learning
- training corpus
- word segmentation
- knn
- semantic features
- unlabeled data
- parse tree
- text retrieval
- cross lingual
- out of vocabulary
- artificial intelligence