N-Gram Feature Selection for Authorship Identification.
John Houvardas, Efstathios Stamatatos
Published in: AIMSA (2006)
Keyphrases
- n-gram
- feature selection
- text classification
- language model
- language independent
- variable length
- text categorization
- language modeling
- bag of words
- Viterbi algorithm
- classification accuracy
- support vector
- support vector machine
- language modelling
- part of speech
- word segmentation
- machine learning
- word level
- character n-grams
- feature set
- dynamic programming
- information retrieval