Ngram and Bayesian Classification of Documents for Topic and Authorship.
Ross Clement, David Sharp
Published in: Lit. Linguistic Comput. (2003)
Keyphrases
- bayesian classification
- n-gram
- document set
- web documents
- writing style
- language model
- document collections
- information retrieval
- document clustering
- language independent
- naive bayes
- text classification
- citation networks
- text documents
- relevant documents
- keywords
- bayesian classifiers
- multi-label
- news articles
- semi-supervised
- feature selection