Unsupervised authorship attribution using feature selection and weighted cosine similarity.
Carolina Martín del Campo Rodríguez, Grigori Sidorov, Ildar Z. Batyrshin
Published in: J. Intell. Fuzzy Syst. (2022)
Keyphrases
- cosine similarity
- authorship attribution
- feature selection
- unsupervised learning
- similarity measure
- document clustering
- tf-idf
- similarity function
- vector space
- distance measure
- text categorization
- vector space model
- source code
- k-means
- euclidean distance
- machine learning
- plagiarism detection
- semantic similarity
- digital forensics
- writing style
- text classification
- dimensionality reduction
- classification accuracy
- feature space
- feature extraction
- feature set
- supervised learning
- support vector machine
- training set
- knowledge base
- support vector