Correlation Analysis of Text Author Identification Results Based on N-Grams Frequency Distribution in Ukrainian Scientific and Technical Articles.
Victoria Vysotska, Oksana Markiv, Sofiia Teslia, Yeva Romanova, Inesa Pihulechko
Published in: COLINS (2022)
Keyphrases
- n-gram
- correlation analysis
- frequency distribution
- author identification
- language independent
- text collections
- character n-grams
- language model
- regression analysis
- text classification
- web documents
- correlation coefficient
- bag of words
- information retrieval
- text documents
- factor analysis
- text retrieval
- text mining
- cluster analysis
- textual data
- keywords
- databases
- news articles
- machine translation
- text categorization
- data analysis
- clustering algorithm
- highly skewed