Fast Preprocessing by Suffix Arrays for Managing Byte n-grams to Detect Malware Subspecies by Machine Learning.
Kouhei Kita, Ryuya Uda
Published in: J. Inf. Process. (2024)
Keyphrases
- n-gram
- machine learning
- suffix array
- text classification
- string matching
- language model
- variable length
- data structure
- suffix tree
- data compression
- space efficient
- text mining
- knowledge representation
- active learning
- feature selection
- character n-grams
- neural network
- knowledge discovery
- query expansion
- natural language processing
- data mining