Sampling and Feature Selection in a Genetic Algorithm for Document Clustering.
Arantza Casillas, Mayte Teresa González de Lena, Raquel Martínez-Unanue
Published in: CICLing (2004)
Keyphrases
- document clustering
- genetic algorithm
- feature selection
- text mining
- text documents
- text categorization
- clustering algorithm
- vector space model
- document clusters
- document collections
- text classification
- machine learning
- document representation
- clustering method
- k-means
- tf-idf
- topic extraction
- neural network
- non-negative matrix factorization
- feature extraction
- feature space
- dimensionality reduction
- document classification
- tolerance rough set
- pairwise constraints
- digital libraries
- cluster analysis
- genetic algorithm ga
- ant colony optimization
- support vector machine
- image features
- ant based clustering
- model selection