Feature selection in high-dimensional dataset using MapReduce.
Claudio Reggiani, Yann-Aël Le Borgne, Gianluca Bontempi
Published in: CoRR (2017)
Keyphrases
- feature selection
- high dimensional
- high dimensionality
- feature set
- dimensionality reduction
- gene expression data
- feature space
- high dimensional datasets
- microarray data
- small sample
- dimension reduction
- mutual information
- nearest neighbor
- low dimensional
- classification accuracy
- microarray datasets
- text classification
- model selection
- text categorization
- high dimensional data
- similarity search
- benchmark datasets
- support vector
- sparse data
- parallel processing
- information gain
- variable selection
- noisy data
- cloud computing
- multi dimensional
- data points
- distributed processing
- high dimension
- irrelevant features
- database
- high performance data mining
- distributed computing
- feature selection algorithms
- multi task
- input space
- feature subset
- metric space
- feature extraction
- data sets