A Parallel Implementation of Information Gain Using Hive in Conjunction with MapReduce for Continuous Features.
Sikha Bagui, Sharon K. John, John P. Baggs, Subhash C. Bagui
Published in: PAKDD (Workshops) (2018)
Keyphrases
- information gain
- parallel implementation
- feature selection
- text categorization
- decision trees
- chi square
- chi squared
- mutual information
- feature vectors
- parallel implementations
- correlation coefficient
- occurrence frequency
- naive bayes
- cloud computing
- co occurrence
- image features
- feature set
- parallel computers
- document frequency
- mapreduce framework
- feature space
- data sets