An Information Theoretic Feature Selection Framework for Big Data under Apache Spark.
Sergio Ramírez-Gallego, Héctor Mouriño-Talín, David Martínez-Rego, Verónica Bolón-Canedo, José Manuel Benítez, Amparo Alonso-Betanzos, Francisco Herrera
Published in: CoRR (2016)
Keyphrases
- information theoretic
- big data
- mutual information
- theoretic framework
- feature selection
- information theory
- open source
- text categorization
- cloud computing
- similarity measure
- jensen shannon divergence
- data management
- user interface
- object oriented
- image registration
- probability distribution
- feature selection algorithms
- map reduce
- information bottleneck
- data analysis
- information theoretic measures
- distributional clustering
- big data analytics