A novel Apache Spark-based 14-dimensional scalable feature extraction approach for the clustering of genomics data.
Rajesh Dwivedi, Aruna Tiwari, Neha Bharill, Milind B. Ratnaparkhe, Parul Mogre, Pranjal Gadge, Kethavath Jagadeesh
Published in: J. Supercomput. (2024)
Keyphrases
- feature extraction
- data sets
- data structure
- database
- high quality
- high dimensional data
- data points
- image data
- multidimensional data
- categorical data
- data objects
- original data
- data processing
- data mining techniques
- open source
- data analysis
- recent advances
- image processing
- training data
- raw data
- data acquisition
- data collection
- probability distribution
- xml documents
- high dimensional
- principal component analysis
- missing data
- data sources
- data management
- multi dimensional
- dimensionality reduction
- website
- biological data