Feature selection on a dataset of protein families: from exploratory data analysis to statistical variable importance.
Eugenio Del Prete, Serena Dotolo, Anna Marabotti, Angelo M. Facchiano
Published in: PeerJ Prepr. (2016)
Keyphrases
- exploratory data analysis
- feature selection
- protein classification
- feature set
- data clustering
- knowledge discovery
- protein families
- principal components analysis
- predictive modeling
- feature extraction
- data visualization
- protein sequences
- statistical models
- unsupervised learning
- formal concept analysis
- machine learning
- dimensionality reduction
- information visualization
- feature space
- data mining
- databases
- semi-supervised
- statistical significance
- support vector machine
- social networks