Using a low correlation high orthogonality feature set and machine learning methods to identify plant pentatricopeptide repeat coding gene/protein.
Changli Feng, Quan Zou, Donghua Wang
Published in: Neurocomputing (2021)
Keyphrases
- machine learning methods
- feature set
- gene prediction
- high correlation
- feature selection
- machine learning
- random forest
- machine learning algorithms
- ensemble methods
- feature extraction
- classification accuracy
- feature vectors
- highly correlated
- extracted features
- protein-protein interaction networks
- feature space
- sequence alignment
- human genome
- statistical methods
- homo sapiens
- machine learning approaches
- microarray
- amino acids
- selected features
- texture features
- regulatory networks
- learned knowledge
- protein structure
- DNA sequences
- gene expression data
- structural features
- cellular processes
- image processing
- page layout analysis
- gene selection
- feature subset
- protein interaction
- SVM classifier
- wavelet transform
- model selection
- syntactic features
- training data
- data sets
- gene expression