Distribution based nearest neighbor imputation for truncated high dimensional data with applications to pre-clinical and clinical metabolomics studies.
Jasmit S. Shah, Shesh N. Rai, Andrew P. DeFilippis, Bradford G. Hill, Aruni Bhatnagar, Guy N. Brock
Published in: BMC Bioinform. (2017)
Keyphrases
- high dimensional data
- nearest neighbor
- missing values
- high dimensional
- data distribution
- k nearest neighbor
- data sets
- knn
- low dimensional
- high dimensionality
- training set
- high dimensions
- dimensionality reduction
- clinical data
- similarity search
- data analysis
- dimension reduction
- data points
- subspace clustering
- high dimensional spaces
- manifold learning
- clustering high dimensional data
- index structure
- input space
- high dimensional data sets
- nonlinear dimensionality reduction
- lower dimensional
- nearest neighbor search
- dimensional data
- high dimensional datasets
- data mining
- database
- euclidean distance
- missing data
- distance function
- data structure
- machine learning
- survival data