Well-characterized human populations give scientists excellent opportunities to study the associations between biomarkers and disease. When biomarkers are used in the analysis, however, regression on high-dimensional data is difficult, because collinearity among the predictors makes it hard to draw unbiased conclusions. This monograph introduces novel approaches that reduce the difficulties collinearity causes in model fitting and variable identification. It has two parts: 1. For a binary outcome, we partition the whole space X so that each subspace consists entirely of “independent factors”, and then find the gradient direction in each subspace, that is, the direction with the greatest effect on the outcome. The factors that contribute least to the gradients are removed. 2. For a continuous outcome, we find the gradient in each subspace by either the Conditional Minimum Variance (CMV), derived from principal component analysis with the outcome added to the predictors, or the Maximum Pearson Correlation (MPC). Numerical results demonstrate that the proposed approaches can improve dimension reduction, identifying the true predictors with higher sensitivity and accuracy when multicollinearity exists.
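As a rough illustration of the Maximum Pearson Correlation idea for a continuous outcome, the sketch below finds the direction within a single block of predictors whose projection correlates most strongly with the outcome, then flags the factor with the smallest loading on that direction. It is a minimal sketch and not the monograph's algorithm: it relies on the standard fact that the least-squares coefficient vector maximizes the in-sample Pearson correlation, it treats the three simulated columns as one subspace of roughly independent factors, and the names `max_pearson_direction` and `weakest_factor` are hypothetical.

```python
import numpy as np


def max_pearson_direction(X, y):
    """Unit vector w that maximizes the Pearson correlation of X @ w with y.

    Assumption: the maximizer is taken to be the least-squares direction on
    centered data, a standard result rather than the book's derivation.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # lstsq returns the minimum-norm solution, so it tolerates rank deficiency.
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return w / np.linalg.norm(w)


def weakest_factor(X, y):
    """Index of the predictor with the smallest absolute loading."""
    # Standardize columns so that loadings are comparable across predictors.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    w = max_pearson_direction(Xs, y)
    return int(np.argmin(np.abs(w)))


# Simulated data (illustrative only): the third predictor has no true effect.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

w = max_pearson_direction(X, y)
r = np.corrcoef(X @ w, y)[0, 1]   # correlation along the fitted direction
weakest = weakest_factor(X, y)    # candidate factor to remove
```

In this simulation the projection `X @ w` correlates with `y` at least as strongly as any single column does, and the factor flagged for removal is the one with no true effect. How the monograph builds the subspaces and handles collinear blocks is not shown here.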
A senior PhD statistician with 40 years of experience in applied statistical analysis, Dr. Li has worked at a private consulting firm and a government agency, and has taught mathematical and statistical courses as an adjunct professor at Lanzhou University, George Washington University, and other institutions.
Number of Pages:
LAP LAMBERT Academic Publishing
High Dimensional Regression, Gradient, Decomposition, Regression, Case Control, Orthogonal Base, Biomarker, Sum Statistic, Lasso, Minimum Conditional Variance, Maximum Pearson Correlation, Maxim Extended Principal Component Analysis
MATHEMATICS / Functional Analysis