Recommended Reading for High-Dimensional Data


Donoho, D. L. (2000). High-dimensional data analysis: The curses and blessings of dimensionality. Aide-memoire of a lecture at the American Mathematical Society conference "Math Challenges of the 21st Century."

Fan, J., Han, F., & Liu, H. (2014). Challenges of Big Data analysis. National Science Review, 1, 293-314.

Fan, J., & Liu, H. (2013). Statistical analysis of big data on pharmacogenomics. Advanced Drug Delivery Reviews, 65, 987-1000.

Fan, J., & Lv, J. (2010). A selective overview of variable selection in high dimensional feature space. Statistica Sinica, 20, 101-148.

Fan, J., & Li, R. (2006). Statistical challenges with high dimensionality: Feature selection in knowledge discovery. In M. Sanz-Sole, J. Soria, J. L. Varona, & J. Verdera (Eds.), Proceedings of the International Congress of Mathematicians (Vol. III, pp. 595-622). Zurich: European Mathematical Society.

Liu, J., Zhong, W., & Li, R. (2015). A selective overview of feature screening for ultrahigh dimensional data. Science China Mathematics, 58, 2033-2054.

Shao, J. (1997). An asymptotic theory for linear model selection. Statistica Sinica, 7, 221-264.

Information criteria for variable selection (L0-regularization)

Foster, D. P., & George, E. I. (1994). The risk inflation criterion for multiple regression. Annals of Statistics, 22, 1947-1975.

Nishii, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression. Annals of Statistics, 12, 758-765.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716-723.

Mallows, C. L. (1973). Some comments on Cp. Technometrics, 15, 661-675.
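All of the criteria above trade goodness of fit against model size. As a quick orientation (a minimal sketch, not taken from any of the papers), here is Gaussian AIC and BIC for a least-squares fit; the log(n) coefficient in BIC's penalty is what drives the consistency results studied by Nishii (1984) and Shao (1997).

```python
import numpy as np

def aic_bic(y, X):
    """Gaussian AIC and BIC (up to additive constants) for an OLS fit of y on X.

    AIC = n*log(RSS/n) + 2k and BIC = n*log(RSS/n) + k*log(n), where k is the
    number of fitted coefficients (X is assumed to include the intercept column).
    BIC penalizes model size more heavily than AIC once log(n) > 2.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

# Toy data: intercept plus two real predictors plus two pure-noise columns.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y = X[:, :3] @ np.array([1.0, 2.0, -1.5]) + rng.normal(size=n)
aic3, bic3 = aic_bic(y, X[:, :3])  # true submodel
aic5, bic5 = aic_bic(y, X)         # full model
```

Since n = 200 > e^2, BIC's per-parameter penalty log(n) exceeds AIC's 2, so at a fixed fit BIC always scores larger; on typical draws it also prefers the smaller true submodel while AIC is more prone to keeping noise columns.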

Variable selection via penalized methods with a continuous penalty

LASSO and L1-regularization

Li, J., Das, K., Fu, G., Li, R., & Wu, R. (2011). The Bayesian LASSO for genome-wide association studies. Bioinformatics, 27, 516-523.

Wu, T. T., & Lange, K. (2008). Coordinate descent algorithms for LASSO penalized regression. Annals of Applied Statistics, 2, 224-244.

Zou, H. (2006). The adaptive LASSO and its oracle properties. Journal of the American Statistical Association, 101, 1418-1429.

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67, 301-320.

Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, Series B, 58, 267-288.
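The coordinate descent algorithms studied by Wu & Lange (2008) reduce the LASSO to repeated univariate soft-thresholding steps. A minimal sketch, assuming standardized columns and the (1/2n)-scaled squared-error loss; an illustration of the idea, not a production solver:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_cycles=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes no intercept and roughly standardized columns of X.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta  # equals y initially
    for _ in range(n_cycles):
        for j in range(p):
            # marginal fit of coordinate j on the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep residual in sync
            beta[j] = new_bj
    return beta
```

Each update solves the one-dimensional LASSO problem exactly, which is why the whole algorithm needs only inner products; this is the structure Friedman, Hastie, Hofling & Tibshirani (2007, listed below under pathwise coordinate optimization) exploit along a grid of lam values.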

Nonconvex penalized least squares and nonconcave penalized likelihood

Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.

Fu, W. J. (1998). Penalized regression: the bridge versus the LASSO. Journal of Computational and Graphical Statistics, 7, 397-416.

Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37, 373-384.

Frank, I. E., & Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35, 109-148.


Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression (with discussion). Annals of Statistics, 32, 407-499.

Friedman, J., Hastie, T., Hofling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Applied Statistics, 1, 302-332.

Fan, J., Xue, L., & Zou, H. (2014). Strong oracle optimality of folded concave penalized estimation. Annals of Statistics, 42, 819-849.

Hunter, D. R., & Li, R. (2005). Variable selection using MM algorithms. Annals of Statistics, 33, 1617-1642.

Liu, H., Yao, T., & Li, R. (2016). Global solutions to folded concave penalized nonconvex learning. Annals of Statistics, 44, 629-659.

Liu, H., Yao, T., Li, R., & Ye, Y. (2017). Folded concave penalized sparse linear regression: complexity, sparsity, statistical performance, and algorithm theory for local solutions. Mathematical Programming, Series A, 166, 207-240.

Loh, P.-L., & Wainwright, M. J. (2015). Regularized M-estimators with nonconvexity: statistical and algorithmic theory for local optima. Journal of Machine Learning Research, 16, 559-616.

Wang, Z., Liu, H., & Zhang, T. (2014). Optimal computational and statistical rates of convergence for sparse nonconvex learning problems. Annals of Statistics, 42, 2164-2201.

Wang, L., Kim, Y., & Li, R. (2013). Calibrating nonconvex penalized regression in ultrahigh dimension. Annals of Statistics, 41, 2505-2536.

Zhang, C., & Zhang, T. (2012). A general theory of concave regularization for high dimensional sparse estimation problems. Statistical Science, 27, 576-593.

Zou, H., & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion). Annals of Statistics, 36, 1509-1566. PMCID: PMC2759727
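The SCAD penalty of Fan & Li (2001) is the prototype folded concave penalty in this group: it matches the L1 penalty near zero but levels off, so large coefficients are left essentially unpenalized. A small sketch of the penalty and of its exact univariate thresholding rule (valid for orthonormal designs), with the paper's default a = 3.7:

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty of Fan & Li (2001) evaluated at |t|: linear up to lam,
    quadratic up to a*lam, then constant at (a+1)*lam^2/2."""
    t = np.abs(t)
    linear = lam * t
    quad = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))
    const = (a + 1) * lam**2 / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD solution (orthonormal design): soft-thresholding near
    zero, linear interpolation in the middle, identity far from zero."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))
```

The identity branch for |z| > a*lam is the source of the (near) unbiasedness in the oracle-property arguments, and the nonconvexity it creates is exactly what the algorithmic papers above (local linear approximation, MM, global solutions) are designed to handle.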

Tuning parameter selection

Zhang, Y., Li, R., & Tsai, C.-L. (2010). Regularization parameter selections via generalized information criterion. Journal of the American Statistical Association, 105, 312-323. PMCID: PMC2911045

Wang, H., Li, R., & Tsai, C.-L. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94, 553-568.
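Both papers select the regularization parameter by minimizing an information criterion along the solution path rather than by cross-validation. A toy sketch in that spirit, using BIC(lam) = log(RSS/n) + df * log(n)/n with df equal to the number of retained coefficients; for simplicity the "path" here hard-thresholds the OLS estimate at each lam, whereas the papers apply such criteria along a SCAD path:

```python
import numpy as np

def bic_select(X, y, lambdas):
    """Pick lam from a grid by a BIC-type criterion.

    The candidate estimate at each lam is the OLS fit with coefficients
    smaller than lam zeroed out (a crude stand-in for a penalized path).
    Returns the minimizing lam and the corresponding coefficient vector.
    """
    n = X.shape[0]
    ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    best_lam, best_bic, best_beta = None, np.inf, None
    for lam in lambdas:
        beta = np.where(np.abs(ols) > lam, ols, 0.0)  # hard-thresholded OLS
        rss = float(np.sum((y - X @ beta) ** 2))
        df = int(np.count_nonzero(beta))
        bic = np.log(rss / n) + df * np.log(n) / n
        if bic < best_bic:
            best_lam, best_bic, best_beta = lam, bic, beta
    return best_lam, best_beta
```

The contrast drawn in Wang, Li & Tsai (2007) is that generalized cross-validation tends to overfit (keeping too many variables), while a BIC-type selector of this form can identify the true model consistently.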

High-dimensional settings

Candes, E., & Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n (with discussion). Annals of Statistics, 35, 2313-2404.

Fan, J., & Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Annals of Statistics, 32, 928-961.

Zhang, X., Wu, Y., Wang, L., & Li, R. (2016). Variable selection for support vector machine in moderately high dimensions. Journal of the Royal Statistical Society, Series B, 78, 53-76.

Ultrahigh dimensional settings

Chu, W., Li, R., & Reimherr, M. (2016). Feature screening for time-varying coefficient models with ultrahigh dimensional longitudinal data. Annals of Applied Statistics, 10,

Cui, H., Li, R., & Zhong, W. (2015). Model-free feature screening for ultrahigh dimensional discriminant analysis. Journal of the American Statistical Association, 110, 630-641.

Fan, Y., & Li, R. (2012). Variable selection in linear mixed effects models. Annals of Statistics, 40, 2043-2068.

Fan, J., & Lv, J. (2011). Non-concave penalized likelihood with NP-dimensionality. IEEE Transactions on Information Theory, 57, 5467-5484.

Fan, J., & Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space (with discussion). Journal of the Royal Statistical Society, B, 70, 849-911.

Fan, J., Samworth, R., & Wu, Y. (2009). Ultrahigh dimensional variable selection: beyond the linear model. Journal of Machine Learning Research, 10, 1829-1853.

Hall, P., & Miller, H. (2009). Using generalized correlation to effect variable selection in very high dimensional problems. Journal of Computational and Graphical Statistics, 18, 533–550.

Huang, D., Li, R., & Wang, H. (2014). Feature screening for ultrahigh dimensional categorical data with applications. Journal of Business and Economic Statistics, 32, 237-244.

Li, R., Liu, J., & Lou, L. (2017). Variable selection via partial correlation. Statistica Sinica, 27, 983-996.

Li, R., Zhong, W., & Zhu, L. (2012). Feature screening via distance correlation learning. Journal of the American Statistical Association, 107, 1129-1139.

Li, J., Zhong, W., Li, R., & Wu, R. (2014). A fast algorithm for detecting gene-gene interactions in genome-wide association studies. Annals of Applied Statistics, 8, 2292-2318.

Liu, J., Li, R., & Wu, R. (2014). Feature selection for varying coefficient models with ultrahigh-dimensional covariates. Journal of the American Statistical Association, 109, 266-274.

Ma, S., Li, R., & Tsai, C.-L. (2017). Variable screening via partial quantile correlation. Journal of the American Statistical Association, 112, 650-663.

Pan, R., Wang, H., & Li, R. (2016). On the ultrahigh dimensional linear discriminant analysis problem with a diverging number of classes. Journal of the American Statistical Association, 111, 169-179.

Wang, L., Kim, Y., & Li, R. (2013). Calibrating nonconvex penalized regression in ultrahigh dimension. Annals of Statistics, 41, 2505-2536.

Wang, L., Peng, B., & Li, R. (2015). A high-dimensional nonparametric multivariate test for mean vector. Journal of the American Statistical Association, 110, 1658-1669.

Yang, G., Yu, Y., Li, R., & Buu, A. (in press). Feature screening in ultrahigh dimensional Cox’s model. Statistica Sinica.

Zhang, X., Wu, Y., Wang, L., & Li, R. (in press). A consistent information criterion for support vector machines in diverging model spaces. Journal of Machine Learning Research.

Zhu, L., Li, L., Li, R., & Zhu, L.-X. (2011). Model-free feature screening for ultrahigh dimensional data. Journal of the American Statistical Association, 106, 1464-1475.
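Many of the screening papers above descend from the sure independence screening (SIS) idea of Fan & Lv (2008): rank all p features by absolute marginal correlation with the response and keep the top d (they suggest d on the order of n/log n), reducing an ultrahigh dimensional problem to a moderate one before penalized fitting. A minimal sketch:

```python
import numpy as np

def sis(X, y, d=None):
    """Sure independence screening (Fan & Lv, 2008): keep the d features with
    the largest absolute marginal correlation with y; default d = n / log(n)."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    omega = np.abs(Xs.T @ ys) / n          # marginal sample correlations
    return np.sort(np.argsort(omega)[::-1][:d])  # indices of the top d

# Toy ultrahigh dimensional check: p >> n with three active features.
rng = np.random.default_rng(3)
n, p = 100, 2000
X = rng.normal(size=(n, p))
y = X[:, 0] + X[:, 1] - X[:, 2] + 0.5 * rng.normal(size=n)
kept = sis(X, y)  # keeps d = 21 of the 2000 features
```

The later entries in this section replace the marginal correlation with other marginal utilities (distance correlation, partial quantile correlation, model-free statistics) to cover discrete responses, varying coefficients, survival data, and other settings, but the rank-and-truncate structure is the same.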

Applied examples

Buu, A., Johnson, N. J., Li, R., & Tan, X. (2011). New variable selection methods for zero-inflated count data with applications to the substance abuse field. Statistics in Medicine, 30, 2326-2340.

Du, G., Lewis, M. M., Kanekar, S., Sterling, N. W., He, L., Kong, L., Li, R., & Huang, X. (2017). Combined diffusion tensor imaging and R2* differentiate Parkinson’s disease and atypical Parkinsonism. American Journal of Neuroradiology, 38, 966-972.

Li, J., Das, K., Fu, G., Li, R., & Wu, R. (2011). The Bayesian LASSO for genome-wide association studies. Bioinformatics, 27, 516-523.

Liu, H., Du, G., Zhang, L., Lewis, M., Wang, X., Yao, T., Li, R., & Huang, X. (2016). Folded concave penalized learning in identifying multimodal MRI marker for Parkinson’s disease. Journal of Neuroscience Methods, 268, 1-6.

Miao, J., Chen, Z., Sebastian, A., Wang, Z., Shrestha, S., Li, X., Praul, C., Albert, I., Li, R., & Cui, L. (2017). Sex-specific biology of the human malaria parasite revealed from transcriptomes and proteomes of male and female gametocytes. Molecular and Cellular Proteomics, 16, 537-551.

Wang, Y., Chen, H., Li, R., Duan, N., & Lewis-Fernandez, R. (2011). Prediction-based structured variable selection through receiver operating curve. Biometrics, 67, 896-905.

Yi, G. Y., Tan, X., & Li, R. (2015). Variable selection and inference procedures for marginal analysis of longitudinal data with missing observations or measurement error. Canadian Journal of Statistics, 43, 498-518.

Zhang, L., Wang, X., Wang, M., Sterling, N. W., Du, G., Lewis, M. M., Yao, T., Mailman, R. B., Li, R., & Huang, X. (2017). Circulating cholesterol levels may link to the factors influencing Parkinson’s risk. Frontiers in Neurology, 8, 501.