Acta Mathematica Scientia, Series B, 2021, Vol. 41, Issue (1): 207-230. doi: 10.1007/s10473-021-0112-6


WEIGHTED LASSO ESTIMATES FOR SPARSE LOGISTIC REGRESSION: NON-ASYMPTOTIC PROPERTIES WITH MEASUREMENT ERRORS

Huamei HUANG$^{1}$, Yujing GAO$^{2}$, Huiming ZHANG$^{3}$, Bo LI$^{4}$

    1. Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China;
    2. Guanghua School of Management, Peking University, Beijing 100871, China;
    3. School of Mathematical Sciences, Peking University, Beijing 100871, China;
    4. School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
  • Received: 2019-11-06; Revised: 2020-09-17; Online: 2021-02-25; Published: 2021-04-06
  • Contact: Bo LI, E-mail: haoyoulibo@163.com
  • About author: Huamei HUANG, E-mail: huanghm@mail.ustc.edu.cn; Yujing GAO, E-mail: jane.g1996@pku.edu.cn; Huiming ZHANG, E-mail: zhanghuiming@pku.edu.cn
  • Supported by:
    Huamei Huang, Yujing Gao and Huiming Zhang are co-first authors who contributed equally to this work. This work was supported by the National Natural Science Foundation of China (61877023) and the Fundamental Research Funds for the Central Universities (CCNU19TD009).

Abstract: For high-dimensional models with a focus on classification performance, $\ell_{1}$-penalized logistic regression is becoming important and popular. However, the Lasso estimates can be problematic when the penalties on different coefficients are all the same and unrelated to the data. We propose two types of weighted Lasso estimates whose weights depend on the covariates and are determined by the McDiarmid inequality. Given the sample size $n$ and the covariate dimension $p$, the finite-sample behavior of the proposed method with a diverging number of predictors is illustrated by non-asymptotic oracle inequalities, such as those for the $\ell_{1}$-estimation error and the squared prediction error of the unknown parameters. We compare the performance of our method with that of earlier weighted estimates on simulated data, and then apply it to a real data analysis.
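To make the idea of a coefficient-specific penalty concrete, the following is a minimal sketch of weighted $\ell_{1}$-penalized logistic regression fitted by proximal gradient descent. It is an illustration under stated assumptions, not the authors' procedure: the solver, the function names, the simulated data, and in particular the choice of weights (the root mean square of each covariate column) are all hypothetical stand-ins. The McDiarmid-based weights proposed in the paper are not reproduced here.

```python
import numpy as np


def sigmoid(z):
    """Numerically stable logistic function."""
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal map of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def weighted_lasso_logistic(X, y, weights, lam, n_iter=500):
    """Proximal gradient descent for the objective
        (1/n) * sum_i [log(1 + exp(x_i' b)) - y_i * x_i' b]
        + lam * sum_j weights[j] * |b_j|.
    The solver is an illustrative choice, not taken from the paper."""
    n, p = X.shape
    # The gradient of the averaged logistic loss is Lipschitz with
    # constant at most ||X||_2^2 / (4 n); use its inverse as step size.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n
        # Each coefficient gets its own threshold via weights[j].
        beta = soft_threshold(beta - step * grad, step * lam * weights)
    return beta


# Simulated sparse problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -2.0, 1.5]
y = rng.binomial(1, sigmoid(X @ beta_true)).astype(float)

# ASSUMPTION: placeholder data-dependent weights (column root mean square),
# not the McDiarmid-based weights from the paper.
weights = np.sqrt(np.mean(X ** 2, axis=0))
lam = 0.5 * np.sqrt(np.log(p) / n)

beta_hat = weighted_lasso_logistic(X, y, weights, lam)
print("nonzero coefficients:", np.flatnonzero(beta_hat))
```

Setting all entries of `weights` to 1 recovers the ordinary Lasso penalty, which is the equal-penalty case the abstract describes as potentially problematic.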

Key words: logistic regression, weighted Lasso, oracle inequalities, high-dimensional statistics, measurement error

CLC Number: 62J12