Acta Mathematica Scientia (English Series) ›› 2021, Vol. 41 ›› Issue (1): 207-230. doi: 10.1007/s10473-021-0112-6

• Articles

WEIGHTED LASSO ESTIMATES FOR SPARSE LOGISTIC REGRESSION: NON-ASYMPTOTIC PROPERTIES WITH MEASUREMENT ERRORS

Huamei HUANG1, Yujing GAO2, Huiming ZHANG3, Bo LI4

    1. Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China;
    2. Guanghua School of Management, Peking University, Beijing 100871, China;
    3. School of Mathematical Sciences, Peking University, Beijing 100871, China;
    4. School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
  • Received: 2019-11-06  Revised: 2020-09-17  Online: 2021-02-25  Published: 2021-04-06
  • Contact: Bo LI, E-mail: haoyoulibo@163.com
  • About authors: Huamei HUANG, E-mail: huanghm@mail.ustc.edu.cn; Yujing GAO, E-mail: jane.g1996@pku.edu.cn; Huiming ZHANG, E-mail: zhanghuiming@pku.edu.cn
  • Supported by:
    Huamei Huang, Yujing Gao and Huiming Zhang are co-first authors who contributed equally to this work. Supported by the National Natural Science Foundation of China (61877023) and the Fundamental Research Funds for the Central Universities (CCNU19TD009).

Abstract: For high-dimensional models with a focus on classification performance, $\ell_{1}$-penalized logistic regression is becoming important and popular. However, the Lasso estimates can be problematic when the penalties on different coefficients are all the same and unrelated to the data. We propose two types of weighted Lasso estimates whose weights depend on the covariates and are determined by the McDiarmid inequality. Given the sample size $n$ and the covariate dimension $p$, the finite-sample behavior of the proposed method with a diverging number of predictors is illustrated by non-asymptotic oracle inequalities, such as bounds on the $\ell_{1}$-estimation error and the squared prediction error of the unknown parameters. We compare the performance of our method with that of earlier weighted estimates on simulated data, and then apply it to a real data analysis.

Key words: logistic regression, weighted Lasso, oracle inequalities, high-dimensional statistics, measurement error
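
To make the idea of coefficient-specific penalties in the abstract concrete, the following is a minimal sketch of a weighted $\ell_{1}$-penalized logistic regression fitted by proximal gradient descent. It is not the authors' procedure: the function names, the simulated data, the tuning parameter `lam`, and in particular the weights (here the root mean square of each covariate column) are illustrative assumptions, and the McDiarmid-based weights of the paper are not reproduced.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal map of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso_logistic(X, y, lam, weights, n_iter=2000):
    """Minimize mean logistic loss + lam * sum_j weights[j] * |beta[j]|
    by proximal gradient descent (ISTA) with a fixed step size."""
    n, p = X.shape
    # The gradient of the mean logistic loss is Lipschitz with constant
    # ||X||_2^2 / (4n); the step size is the reciprocal of that constant.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (prob - y) / n
        # Each coefficient gets its own threshold step * lam * weights[j].
        beta = soft_threshold(beta - step * grad, step * lam * weights)
    return beta

# Simulated sparse example (all values are hypothetical).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -2.0, 1.5]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

# Illustrative data-dependent weights: root mean square of each column.
# This is an assumption for the sketch, not the weights proposed in the paper.
weights = np.sqrt(np.mean(X ** 2, axis=0))

beta_hat = weighted_lasso_logistic(X, y, lam=0.1, weights=weights)
print("nonzero coefficients:", np.flatnonzero(beta_hat))
```

With all weights equal to one, this reduces to the ordinary Lasso for logistic regression; data-dependent weights simply rescale the threshold applied to each coefficient.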

CLC Number:

  • 62J12