Acta Mathematica Scientia (English Edition) ›› 2014, Vol. 34 ›› Issue (2): 579-592. doi: 10.1016/S0252-9602(14)60031-X

• Articles •

MODEL SELECTION METHOD BASED ON MAXIMAL INFORMATION COEFFICIENT OF RESIDUALS

TAN Qiu-Heng (谭秋衡), JIANG Hang-Jin* (蒋杭进), DING Yi-Ming (丁义明)

  1. Wuhan Institute of Physics and Mathematics, CAS, Wuhan 430071, China; University of CAS, Beijing 100049, China; Key Laboratory of Magnetic Resonance in Biological Systems, Wuhan Institute of Physics and Mathematics, CAS, Wuhan 430071, China; National Center for Mathematics and Interdisciplinary Sciences, CAS, Beijing 100049, China
  • Received: 2013-02-05 Revised: 2013-12-20 Online: 2014-03-20 Published: 2014-03-20
  • Contact: JIANG Hang-Jin. E-mail: jianghangjin10@mails.ucas.ac.cn; ding@wipm.ac.cn
  • Supported by:

    This work was partly supported by the National Basic Research Program of China (973 Program, 2011CB707802, 2013CB910200) and the National Science Foundation of China (11201466).

Abstract:

Traditional model selection criteria try to balance fitting error against model complexity. Before they can be used, assumptions must be made about the distribution of the response or the noise, and these assumptions may be misspecified. In this article, we give a new model selection criterion that requires fewer assumptions: based on the assumption that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between the regression residuals and the response. The maximal information coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing the effect of a third, controlling random variable. In addition, a definition of general partial relationship is given.

Key words: Model selection, residual, maximal information coefficient, partial maximal information coefficient
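
The criterion described in the abstract can be illustrated with a rough Python sketch: fit each candidate model, score the dependence between its residuals and the response, and keep the candidate with the lowest score. Everything specific here is an assumption made for illustration and is not taken from the article: the function names, the polynomial candidate family, the synthetic data, and above all `grid_dependence`, which is a crude stand-in for MIC (it uses fixed equal-frequency grids, whereas MIC searches over grid partitions).

```python
import math
import random


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins of nearly equal size, by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // n
    return bins


def grid_dependence(x, y):
    """Simplified MIC-style score (NOT the real MIC): the largest normalized
    mutual information over equal-frequency grids with nx * ny <= n ** 0.6."""
    n = len(x)
    budget = max(4, int(n ** 0.6))
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        bx = equal_frequency_bins(x, nx)
        cx = [bx.count(k) for k in range(nx)]
        for ny in range(2, budget // nx + 1):
            by = equal_frequency_bins(y, ny)
            cy = [by.count(k) for k in range(ny)]
            joint = {}
            for cell in zip(bx, by):
                joint[cell] = joint.get(cell, 0) + 1
            # Mutual information of the binned variables, in nats.
            mi = sum(c / n * math.log(c * n / (cx[i] * cy[j]))
                     for (i, j), c in joint.items())
            best = max(best, mi / math.log(min(nx, ny)))
    return best


def fit_polynomial(x, y, degree):
    """Ordinary least squares for a polynomial, via the normal equations."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis 1, x, x^2, ...
    aug = [[sum(xi ** (r + c) for xi in x) for c in range(m)]
           + [sum(yi * xi ** r for xi, yi in zip(x, y))] for r in range(m)]
    for col in range(m):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][m] / aug[r][r] for r in range(m)]


def residuals(x, y, coeffs):
    """Response minus the fitted polynomial value at each point."""
    return [yi - sum(c * xi ** k for k, c in enumerate(coeffs))
            for xi, yi in zip(x, y)]


def select_degree(x, y, degrees):
    """Pick the degree whose residuals are least associated with the response."""
    scores = {d: grid_dependence(residuals(x, y, fit_polynomial(x, y, d)), y)
              for d in degrees}
    return min(scores, key=scores.get), scores


# Hypothetical synthetic data: a quadratic signal plus Gaussian noise.
random.seed(0)
x = [-1 + 2 * i / 199 for i in range(200)]
y = [xi ** 2 + random.gauss(0, 0.1) for xi in x]
best_degree, scores = select_degree(x, y, [1, 2, 3])
print(best_degree, scores)
```

On data like this, the underfitted linear model leaves the quadratic signal in its residuals, so its score should be much higher than that of the quadratic fit. A real application would replace `grid_dependence` with an actual MIC implementation.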

CLC Number:

  • 62B10