数学物理学报 ›› 2017, Vol. 37 ›› Issue (5): 931-949.

• 论文 • 上一篇    下一篇

相依性度量的比较研究

蒋杭进1,2, 尚妍3,4, 吴琼莉2   

  1. 1. 中国科学院大学 北京 100190;
    2. 中国科学院武汉物理与数学研究所 武汉 430071;
    3. 中国科学院大学经济与管理学院 北京 100190;
    4. 中国工商银行博士后科研工作站 北京 100032
  • 收稿日期:2016-12-26 修回日期:2017-05-17 出版日期:2017-10-26 发布日期:2017-10-26
  • 通讯作者: 吴琼莉,E-mail:wuqiongli@wipm.ac.cn E-mail:wuqiongli@wipm.ac.cn
  • 基金资助:
    国家自然科学基金(31600290)

Dependence Measure:A Comparative Study

Jiang Hangjin1,2, Shan Yan3,4, Wu Qiongli2   

  1. 1. University of Chinese Academy of Sciences, Beijing 100190;
    2. Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071;
    3. School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190;
    4. Post-Doctoral Research Center, Industrial and Commercial Bank of China, Beijing 100032
  • Received:2016-12-26 Revised:2017-05-17 Online:2017-10-26 Published:2017-10-26
  • Supported by:
    Supported by the NSFC (31600290)

摘要: 相依性度量在统计分析中起着关键作用,比如fMRI数据分析、变量选择、网络分析、基因数据分析、蛋白质-蛋白质相互作用网络分析.该文综合分析了现有的相依性度量的性质,比如MIC[1]、MI[2]、HHG[3],并将它们分为三大类:第一类是相关系数的直接推广,第二类相依性度量则基于不同的独立性假设,第三类是基于学习理论的相依性度量.该文还指出并没有一种相依性度量方法是在所有情况下都是最优的,从相依性度量的性质和我们的模拟结果看CDC[4]是最好的相依性度量.另外我们还提出用CDC-|ρ|或者CDC2-|ρ|2做非线性度量要比MIC-ρ2合理,其中ρ是相关系数.

关键词: 相依性, CDC, 相依性度量, 非线性度量

Abstract: Dependence measure plays a fundamental role in statistical analysis, such as fMRI data analysis, variable selection, network analysis, genetical analysis, PPI network analysis. The aim of this paper is to provide a state of the art on the topic of dependence measure and to show their properties. We classified all these dependence measures into 3 classes:(1) Extension of correlation coefficient; (2) Dependence measures based on independent conditions; (3) Dependence measures in learning framework; As showed in our analysis, there is no such a dependence measure has the best performance over all kinds of functional types. However, according to the properties that a dependence measure should have combing with our results, CDC[4] is the best one. Furthermore, CDC2-ρ2 or CDC-|ρ|is proposed as a measure of non-linearity, which is better than MIC-ρ2, as showed in our analysis, where ρ is the Pearson correlation coefficient.

Key words: Dependence, CDC, Dependence measure, Non-linearity measurement

中图分类号: 

  • O212