Acta Mathematica Scientia (English Edition), 2018, Vol. 38, Issue (1): 57-72. doi: 10.1016/S0252-9602(17)30117-0


ROBUST DEPENDENCE MEASURE FOR DETECTING ASSOCIATIONS IN LARGE DATA SET

Hangjin JIANG1,2, Qiongli WU2

  1. University of Chinese Academy of Sciences, Beijing 100049, China;
  2. Key Laboratory of Magnetic Resonance in Biological Systems, Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071, China
  • Received: 2017-01-26 Online: 2018-02-25 Published: 2018-02-25
  • Contact: Qiongli WU, E-mail: wuqiongli@wipm.ac.cn
  • About the author: Hangjin JIANG, E-mail: jianghangjin10@mails.ucas.ac.cn
  • Funding:

    Supported by the National Natural Science Foundation of China (31600290).


Abstract:

In this paper, we propose a new statistical dependence measure for two random vectors based on copulas, called the copula dependency coefficient (CDC). The CDC is proved to be robust to outliers and is easy to implement. In particular, it is powerful and applicable to high-dimensional problems. These properties make the CDC practically important in related applications. Both experimental and application results show that the CDC is a good robust dependence measure for detecting associations.

Key words: CDC, dependence measure, EDC, association, large dataset, robust
