## Research on Resolution Based on EM Algorithm

Lu Nana, Yu Jinghu

Fund supported: the Fundamental Research Funds for the Central Universities (2017IVA073, 2018IB016)

Abstract

Parameter resolution is a criterion for measuring whether two adjacent signals can be distinguished under given noise conditions; it provides a "ruler" for evaluating sensitive parameters, effective precision, and accuracy. This paper proposes a definition of parameter resolution for the EM algorithm, built on the EM algorithm and the idea of the Fisher linear discriminant criterion, and verifies it on a two-component Gaussian mixture model. Experiments show that when two normal distributions with variance 0.1 have a mean distance greater than 0.206, the EM algorithm can distinguish the two distributions at a confidence level of 90%. By connecting the experimental results with the theoretical derivation, scale-factor graphs for different confidence levels are obtained. The proposed parameter resolution provides a quantitative indicator for accuracy measurement and also offers a new approach to distinguishing similar signals.

Keywords： EM algorithm ; Resolution ; Criterion ; Coefficient of variation

Lu Nana, Yu Jinghu. Research on Resolution Based on EM Algorithm. Acta Mathematica Scientia, 2019, 39(3): 638-648

## 1 Introduction

The EM (Expectation-Maximization) algorithm was proposed by Dempster et al. [4] in 1977 for maximum likelihood estimation of parameters in models containing latent variables. Its basic idea is as follows: given initial values for the incomplete data, estimate the model parameters; then estimate the missing data from the parameter values, update the parameters from the estimated missing data, and iterate until convergence. Since Dempster et al. proposed the EM algorithm, many research results have been obtained at home and abroad. For example, Wu and Xu [13, 15] summarized several convergence properties of the EM algorithm, and Wei and Liu [7, 14] proposed improved E-steps and M-steps. Later, many scholars introduced acceleration methods [1, 5-6, 16-17] to study the iterative performance of the EM algorithm in high dimensions. Because the normal distribution is flexible and efficient to fit, many random phenomena can be approximated by it given sufficiently large samples [10]. The EM algorithm is sensitive to initial values and is not guaranteed to find the global optimum, often falling into a local optimum instead, so many initialization methods have been proposed. Zhai [18] chose the farthest points as initial values, which can degrade convergence at boundary points; Li and Chen [9] used the k-nearest-neighbor method to remove outliers and then initialized with k-means, which significantly outperforms the original scheme. In recent years, some scholars have adopted density-based methods to choose initial values, which avoids the influence of noise points on parameter estimation and improves the robustness of the iteration results.

### 2.1 Principles of the EM Algorithm

Concretely, the EM algorithm proceeds as follows: denote the result of the $i$-th iteration by $\theta^{(i)}$; the $(i+1)$-th iteration starts from the parameter value $\theta^{(i)}$, and its E-step and M-step are:

E-step: compute the conditional expectation $E_{\theta^{(i)}}[\log P(y, Z|\theta)|\theta^{(i)}]$ with respect to the probability function $P(z|y, \theta^{(i)})$, and denote it by $Q(\theta|\theta^{(i)})$;

M-step: maximize $Q(\theta|\theta^{(i)})$, finding $\theta^{(i+1)}$ by maximum likelihood estimation.
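The two steps above can be sketched for the paper's setting (a two-component normal mixture with known, equal standard deviation $\sigma$ and fixed weight $p$, only the means being estimated). This is a minimal illustration, not the authors' implementation; the function name and defaults are ours.

```python
import numpy as np

def em_two_gaussians(x, mu0, nu0, sigma=0.1, p=0.5, tol=1e-6, max_iter=500):
    """EM for a two-component Gaussian mixture N(mu, sigma^2), N(nu, sigma^2)
    with known sigma and fixed mixing weight p; only (mu, nu) are estimated."""
    mu, nu = mu0, nu0
    for _ in range(max_iter):
        # E-step: posterior responsibility of component 1 for each sample,
        # i.e. the conditional expectation of the latent assignment
        d1 = p * np.exp(-0.5 * ((x - mu) / sigma) ** 2)
        d2 = (1 - p) * np.exp(-0.5 * ((x - nu) / sigma) ** 2)
        w = d1 / (d1 + d2)
        # M-step: responsibility-weighted means maximize Q(theta | theta_i)
        mu_new = np.sum(w * x) / np.sum(w)
        nu_new = np.sum((1 - w) * x) / np.sum(1 - w)
        converged = abs(mu_new - mu) < tol and abs(nu_new - nu) < tol
        mu, nu = mu_new, nu_new
        if converged:
            break
    return mu, nu
```

With well-separated components the iteration converges in a handful of steps; with close means it illustrates the sensitivity to initial values discussed in the introduction.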

### 2.3 Parameter Resolution of the EM Algorithm

(1) Draw $n$ rounds of samples from the mixture model; for each round, re-estimate the parameters $(\mu, \nu)$ with the EM algorithm, and denote the $k$-th round's estimate by $(\mu^{(k)}, \nu^{(k)})$, $k=1, 2, \cdots, n$;

(2) Using the discriminant criterion, judge whether each round's estimate separates the parameters $\mu$ and $\nu$; if so, that round of the EM algorithm counts as successful;

(3) Let $d_n \triangleq$ (number of successful rounds among the $n$ rounds)$/n$. The standard for successful separation is determined by the Fisher discriminant criterion and the threshold (confidence level) $\alpha$. Fisher discriminant criterion [3]:

$$\left\{\begin{array}{l} |\mu_{i}^{*}-\mu|\leq|\nu_{i}^{*}-\mu|, \\ |\nu_{i}^{*}-\nu|\leq|\mu_{i}^{*}-\nu|, \end{array}\right.$$

where $\mu_{i}^{*}$ denotes the EM estimate of $\mu_{i}$, and similarly for the other parameters; the computation stops once the resolution reaches the given threshold.

**Definition 2.1** Given a threshold $\alpha$ ($0<\alpha<1$), if $d_n\geq\alpha$, we say the EM algorithm can separate the parameter pair $(\mu, \nu)$, and we call the relative error (RE) $\frac{|\mu-\nu|}{\mu}$ or the absolute error (AE) $|\mu-\nu|$ the parameter resolution of the EM algorithm for the model (two-component normal mixture) at threshold $\alpha$.

## 3 Experiments and Analysis

The EM algorithm is stochastic in parameter estimation, and results can differ from run to run. To reduce this randomness, we first run 100 rounds under each model, obtaining 100 sets of EM estimates. Then, when computing the resolution, the results of the 50 runs are sorted in ascending order, the first three and last three values are removed, and the rest are averaged. This not only suppresses chance effects but also improves the robustness of the EM algorithm. In addition, since the EM algorithm is sensitive to initial values, all initial values are set within about 10% of the true values; the precision is $10^{-6}$ and $p=0.5$. The absolute error is first varied in steps of 0.01 to locate a rough interval, then refined in steps of 0.001. The results are analyzed with respect to the threshold, the relative error, and the coefficient of variation.

### 3.1 Effect of Relative Error and Coefficient of Variation

Fix $\sigma=0.1$ and choose different models. Experiments are run in two ways: fixing $\mu$ and adjusting $\nu$, and fixing $\nu$ and adjusting $\mu$. With the threshold set to 85%, 90%, and 95%, we compute the corresponding relative errors and coefficients of variation. The coefficient of variation, a statistic measuring the dispersion of the observations, is the ratio of the standard deviation to the mean, abbreviated CV, with formula $CV=\frac{\sigma}{\mu}$. In Table 1, for the model $\mu=1$, $\nu=1.189$, $d_n$ is $84.6818\%$, below the threshold $85\%$, so $\mu=1$ is kept fixed and $\nu$ is increased until $d_n$ reaches the threshold. In the tables, RE1 and RE2 denote the relative errors for fixing $\mu$ (adjusting $\nu$) and fixing $\nu$ (adjusting $\mu$), respectively; $\delta$ denotes the average relative error, e.g. in Table 1 $\delta=\frac{RE1+RE2}{2}=\frac{0.1900+0.1860}{2}$; CV1 and CV2 denote the coefficients of variation for fixing $\mu$ (adjusting $\nu$) and fixing $\nu$ (adjusting $\mu$), respectively.
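The resolution procedure above (repeated sampling, EM re-estimation, Fisher criterion, success rate $d_n$) can be simulated directly. The sketch below is ours and makes simplifying assumptions: a small mean-only EM with known $\sigma$, and initial values perturbed about 10% from the truth as described in Section 3.

```python
import numpy as np

def em_means(x, mu0, nu0, sigma=0.1, p=0.5, tol=1e-6, max_iter=500):
    """Mean-only EM for a two-component normal mixture with known sigma."""
    mu, nu = mu0, nu0
    for _ in range(max_iter):
        d1 = p * np.exp(-0.5 * ((x - mu) / sigma) ** 2)
        d2 = (1 - p) * np.exp(-0.5 * ((x - nu) / sigma) ** 2)
        w = d1 / (d1 + d2)
        mu_new = np.sum(w * x) / np.sum(w)
        nu_new = np.sum((1 - w) * x) / np.sum(1 - w)
        converged = abs(mu_new - mu) < tol and abs(nu_new - nu) < tol
        mu, nu = mu_new, nu_new
        if converged:
            break
    return mu, nu

def resolution_rate(mu, nu, sigma=0.1, p=0.5, n_rounds=100,
                    n_samples=1000, seed=0):
    """Estimate d_n: the fraction of rounds in which the EM estimates
    separate (mu, nu) according to the Fisher criterion of Section 2.3."""
    rng = np.random.default_rng(seed)
    success = 0
    for _ in range(n_rounds):
        # (1) one round of sampling from the mixture, then EM re-estimation
        k = rng.binomial(n_samples, p)
        x = np.concatenate([rng.normal(mu, sigma, k),
                            rng.normal(nu, sigma, n_samples - k)])
        m, v = em_means(x, 0.9 * mu, 1.1 * nu, sigma, p)
        # (2) Fisher criterion: each estimate is closer to its own true mean
        if abs(m - mu) <= abs(v - mu) and abs(v - nu) <= abs(m - nu):
            success += 1
    # (3) d_n = successes / n
    return success / n_rounds
```

Comparing $d_n$ against a threshold $\alpha$ then decides, per Definition 2.1, whether the pair $(\mu, \nu)$ is separable.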

**Table 1** (threshold $\alpha=85\%$)

| Fix $\mu$, adjust $\nu$ | RE1 | AE1 | Fix $\nu$, adjust $\mu$ | RE2 | $\delta$ | CV1 | CV2 |
|---|---|---|---|---|---|---|---|
| $\mu=1.0, \nu=1.190$ | 0.1900 | 0.190 | $\mu=0.814, \nu=1.0$ | 0.1860 | 0.1880 | 0.1840 | 0.2229 |
| $\mu=1.2, \nu=1.391$ | 0.1592 | 0.191 | $\mu=1.010, \nu=1.2$ | 0.1583 | 0.1588 | 0.1552 | 0.1823 |
| $\mu=1.6, \nu=1.792$ | 0.1200 | 0.192 | $\mu=1.407, \nu=1.6$ | 0.1206 | 0.1203 | 0.1183 | 0.1336 |
| $\mu=1.8, \nu=1.991$ | 0.1061 | 0.191 | $\mu=1.606, \nu=1.8$ | 0.1078 | 0.1070 | 0.1058 | 0.1178 |
| $\mu=2.0, \nu=2.192$ | 0.0960 | 0.192 | $\mu=1.806, \nu=2.0$ | 0.0970 | 0.0965 | 0.0956 | 0.1054 |
| $\mu=3.0, \nu=3.192$ | 0.0640 | 0.192 | $\mu=2.809, \nu=3.0$ | 0.0637 | 0.0639 | 0.0647 | 0.0689 |

**Table 2** (threshold $\alpha=90\%$)

| Fix $\mu$, adjust $\nu$ | RE1 | AE1 | Fix $\nu$, adjust $\mu$ | RE2 | $\delta$ | CV1 | CV2 |
|---|---|---|---|---|---|---|---|
| $\mu=1.0, \nu=1.200$ | 0.2000 | 0.200 | $\mu=0.798, \nu=1.0$ | 0.2020 | 0.2010 | 0.1833 | 0.2253 |
| $\mu=1.2, \nu=1.400$ | 0.1667 | 0.200 | $\mu=1.000, \nu=1.2$ | 0.1667 | 0.1667 | 0.1548 | 0.1833 |
| $\mu=1.6, \nu=1.804$ | 0.1275 | 0.204 | $\mu=1.395, \nu=1.6$ | 0.1281 | 0.1278 | 0.1179 | 0.1342 |
| $\mu=1.8, \nu=2.005$ | 0.1139 | 0.205 | $\mu=1.597, \nu=1.8$ | 0.1128 | 0.1134 | 0.1054 | 0.1182 |
| $\mu=2.0, \nu=2.204$ | 0.1020 | 0.204 | $\mu=1.795, \nu=2.0$ | 0.1025 | 0.1023 | 0.0954 | 0.1057 |
| $\mu=3.0, \nu=3.206$ | 0.0687 | 0.206 | $\mu=2.796, \nu=3.0$ | 0.0680 | 0.0684 | 0.0645 | 0.0691 |

**Table 3** (threshold $\alpha=95\%$)

| Fix $\mu$, adjust $\nu$ | RE1 | AE1 | Fix $\nu$, adjust $\mu$ | RE2 | $\delta$ | CV1 | CV2 |
|---|---|---|---|---|---|---|---|
| $\mu=1.0, \nu=1.220$ | 0.2200 | 0.220 | $\mu=0.780, \nu=1.0$ | 0.2200 | 0.2200 | 0.1820 | 0.2282 |
| $\mu=1.2, \nu=1.420$ | 0.1833 | 0.220 | $\mu=0.980, \nu=1.2$ | 0.1833 | 0.1833 | 0.1538 | 0.1854 |
| $\mu=1.6, \nu=1.827$ | 0.1419 | 0.227 | $\mu=1.376, \nu=1.6$ | 0.1400 | 0.1410 | 0.1172 | 0.1352 |
| $\mu=1.8, \nu=2.025$ | 0.1250 | 0.225 | $\mu=1.576, \nu=1.8$ | 0.1244 | 0.1247 | 0.1049 | 0.1190 |
| $\mu=2.0, \nu=2.226$ | 0.1130 | 0.226 | $\mu=1.778, \nu=2.0$ | 0.1110 | 0.1120 | 0.0949 | 0.1062 |
| $\mu=3.0, \nu=3.224$ | 0.0747 | 0.224 | $\mu=2.775, \nu=3.0$ | 0.0750 | 0.0749 | 0.0644 | 0.0694 |

**Figure 5** The probability $P(S_2)$ corresponding to $A(S_2)$ (figure not reproduced here)

The experimental results show that as the sample size $n$ increases, the resolution value $\nu-\mu$ of the EM algorithm decreases; by Theorem 3.1, the separation condition holds when $\frac{\nu-\mu}{\sigma}\geq1.09$.
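As an illustration, the criterion quoted from Theorem 3.1 can be packaged as a small helper. The function name is ours, and the scale factor 1.09 is the value cited above for that confidence level; other confidence levels would use other scale factors from the paper's scale-factor graphs.

```python
def resolvable(mu, nu, sigma, scale=1.09):
    """Check the separability condition (nu - mu) / sigma >= scale,
    with scale = 1.09 as quoted from Theorem 3.1."""
    return (nu - mu) / sigma >= scale
```

For example, with $\sigma=0.1$, means 1.0 and 1.2 satisfy the condition, while means 1.0 and 1.05 do not.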

## References

[1] Berlinet A F, Roland C. Acceleration of the EM algorithm: P-EM versus epsilon algorithm. Computational Statistics & Data Analysis, 2012, 56(12): 4122-4137

[2] Cucker F, Zhou D X. Learning Theory: An Approximation Theory Viewpoint (Volume 24). Cambridge: Cambridge University Press, 2007

[3] Chen X R. Mathematical Statistics Course. Anhui: China University of Science and Technology Press, 2009

[4] Dempster A P, Laird N M, Rubin D B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 1977, 39(1): 1-38

[5] Ikeda S. Acceleration of the EM algorithm. Systems and Computers in Japan, 2015, 31(2): 10-18

[6] Kuroda M, Sakakihara M. Accelerating the convergence of the EM algorithm using the vector ε algorithm. Computational Statistics & Data Analysis, 2006, 51(3): 1549-1561

[7] Liu C, Rubin D B. The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika, 1994, 81(4): 633-648

[8] Li H. Statistical Learning Methods. Beijing: Tsinghua University Press, 2012: 156-169

[9] Li Y, Chen Y. Research on initialization on EM algorithm based on Gaussian mixture model. Journal of Applied Mathematics & Physics, 2018, 6(1): 11-17

[10] McLachlan G, Peel D. Finite Mixture Models. New York: Springer-Verlag, 2000

[11] Shi P, Tsai C. Regression model selection: a residual likelihood approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2002, 64(2): 237-252

[12] Tan Q H, Jiang H J, Ding Y M. Model selection method based on maximal information coefficient of residuals. Acta Mathematica Scientia, 2014, 34(2): 579-592

[13] Wu C. On the convergence properties of the EM algorithm. The Annals of Statistics, 1983, 11(1): 95-103

[14] Wei G G, Tanner M. A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. Journal of the American Statistical Association, 1990, 85(411): 699-704

[15] Xu L, Jordan M I. On convergence properties of the EM algorithm for Gaussian mixtures. Neural Computation, 1995, 8(1): 129-151

[16] Yu J, Chaomurilige C, Yang M S. On convergence and parameter selection of the EM and DA-EM algorithms for Gaussian mixtures. Pattern Recognition, 2018, 77(1): 188-203

[17] Yao W. A note on EM algorithm for mixture models. Statistics & Probability Letters, 2013, 83(2): 519-526

[18] Zhai D H, Yu J, Gao F, et al. K-means text clustering algorithm based on initial cluster centers selection according to maximum distance. Application Research of Computers, 2014, 31(3): 713-719
