Processing math: 2%

数学物理学报, 2018, 38(5): 963-969 doi:

论文

用概率距离研究非时齐马氏链的收敛性

朱志锋,1,2, 张绍义,1

Study on the Convergence of Nonhomogeneous Markov Chains with Probability Distance

Zhu Zhifeng,1,2, Zhang Shaoyi,1

通讯作者: 张绍义, E-mail:zhshaoyi@aliyun.com

收稿日期: 2017-04-23  

基金资助: 湖北工程学院科研基金;.  Z2014006

Received: 2017-04-23  

Fund supported: the Hubei Engineering University Scientific Research Fund.  Z2014006

作者简介 About authors

朱志锋,E-mail:376574200@qq.com , E-mail:376574200@qq.com

摘要

该文用概率距离和耦合方法研究了一般状态空间非时齐马氏链的收敛性,得到了一般状态空间非时齐马氏链收敛的一个条件.

关键词: 概率距离 ; 耦合方法 ; 非时齐马氏链 ; 收敛性

Abstract

In this paper, we study the convergence of nonhomogeneous Markov chains in general state space by using the probability distance and the coupling property, a condition for convergence of nonhomogeneous Markov chains in general state space is obtained.

Keywords: Probability distance ; Coupling ; Non homogeneous Markov chains ; Convergence

PDF (280KB) 元数据 多维度评价 相关文章 导出 EndNote| Ris| Bibtex  收藏本文

本文引用格式

朱志锋, 张绍义. 用概率距离研究非时齐马氏链的收敛性. 数学物理学报[J], 2018, 38(5): 963-969 doi:

Zhu Zhifeng, Zhang Shaoyi. Study on the Convergence of Nonhomogeneous Markov Chains with Probability Distance. Acta Mathematica Scientia[J], 2018, 38(5): 963-969 doi:

1 引言和结果

非时齐马氏链是马氏过程中较难研究的一种类型,如何判断马氏链的收敛性,特别是非时齐马氏链的收敛性一直是国内外许多学者研究的课题.Dobrushin-Isaacson-Madsen定理研究了有限状态空间非时齐马氏链的收敛性[4]参考文献的标记与引用本文对一般状态空间非时齐马氏链进行了研究,得到非时齐马氏链收敛的一个充分条件即定理1.1.定理A (Dobrushin-Isaacson-Madsen定理)是定理1.1的特例.

下面Dobrushin-Isaacson-Madsen定理给出了有限状态空间非时齐马氏链的收敛性的充分条件.

定理A (Dobrushin-Isaacson-Madsen定理[4])设X={X1,X2,,Xn,}是有限状态空间S上的非时齐马氏链,其第n步转移矩阵为Pn,如果他们满足

(A1)  Pn作为时齐的转移矩阵时有平稳分布πn;

(A2)  n

(A3)  或者满足Isaacson-Madsen条件:对于S上的任意概率分布向量\mu, \nu及正整数j,恒有

\|(\mu-\nu)P_j\cdots P_n\|\rightarrow0, n\rightarrow\infty;

或者满足Dobrushin条件:对于任意j,记P=(P_j\cdots P_n)=(P_{il}).P的收缩系数C(P)满足

C(P)\rightarrow0, n\rightarrow\infty.

其中

C(P)=\frac{1}{2}\sup\limits_{i, k}\sum\limits_l|P_{il}-P_{kl}|.

那么存在S上的概率测度\pi,使得

(1)  \|\pi_n-\pi\|\rightarrow0, n\rightarrow\infty;

(2)  无论什么初始分布{\mu_0}下,此非时齐的马链X在时刻n的分布{\mu_n}恒有极限

\|\mu_n-\pi\|\rightarrow0, n\rightarrow\infty.

其中\|v\|S上符号测度v的全变差范数.

本文的目的就是将定理A从有限状态空间S推广到一般波兰空间(X, \rho, {\cal B}(X))上去.设(X, \rho, {\cal B}(X))是波兰空间.为叙述方便,先引入几个记号.设gX上的可测函数, v{\cal B}(X)上的符号测度, K(X, {\cal B}(X))上的可测核.记

v(g):=\int{g{\rm d}v},

Kg(x):=\int{K(x, {\rm d}y)g(y)},

(vK)g=v(Kg):=\int v({\rm d}x)\int{K(x, {\rm d}y)g(y)}.

\|v\|:=\sup\{|v(g)|:|g|\leq 1\}.

定义1.1  设\mu_1, \mu_2{\cal B}(X)上的任意概率测度, \widetilde{\mu}{\cal B}(X)\times {\cal B}(X)上的概率测度,称\widetilde{\mu}\mu_1\mu_2的耦合,若满足边缘性

\widetilde{\mu}(A_1\times X)=\mu_1(A_1), A_1\in {\cal B}(X), \\\;\widetilde{\mu}(X\times A_2)=\mu_2(A_2), A_2\in {\cal B}(X).

K(\mu_1, \mu_2)\mu_1\mu_2的全体耦合.

定义1.2  设\mu_1, \mu_2是波兰空间(X, \rho, {\cal B}(X))上的任意概率测度,令

W(\mu_1, \mu_2)=\inf\bigg\{\bigg[\int\mu({\rm d}x, {\rm d}y)\rho(x, y)\bigg]:\mu\in K(\mu_1, \mu_2)\bigg\}.

称之为\mu_1\mu_2的最小概率距离.若存在\overline{\mu}\in K(\mu_1, \mu_2), 使

W(\mu_1, \mu_2)=\int\overline{\mu}({\rm d}x, {\rm d}y)\rho(x, y).

则称\overline{\mu}\mu_1\mu_2\rho最优耦合.\rho最优耦合总是存在的[1].

定理A介绍了有限状态空间上非时齐马氏链的收敛性,在实际应用中是不够的.下面我们将用概率距离和耦合方法研究一般状态空间上非时齐马氏链的收敛性问题.

\Phi=\{\Phi_0, \Phi_1, \cdots , \Phi_n, \cdots \}(X, \rho, {\cal B}(X))上的非时齐马氏链,其第n步转移概率P(\Phi_n\in A|\Phi_{n-1}=x\}:=P_n(x, A).进一步,若P_1=P_2=\cdots =P_n,则称\Phi为时齐马氏链.设{\cal P}(X)X上的全体概率测度, x_{0}X上某个给定的点(等价地任意给定的点x_{0}).

{\cal M}=\bigg\{\mu\in{\cal P}(X):\int\rho(x_{0}, x)\mu({\rm d}x)<\infty\bigg\}.

定理1.1  设(X, \rho, {\cal B}(X))是波兰空间,设\Phi=\{\Phi_0, \Phi_1, \cdots , \Phi_n, \cdots \}是取值X上的非时齐马氏链, \{P_n;n=1, 2, \cdots \}\Phi相应的转移概率核序列.如果满足

(i)  每个概率核{P_n}作为一个时齐马氏链有平稳分布\pi_n,且\pi_n\in {\cal M},满足

\sum\limits_{n}W(\pi_n, \pi_{n+1})<\infty;

(ii)  \int\rho(u, v)\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v)\leq\rho(x, y)c_n , c_nx, y无关且0\leq c_n\leq 1;

(iii)  \prod\limits_{k=j}^n c_n\rightarrow0, n\rightarrow\infty;

(iv)  \forall\mu\in{\cal M}, \mu P_n\in{\cal M}, n=1, 2, \cdots .

其中\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v){P_n}(x, {\rm d}u){P_n}(y, {\rm d}v)的耦合.那么,存在{\cal M}上的概率测度\pi,使得对{\cal M}上的概率测度{\mu_0},当n\rightarrow\infty时有

W({\mu_0P_1P_2\cdots P_n}, \pi)\rightarrow0, n\rightarrow\infty.

  定理中的条件(iv)一般说来是易于验证的,特别地\rho是有界距离时这个条件自动满足.

2 预备知识

引理2.1  设(X, \rho, {\cal B}(X))是波兰空间,则({\cal M}, W)是完备的度量空间.

证明见文献[1]第175页定理5.4.

引理2.2  设\{\mu_n:n\geq1\}\subset{\cal M},当\sum\limits_{n=1}^\infty W(\mu_n, \mu_{n+1})<\infty时, \{\mu_n:n\geq1\}\{{\cal M}, W\}上的Cauchy序列.

  对任意\varepsilon>0,因为\sum\limits_{n=1}^\infty W(\mu_n, \mu_{n+1})<\infty,故存在N,当n>N时,有

\sum\limits_{k=n}^\infty W(\mu_k, \mu_{k+1})<\varepsilon.

于是对任意正整数m,由三角不等式知

W(\mu_n, \mu_{n+m})\leq\sum\limits_{k=n}^{n+m}W(\mu_k, \mu_{k+1})\leq\sum\limits_{k=n}^\infty W(\mu_k, \mu_{k+1})<\varepsilon.

\{\mu_n:n\geq1\}是Cauchy序列.引理证毕.

引理2.3  设\mu_1, \mu_2是任意概率测度, P_{n}是波兰空间(X, \rho, {\cal B}(X))上的概率核,若

\int\rho(u, v)\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v)\leq\rho(x, y)c_n .
(2.1)

其中\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v){P_n}(x, {\rm d}u){P_n}(y, {\rm d}v)的耦合, c_nx, y无关且0\leq c_n\leq 1.则有

W({\mu_1P_n}, {\mu_2P_n})\leq W(\mu_1, \mu_2).
(2.2)

  由(2.1)式有

\int\rho(u, v)\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v)\leq\rho(x, y).
(2.3)

\overline{\mu}\mu_1\mu_2\rho最优耦合,在(2.3)式两边对\overline{\mu}求积分得

\int\overline{\mu}({\rm d}x, {\rm d}y)\int\rho(u, v)\widetilde{P}_n(x, y;{\rm d}u, {\rm d}v)\leq\int\overline{\mu}({\rm d}x, {\rm d}y)\rho(x, y).

\int\overline{\mu}\widetilde{P}_n({\rm d}u, {\rm d}v)\rho(u, v)\leq W(\mu_1, \mu_2).
(2.4)

下面证明

W(\mu_1P_n, \mu_2P_n)\leq\int\overline{\mu}\widetilde{P}_n({\rm d}u, {\rm d}v)\rho(u, v).
(2.5)

要证明(2.5)式成立,只需证明\overline{\mu}\widetilde{P}_n\mu_1P_n\mu_2P_n的耦合.下面验证边缘性条件

\begin{eqnarray*}\overline{\mu}\widetilde{P}_n(A\times X)&=&\int\overline{\mu}({\rm d}x, {\rm d}y)\widetilde{P}_n(x, y;A\times X)\\&=&\int\overline{\mu}({\rm d}x, {\rm d}y)P_n(x, A)\\&=&\int\mu_1({\rm d}x)P_n(x, A)\\&=&\mu_1P_n(A).\end{eqnarray*}

同理可证\overline{\mu}\widetilde{P}_n(X\times B)=\mu_2P_n(B).\overline{\mu}\widetilde{P}_n\mu_1P_n\mu_2P_n的耦合.(2.5)式获证.再结合(2.4)式即得(2.2)式.

3 主要结果的证明

为了证明定理1.1,下面分三步来证明.

(a)  对\forall\mu_1, \mu_2\in{\cal M},任意正整数j

W({\mu_1P_jP_{j+1}\cdots P_n}, {\mu_2P_jP_{j+1}\cdots P_n})\rightarrow0, n\rightarrow\infty;

(b)  存在概率测度\pi\in{\cal M}使得

W({\pi_n}, \pi)\rightarrow0, n\rightarrow\infty;
(1)

(c)  对初始分布{\mu_0}\in{\cal M}

W({\mu_0P_1P_2\cdots P_n}, \pi)\rightarrow0, n\rightarrow\infty.

定理1.1的证明  (a) 先证明\overline{\mu}\widetilde{P}_j\widetilde{P}_{j+1}\cdots \widetilde{P}_n\mu_1P_jP_{j+1}\cdots P_n\mu_2P_jP_{j+1}\cdots P_n的耦合.其中\overline{\mu}\mu_1\mu_2\rho最优耦合, \widetilde{P}_n(x, y;{\rm d}u, {\rm d}v){P_n}(x, {\rm d}u){P_n}(y, {\rm d}v)的耦合, \widetilde{P}_n(x, y;\cdot)(X\times X, {\cal B}(X)\times{\cal B}(X))上的概率核.为简单见,我们只证n=2的情形(一般情形可类似地证明).即证\overline{\mu}\widetilde{P}_1\widetilde{P}_2\mu_1P_1P_2\mu_2P_1P_2的耦合.由耦合的边缘性知:

\widetilde{P}_1(x, y;A\times X)=P_1(x, A), \widetilde{P}_1(x, y;X\times A)=P_1(y, A), x, y\in X, A\in {\cal B}(X);\\\;\widetilde{P}_2(x, y;A\times X)=P_2(x, A), \widetilde{P}_2(x, y;X\times A)=P_2(y, A), x, y\in X, A\in {\cal B}(X).

下面验证边缘性条件

\overline{\mu}\widetilde{P}_1\widetilde{P}_2(A\times X)=\mu_1P_1P_2(A) , x, y\in X, A\in {\cal B}(X);
(3.1)

\overline{\mu}\widetilde{P}_1\widetilde{P}_2(X\times A)=\mu_2P_1P_2(A) , x, y\in X, A\in {\cal B}(X).
(3.2)

注意到\mu_1P_1P_2(A)=\int\!\!\!\int\mu_1({\rm d}x)P_1(x, {\rm d}u)P_2(u, A),而

\begin{eqnarray*}\overline{\mu}\widetilde{P}_1\widetilde{P}_2(A\times X)&=&\int\!\!\!\int\overline{\mu}({\rm d}x, {\rm d}y)\widetilde{P}_1(x, y;{\rm d}u, {\rm d}v)\widetilde{P}_2(u, v;A\times X) \\&=&\int\!\!\!\int\overline{\mu}({\rm d}x, {\rm d}y)\widetilde{P}_1(x, y;{\rm d}u, {\rm d}v)P_2(u, A) \\&=&\int\overline{\mu}({\rm d}x, {\rm d}y)\int P_1(x, {\rm d}u)P_2(u, A) \\&=&\int\mu_1({\rm d}x)\int P_1(x, {\rm d}u)P_2(u, A) \\&=&\int\!\!\!\int\mu_1({\rm d}x) P_1(x, {\rm d}u)P_2(u, A) \\&=&\mu_1P_1P_2(A).\end{eqnarray*}

即(3.1)式成立,类似地,可证(3.2)式成立.

于是有

\begin{eqnarray*}&&W({\mu_1P_jP_{j+1}\cdots P_n}, {\mu_2P_jP_{j+1}\cdots P_n})\\&=&\inf\bigg\{\bigg[\int\overline{\mu}\widetilde{P}_j\widetilde{P}_{j+1}\cdots \widetilde{P}_n({\rm d}u, {\rm d}v)\rho(u, v)\bigg]:\overline{\mu}\in K(\mu_1, \mu_2)\bigg\}\\&\leq&\int\overline{\mu}\widetilde{P}_j\widetilde{P}_{j+1}\cdots \widetilde{P}_n({\rm d}u, {\rm d}v)\rho(u, v).\end{eqnarray*}

反复利用条件(ii)得

\begin{eqnarray*}\int\overline{\mu}\widetilde{P}_j\widetilde{P}_{j+1}\cdots \widetilde{P}_n({\rm d}u, {\rm d}v)\rho(u, v)&\leq&\int\overline{\mu}\widetilde{P}_j\widetilde{P}_{j+1}\cdots \widetilde{P}_{n-1}({\rm d}u, {\rm d}v)\rho(u, v)c_n\\&\leq&\cdots \\&\leq&\int\overline{\mu}({\rm d}x, {\rm d}y)\rho(u, v)\prod _{k=j}^{n}c_n\\&=& W(\mu_1, \mu_2)\prod _{k=j}^{n}c_n.\end{eqnarray*}

由于\mu_1, \mu_2\in{\cal M},从而有

\begin{eqnarray*}W(\mu_1, \mu_2)&=&\int\rho(x, y)\overline{\mu}({\rm d}x, {\rm d}y)\\&\leq&\int\rho(x, x_0)\overline{\mu}({\rm d}x, {\rm d}y)+\int\rho(x_0, y)\overline{\mu}({\rm d}x, {\rm d}y)\\&=&\int\rho(x, x_0)\mu_1({\rm d}x)+\int\rho(x_0, y)\mu_2({\rm d}y)\\&<& \infty.\end{eqnarray*}

结合条件(iii) \prod\limits_{k=j}^n c_n\rightarrow0, n\rightarrow\infty证得(a)成立.

(b) 由条件(i)知

\sum\limits_{n}W(\pi_n, \pi_{n+1})<\infty .

由引理(2.1)\{{\cal M}, W\}是完备的度量空间.又由引理(2.2)\{\pi_n:n\geq1\}\{{\cal M}, W\}上的Cauchy序列.于是存在\pi\in{\cal M}, 使得W(\pi_n, \pi)\rightarrow0, n\rightarrow\infty.结论(b)得证.

(c)  为证(c),即要证对\forall\varepsilon>0, \exists N\in Z^{+}, s.t.当n>N时有

W({\mu_0P_1P_2\cdots P_n}, \pi)<\varepsilon.
(3.3)

对任意的正整数j\leq n, 利用三角不等式有

W({\mu_0P_1P_2\cdots P_n}, \pi)<W({\mu_0P_1P_2\cdots P_n}, {\pi P_jP_{j+1}\cdots P_n})+W({\pi P_jP_{j+1}\cdots P_n}, \pi).
(3.4)

对上式第二项反复利用三角不等式并注意条件\pi_mP_m=\pi_m推得

\begin{eqnarray*}&&W({\pi P_jP_{j+1}\cdots P_n}, \pi)\\&\leq& W({\pi P_jP_{j+1}\cdots P_n}, {\pi_j P_jP_{j+1}\cdots P_n})+W({\pi_j P_jP_{j+1}\cdots P_n}, \pi)\\&\leq &W(\pi, \pi_j)+W({\pi_j P_jP_{j+1}\cdots P_n}, \pi)\\&=&W(\pi, \pi_j)+W({\pi_j P_{j+1}\cdots P_n}, \pi) \qquad (\pi_j P_j=\pi_j)\\&\leq& W(\pi, \pi_j)+W({\pi_j P_{j+1}\cdots P_n}, {\pi_{j+1} P_{j+1}\cdots P_n})+W({\pi_{j+1}P_{j+1}\cdots P_n}, \pi) \\&\leq &W(\pi, \pi_j)+W({\pi_j}, {\pi_{j+1}})+W({\pi_{j+1}P_{j+2}\cdots P_n}, \pi) \\&\leq&\cdots \\&\leq& W(\pi, \pi_j)+\sum\limits_{k=1}^{n-j} W({\pi_{j+k-1}}, {\pi_{j+k}})+W({\pi_n}, \pi).\end{eqnarray*}

从而有

W({\pi P_jP_{j+1}\cdots P_n}, \pi)\leq W(\pi, \pi_j)+\sum\limits_{k=1}^{n-j} W({\pi_{j+k-1}}, {\pi_{j+k}})+W({\pi_n}, \pi).
(3.5)

由(b)知,当j\rightarrow\infty时, W(\pi, \pi_j)\rightarrow0, 所以\exists M_1\in Z^{+}, s.t.当j>M_1时,有

W(\pi, \pi_j)<\frac{\varepsilon}{4}.
(3.6)

同理,当n\rightarrow\inftyW({\pi_n}, \pi)\rightarrow0, 所以\exists N_1\in Z^{+}, s.t.当n>N_1时,有

W({\pi_n}, \pi)<\frac{\varepsilon}{4}.
(3.7)

由条件\sum\limits_{n}W(\pi_n, \pi_{n+1})<\infty 知,当j\rightarrow\infty时, \sum\limits_{n=j}^{\infty}W(\pi_n, \pi_{n+1})\rightarrow0, \exists M_2\in Z^{+}, s.t.当j>M_2时,有

\sum\limits_{k=1}^{n-j} W({\pi_{j+k-1}}, {\pi_{j+k}})\leq\sum\limits_{n=j}^{\infty}W(\pi_n, \pi_{n+1})<\frac{\varepsilon}{4}.
(3.8)

{\mu_0}\in{\cal M}和条件(iv)易证得\mu_0P_1P_2\cdots P_{j-1}\in{\cal M}, 由(a)知

\begin{eqnarray*}&&W({\mu_0P_1P_2\cdots P_n}, {\pi P_jP_{j+1}\cdots P_n})\\&=&W((\mu_0P_1P_2\cdots P_{j-1}){P_jP_{j+1}\cdots P_n}, {\pi P_jP_{j+1}\cdots P_n})\rightarrow0, \;\;\;\; n\rightarrow\infty. \end{eqnarray*}

所以\exists N_2\in Z^{+}, s.t.当n>N_2时,有

W({\mu_0P_1P_2\cdots P_n}, {\pi P_jP_{j+1}\cdot \cdot \cdot P_n})<\frac{\varepsilon}{4}.
(3.9)

最后,取N=\max\{N_1, N_2\}, M=\max\{M_1, M_2\}, (3.6)(3.8)式中取j=M, 则当n>N时, (3.6)-(3.9)都成立.把(3.6)-(3.8)式代入(3.5)式得

W({\pi_{M}P_{M}\cdots P_{n}}, \pi)<\frac{3\varepsilon}{4}.
(3.10)

再把(3.9)和(3.10)式代入(3.4)式得到(3.3)式成立.结论(c)得证.定理证毕.

参考文献

Chen M F . From Markov Chains to Nonequilibrium Particle Systems. SingaPore: World Scientific, 2004

[本文引用: 2]

Meyn S P , Tweedie R L . Markov Chains and Stochastic Stability. London: Springer-Verlag, 1992

Bowerman B , David H T , Isaacson D .

The convergence of Cesaro averagesfor certain non-stationary Markov chains

Stachastic Processes and Their Applications, 1977, 5: 221- 230

DOI:10.1016/0304-4149(77)90032-1     

龚光鲁, 钱敏平. 应用随机过程教程及在算法和智能计算中的随机模型. 北京: 清华大学出版社, 2004

[本文引用: 2]

Gong Guanglu , Qian Minping . A Course in Appied Stochastic Processes and Stochastic Models in Algorithms and Intelligence Computation. Beijing: Tsinghua University Press, 2004

[本文引用: 2]

张绍义.

转移概率的可测耦合与概率距离

数学年刊A辑, 1995, 6 (16): 769- 775

URL    

Zhang Shaoyi .

The measurable coupling and probability distance of transition probability

Chinese Annals of Mathematics, 1995, 6 (16): 769- 775

URL    

张绍义, 徐侃.

转移概率最优可测耦合的存在性

数学学报, 1997, 1 (40): 5- 13

URL    

Zhang Shaoyi , Xu Kan .

The existence of optimal measurable coupling of transition probability

Acta Mathematica Sinica, 1997, 1 (40): 5- 13

URL    

Zhang S Y .

Regularity and existence of invariant measures for jump processes

Acta Mathematica Sinica, 2005, 48: 785- 788

URL    

Zhang S Y .

Existence of the optimal measurable coupling and ergodicity for markov processes

Science in China (Series A), 1999, 42: 58- 67

DOI:10.1007/BF02872050     

Andradottir S .

A method for discrete stochastic optimization

Management Science, 1995, 12 (41): 1946- 1961

URL    

Gutjahr W J , Pflug G C .

Simulated annealing for noisy cost functions

Journal of Global Optimization, 1996, 8: 1- 13

URL    

Ahmed M A , Alkhamis T .

M-Simulation-based optimization using simulated an. nealing with ranking and selection

Computers Operations Research, 2002, 29 (4): 387- 402

DOI:10.1016/S0305-0548(00)00073-3     

/