数学物理学报, 2023, 43(1): 291-304

具有线性化技术的三块非凸不可分优化问题Bregman ADMM收敛性分析

刘富勤, 彭建文,*, 罗洪林

重庆师范大学数学科学学院 重庆401331

Convergence Analysis of Bregman ADMM for Three-Block Nonconvex Indivisible Optimization Problems with Linearization Technique

Liu Fuqin, Peng Jianwen,*, Luo Honglin

School of Mathematical Sciences, Chongqing Normal University, Chongqing 401331

Corresponding author: *Peng Jianwen, E-mail: jwpeng168@hotmail.com

收稿日期: 2022-03-31   修回日期: 2022-08-5  

基金资助: 国家自然科学基金重大项目(11991024)
国家自然科学基金面上项目(12271071)
重庆英才$\cdot$ 创新创业领军人才$\cdot$ 创新创业示范团队项目(CQYC20210309536)
重庆英才计划包干制项目(cstc2022ycjh-bgzxm0147)
重庆市高校创新研究群体项目(CXQT20014)
重庆市自然科学基金项目(cstc2021jcyj-msxmX0300)

Received: 2022-03-31   Revised: 2022-08-5  

Fund supported: The NSFC(11991024)
The NSFC(12271071)
Team Project of Innovation Leading Talent in Chongqing(CQYC20210309536)
Contract System Project of Chongqing Talent Plan(cstc2022ycjh-bgzxm0147)
Chongqing University Innovation Research Group Project(CXQT20014)
Chongqing Natural Science Foundation Project(cstc2021jcyj-msxmX0300)

摘要

交替方向乘子法是求解两块可分离凸优化问题的有效方法, 但是对于三块不可分的非凸优化问题的交替方向乘子法的收敛性可能无法保证. 该文主要研究的是用线性化广义Bregman交替方向乘子法(L-G-BADMM)求解目标函数是三块不可分的非凸极小化问题的收敛性分析. 在适当假设条件下, 对算法中子问题进行求解并构建满足Kurdyka-Lojasiewicz性质的效益函数, 经过理论证明可以得到该算法的收敛性.

关键词: Bregman散度; 交替方向乘子法; Kurdyka-Łojasiewicz性质; 线性化

Abstract

The alternating direction method of multipliers (ADMM) is an effective method for solving two-block separable convex optimization problems, but its convergence may fail for three-block nonseparable nonconvex optimization problems. This paper studies the convergence of a linearized generalized Bregman alternating direction method of multipliers (L-G-BADMM) for solving nonconvex minimization problems whose objective function consists of three nonseparable blocks. Under appropriate assumptions, we solve the subproblems of the algorithm and construct a benefit function satisfying the Kurdyka-Łojasiewicz property; the convergence of the algorithm is then established theoretically.

Keywords: Bregman divergence; ADMM; Kurdyka-Łojasiewicz property; Linearization


Cite this article

刘富勤, 彭建文, 罗洪林. 具有线性化技术的三块非凸不可分优化问题Bregman ADMM收敛性分析[J]. 数学物理学报, 2023, 43(1): 291-304

Liu Fuqin, Peng Jianwen, Luo Honglin. Convergence Analysis of Bregman ADMM for Three-Block Nonconvex Indivisible Optimization Problems with Linearization Technique[J]. Acta Mathematica Scientia, 2023, 43(1): 291-304

1 Introduction

Glowinski and Marroco [12] and Gabay and Mercier [11] proposed the ADMM algorithm for solving two-block separable convex optimization problems, and its convergence theory is relatively mature; see [7,23]. Under certain assumptions, [8,10,17] analyzed the convergence of ADMM for convex optimization problems with a three-block separable structure. In recent years, the nonconvex setting has become a major research focus. Convergence of ADMM in the nonconvex case is difficult to obtain, since the Fejér monotonicity of the iterates generated by the algorithm no longer holds without convexity. To deal with such problems, Wang et al. [20] considered the two-block separable nonconvex optimization problem

$\begin{equation} \min \limits_{x \in X,y \in Y} f(x) + g(y)~~{\rm s.t.~}Ax + By = c, \end{equation}$

where $A \in {\Bbb R^{m \times {n_1}}},~B \in {\Bbb R^{m \times {n_2}}},~f:{\Bbb R^{{n_1}}} \to \Bbb R\bigcup {\left\{ { + \infty } \right\}},~g: {\Bbb R^{{n_2}}} \to \Bbb R,~X \subset {\Bbb R^{{n_1}}}, ~Y \subset {\Bbb R^{{n_2}}},~c \in {\Bbb R^m}$, and proposed the Bregman ADMM algorithm (BADMM for short)

$\begin{equation} \left\{ \begin{array}{ll} {x_{k + 1}} = \mathop {\arg \min }\limits_{x \in X} {L_\beta }(x,{y_k},{\lambda _k}) + {\Delta_\phi }(x,{x_k}), \\ {y_{k + 1}} = \mathop {\arg \min }\limits_{y \in Y} {L_\beta }({x_{k + 1}},y,{\lambda _k}) + {\Delta_\psi }(y,{y_k}), \\ {\lambda _{k + 1}} = {\lambda _k} + \beta (A{x_{k + 1}} + B{y_{k + 1}} - c). \\ \end{array} \right. \end{equation}$

where ${L_\beta }(x,y,\lambda ) = f(x) + g(y) + \langle \lambda,Ax + By - c \rangle + \frac{\beta }{2}\left\| {Ax + By - c} \right\|_2^2$ is the augmented Lagrangian function, $\lambda $ is the multiplier, $\beta > 0$ is the penalty parameter, and ${\Delta _\psi },{\Delta _\phi }$ are the Bregman divergences induced by the functions $\psi,\phi $, respectively. It is shown in [7] not only that algorithm (1.2) converges to the set of stationary solutions provided that $\beta$ is sufficiently large, but also that, once the constructed benefit function satisfies the K-L property, the sequence generated by algorithm (1.2) converges to a stationary point of the augmented Lagrangian function. This is an effective approach to nonconvex problems and has been developed considerably [6-7,13-15,19,24].

We now consider the following three-block nonconvex optimization problem with nonseparable structure

$\begin{equation} \mathop {\min }\limits_{{x_i} \in X_i,y \in Y} {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y)~~{\rm s.t.~}{A_1}{x_1} + {A_2}{x_2} + By = c, \end{equation}$

where ${A_i} \in {\Bbb R^{m \times {n_i}}},~B \in {\Bbb R^{m \times n}},~X_i \subset {\Bbb R^{{n_i}}},~Y \subset {\Bbb R^n},~c \in {\Bbb R^m}$, ${f_i}:{\Bbb R^{{n_i}}} \to \Bbb R\bigcup {\left\{ { + \infty } \right\}}$ $(i = 1,2)$ are nonconvex differentiable functions, and $g: {\Bbb R^{{n_1} + {n_2} + n}} \to \Bbb R$ is a differentiable function. The augmented Lagrangian function of problem (1.3) is

$\begin{eqnarray*} {L_\beta }({x_1},{x_2},y,\lambda ) &=& {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y) + \langle \lambda,{A_1}{x_1} + {A_2}{x_2} + By - c \rangle \\ &&+ \frac{\beta }{2}\left\| {{A_1}{x_1} + {A_2}{x_2} + By - c} \right\|_2^2. \end{eqnarray*}$

Wang et al. [21] proposed replacing the quadratic penalty term in the augmented Lagrangian function by a Bregman divergence, which yields the generalized BADMM algorithm. Inspired by [21], for problem (1.3) we obtain the new augmented Lagrangian function

$\begin{eqnarray*}{\hat L_\beta }({x_1},{x_2},y,\lambda )& =& {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y) + \langle \lambda,{A_1}{x_1} + {A_2}{x_2} + By - c \rangle \\ && + \beta {\Delta _\psi }({A_1}{x_1} + {A_2}{x_2} + By,c). \end{eqnarray*}$

Feng et al. [10] introduced a technique that linearizes the quadratic penalty term of the subproblems in the algorithm, and [21] illustrated by examples that convergence can still be obtained after linearizing different parts of the algorithm. The inexact ADMM [3] and the generalized ADMM [9] address the convergence issue by linearizing functions and adding extra quadratic terms. Chao et al. [6] proposed a linearized BADMM algorithm for three-block nonseparable nonconvex optimization problems.

In this paper, we propose a generalized BADMM algorithm with a linearization technique (L-G-BADMM) for problem (1.3):

$\begin{equation} \left\{ \begin{array}{ll} x_1^{k + 1} =& \mathop {\arg \min }\limits_{{x_1} \in {X_1}} {f_1}({x_1}) + {({x_1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}{x_1} + {A_2}x_2^k + B{y^k} - c \rangle\\ & + \beta {\Delta _\psi }({A_1}{x_1} + {A_2}x_2^k + B{y^k},c) + {\Delta _{{\varphi _1}}}({x_1},x_1^k), \\ x_2^{k + 1} =&\mathop {\arg \min }\limits_{{x_2} \in {X_2}} {f_2}({x_2}) + {({x_2} - x_2^k)^{\rm T}}{\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) + \langle {\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{} + B{y^k} - c \rangle \\ & + \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{} + B{y^k},c) + {\Delta _{{\varphi _2}}}({x_2},x_2^k) \\ {y^{k + 1}} =&\mathop {\arg \min }\limits_{y \in Y} {(y - {y^k})^{\rm T}}{\nabla _y}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + By - c \rangle \\ & + \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + By,c) + {\Delta _\phi }(y,{y^k}) \\ {\lambda ^{k + 1}}= & {\lambda ^k} + \beta ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c). \end{array} \right. \end{equation} $

where ${\Delta_\psi },~{\Delta_{{\varphi _1}}},~{\Delta_{{\varphi _2}}},~{\Delta_\phi }$ are the Bregman divergences induced by the functions $\psi,{\varphi _1},{\varphi _2},\phi $, respectively. The L-BADMM algorithm proposed in [6] uses the squared two-norm as the quadratic penalty term, whereas the algorithm of this paper uses a Bregman divergence instead. By Definition 2.4, the squared two-norm is a special case of the Bregman divergence, so this setting is more general than that of the classical ADMM. To date, in the nonconvex case there has been no convergence proof in which the quadratic-type penalty term is not the squared two-norm, and this is the gap that the present paper addresses.
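To make the iteration scheme (1.4) concrete, the following Python sketch (added for illustration; it is not part of the original paper) instantiates L-G-BADMM under the simplifying assumption that all Bregman kernels equal $\frac{1}{2}\|\cdot\|^2$, so that every Bregman divergence reduces to $\frac{1}{2}\|u-v\|^2$, and each subproblem is solved inexactly by a generic smooth solver (scipy.optimize.minimize). The functions $f_1$, $f_2$, $g$ and the data $A_1$, $A_2$, $B$, $c$ are toy placeholders.

```python
# Illustrative sketch of the L-G-BADMM iteration (1.4); not from the paper.
# Assumptions: quadratic kernels psi = varphi_1 = varphi_2 = phi = 0.5*||.||^2,
# so every Bregman divergence below is 0.5*||u - v||^2, and each subproblem is
# solved inexactly by a generic smooth solver.  All problem data are toy data.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n1, n2, n = 4, 3, 3, 4
A1 = np.linalg.qr(rng.standard_normal((m, n1)))[0]      # A1^T A1 = I (Assumption 1(v))
A2 = np.linalg.qr(rng.standard_normal((m, n2)))[0]      # A2^T A2 = I
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)
beta = 10.0

f1 = lambda x: np.sum(x**2 - np.cos(x))                 # smooth nonconvex blocks (toy)
f2 = lambda x: np.sum(x**2 - np.cos(x))
g = lambda x1, x2, y: 0.5*np.sum((x1 + x2)**2) + 0.5*np.sum(y**2)   # coupling term

def num_grad(fun, z, eps=1e-6):                         # numerical gradient of the coupling term
    e = np.eye(z.size)
    return np.array([(fun(z + eps*e[i]) - fun(z - eps*e[i])) / (2*eps) for i in range(z.size)])

breg = lambda u, v: 0.5*np.sum((u - v)**2)              # Bregman divergence of 0.5*||.||^2

x1, x2, y, lam = np.zeros(n1), np.zeros(n2), np.zeros(n), np.zeros(m)
for k in range(50):
    x1k, x2k, yk = x1.copy(), x2.copy(), y.copy()       # snapshot (x1^k, x2^k, y^k)
    g1 = num_grad(lambda z: g(z, x2k, yk), x1k)         # gradient used in the linearization
    x1 = minimize(lambda z: f1(z) + (z - x1k) @ g1 + lam @ (A1@z + A2@x2k + B@yk - c)
                  + beta*breg(A1@z + A2@x2k + B@yk, c) + breg(z, x1k), x1k).x
    g2 = num_grad(lambda z: g(x1k, z, yk), x2k)
    x2 = minimize(lambda z: f2(z) + (z - x2k) @ g2 + lam @ (A1@x1 + A2@z + B@yk - c)
                  + beta*breg(A1@x1 + A2@z + B@yk, c) + breg(z, x2k), x2k).x
    gy = num_grad(lambda z: g(x1k, x2k, z), yk)
    y = minimize(lambda z: (z - yk) @ gy + lam @ (A1@x1 + A2@x2 + B@z - c)
                 + beta*breg(A1@x1 + A2@x2 + B@z, c) + breg(z, yk), yk).x
    lam = lam + beta*(A1@x1 + A2@x2 + B@y - c)          # multiplier update

print("constraint residual:", np.linalg.norm(A1@x1 + A2@x2 + B@y - c))
```

With other strongly convex kernels $\psi,\varphi_1,\varphi_2,\phi$, only the function breg and the proximal terms in each subproblem would change; the overall Gauss-Seidel structure of (1.4) stays the same.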

2 Preliminaries

Definition 2.1[20] Let $f:{\Bbb R^n} \to \Bbb R\bigcup{\left\{ + \infty \right\} }$ be a lower semicontinuous function.

(1) Given $x \in {\rm dom}f$, the Fréchet subdifferential of $f$ at $x$, denoted by $\hat \partial f(x)$, is defined as

$\hat \partial f(x) =\bigg \{ \mu \in {\Bbb R^n}:\mathop {\lim \inf }\limits_{y \to x,y \ne x} \frac{{f(y) - f(x) - \langle \mu,y - x\rangle }}{{\parallel y - x\parallel }} \ge 0\bigg\}. $

(2) The limiting subdifferential (or simply the subdifferential) of $f$ at $x$, denoted by $\partial f(x)$, is defined as

$\partial f(x) = \{ \mu \in {\Bbb R^n}:\exists {x^k} \to x,f({x^k}) \to f(x),{\mu ^k} \in \hat \partial f({x^k}),{\mu ^k} \to \mu,k \to \infty \}. $

(3) A point ${x_0}$ is called a stationary point (or critical point) of $f$ if $0 \in \partial f({x_0})$.

Definition 2.2[20] (Saddle point theorem) ${w^ * } = ({x^ * }, {y^ * }, {\lambda ^ * })$ is a stationary point of the Lagrangian function $L$ of problem (1.1) if it satisfies

$\left\{ \begin{array}{ll} A{x^*} + B{y^*} = c,\\ - \nabla f({x^*}) = {A^T}{\lambda ^*},\\ - {B^T}{\lambda ^*} \in \partial g({y^*}). \end{array} \right.$

Definition 2.3[2] Let $f: {\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be proper and lower semicontinuous. If there exist $\eta > 0$, $\delta > 0$ and $\varphi \in {A_n}$ such that for all

$x \in O({x_0},\delta ) \cap \left\{ {x:f({x_0}) < f(x) < f({x_0}) + \eta } \right\},$

we have

$\varphi '(f(x) - f({x_0})){\rm dist}(0,\partial f(x)) \ge 1,$

then $f$ is said to satisfy the Kurdyka-Łojasiewicz property (K-L property) at ${x_0}$. Here $\varphi$ denotes a function from $\left[ {0,\eta } \right]$ to ${\Bbb R^ + }$, ${\rm dist}(0,\partial f(x)) = \inf \left\{ {\left\| y \right\|:y \in \partial f(x)} \right\}$, and $\varphi$ satisfies

(a) $\varphi $ is continuous on $\left[ {0,\eta } \right]$,

(b) $\varphi $ is smooth and concave on $\left[ {0,\eta } \right]$,

(c) $\varphi (0) = 0,\ \varphi '(x) > 0,\ \forall x \in (0,\eta )$.

Lemma 2.1[6] (Uniformized K-L property) Let $\Omega$ be a compact set and let $f:{\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be a proper lower semicontinuous function. If $f$ is constant, equal to $f(x_0)$, on $\Omega$ and satisfies the K-L property at every point of $\Omega$, then there exist $\eta > 0, \delta > 0, \varphi \in {A_n}$ such that for all

$x \in O({x_0},\delta ) \cap \left\{ {x:f({x_0}) < f(x) < f({x_0}) + \eta } \right\},$

we have

$\varphi '(f(x) - f({x_0}))\,{\rm dist}(0,\partial f(x)) \ge 1.$

Definition 2.4[4] For a convex differentiable function $\Phi $, the Bregman divergence is defined as

${\Delta _\Phi }(x,y) = \Phi (x) - \Phi (y) - \left\langle {\nabla \Phi (y),x - y} \right\rangle.$

In particular, if $\Phi (x) = \left\| x \right\|_Q^2 = {x^{\rm T}}Qx$, then ${\Delta _\Phi }(x,y) = \left\| {x - y} \right\|_Q^2$, where $Q$ is a symmetric positive semidefinite matrix.
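As a quick numerical check of this special case (added here for illustration, not taken from the paper), the snippet below verifies that the Bregman divergence induced by $\Phi (x) = {x^{\rm T}}Qx$ coincides with $(x-y)^{\rm T}Q(x-y)$ for a randomly generated symmetric positive semidefinite $Q$.

```python
# Numerical check of the quadratic special case of Definition 2.4.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
Q = M.T @ M                                  # symmetric positive semidefinite

Phi = lambda x: x @ Q @ x                    # Phi(x) = x^T Q x
grad_Phi = lambda y: 2 * Q @ y               # gradient of Phi at y (Q symmetric)
bregman = lambda x, y: Phi(x) - Phi(y) - grad_Phi(y) @ (x - y)

x, y = rng.standard_normal(3), rng.standard_normal(3)
print(np.isclose(bregman(x, y), (x - y) @ Q @ (x - y)))   # True
```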

Definition 2.5 (Strong convexity) Let $f:{\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be a proper function. $f$ is said to be strongly convex with modulus $\delta$ if there exists $\delta > 0$ such that for all $x, y$,

$f(\lambda x + (1 - \lambda )y) \le \lambda f(x) + (1 - \lambda )f(y) - \frac{{\delta \lambda (1 - \lambda )}}{2}\left\| {x - y} \right\|_{}^2,\forall \lambda \in (0,1).$

Lemma 2.2[1] Let $\Phi$ be a differentiable convex function. Then

(i) (Nonnegativity) ${\Delta _\Phi }(x,y) \ge 0,\ {\Delta _\Phi }(x,x) = 0$.

(ii) (Convexity) ${\Delta _\Phi }(x,y)$ is convex in $x$.

(iii) (Strong convexity) If $\Phi $ is strongly convex with modulus $\delta $, then ${\Delta _\Phi }(x,y) \ge \frac{\delta }{2}{\left\| {x - y} \right\|^2}$.

Lemma 2.3[6] (Descent lemma) Let $g:{\Bbb R^n} \to \Bbb R$ be differentiable with a Lipschitz continuous gradient with constant ${l_g}$. Then for any $x,y \in {\Bbb R^n}$,

$\left| {g(y) - g(x) - \left\langle {\nabla g(x),y - x} \right\rangle } \right| \le \frac{{{l_g}}}{2}{\left\| {y - x} \right\|^2}.$
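As a quick numerical illustration (added here, not part of the original text), take the toy function $g(x)=\sum_i \cos(x_i)$, whose gradient is Lipschitz continuous with constant $l_g = 1$; the inequality of Lemma 2.3 can then be verified directly:

```python
# Numerical illustration of the descent lemma with g(x) = sum(cos(x)), l_g = 1.
import numpy as np

g = lambda x: np.sum(np.cos(x))
grad_g = lambda x: -np.sin(x)
l_g = 1.0

rng = np.random.default_rng(2)
x, y = rng.standard_normal(5), rng.standard_normal(5)
lhs = abs(g(y) - g(x) - grad_g(x) @ (y - x))
rhs = 0.5 * l_g * np.linalg.norm(y - x)**2
print(lhs <= rhs)                             # True
```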

Assumption 1 (i) $\nabla g$, $\nabla {f_1}$, $\nabla {f_2}$ are Lipschitz continuous with constants ${l_g}$, ${l_{{f_1}}}$, ${l_{{f_2}}}$, respectively;

(ii) there exists $\mu > 0$ such that ${B^{\rm T}}B \succ \frac{1}{\mu }I$;

(iii) ${\nabla _\psi }, {\nabla _{{\varphi _1}}},{\nabla _{{\varphi _2}}},{\nabla _\phi }$ are Lipschitz continuous with constants ${l_\psi },{l_{{\varphi _1}}},{l_{{\varphi _2}}},{l_\phi }$, respectively;

(iv) $\psi, {\varphi _1}, {\varphi _2},\phi $ are strongly convex with moduli ${\delta _\psi }, {\delta _{{\varphi _1}}},{ \delta _{{\varphi _2}}}, {\delta _\phi }$, respectively, and ${\delta _{{\varphi _1}}},{\delta _{{\varphi _2}}},{\delta _\phi }$ are all greater than ${l_g}$;

(v) ${A_1},{A_2}$ have orthonormal columns, i.e. ${A_1}^{\rm T}{A_1} = I,{A_2}^{\rm T}{A_2} = I$;

(vi)

$ \frac{{{\delta _{{\varphi _1}}} - l_g^{}}}{2} > \frac{{4l_g^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu,\quad\frac{{{\delta _{{\varphi _2}}} - l_g^{}}}{2} > \frac{{4l_g^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu, $
$ \frac{{{\delta _\phi } - l_g^{}}}{2} > \frac{{4l_g^2\mu + 8l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu. $

3 Convergence Analysis

引理3.1$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的点列, 若假设1(ii)成立, 对任意的$k \in N$, 我们有

$\begin{matrix} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} & \le&4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ & &+ (4l_\phi ^2\mu + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} + 4l_g^2\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ &&+4l_g^2\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + (4l_g^2\mu + 4l_\phi ^2\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$

Proof By Assumption 1(ii), we obtain

$\begin{equation} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} \le \mu {\left\| {B({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2}. \end{equation}$

Taking the optimality conditions of the subproblems in (1.4) and rearranging, we obtain

$\begin{matrix} {B^{\rm T}}{\lambda ^k}&=& {\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}}) - {\nabla _y}g(x_1^k,x_2^k,{y^k}) \\ &&+ \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})), \end{matrix}$

$\begin{matrix} {B^{\rm T}}{\lambda ^{k + 1}} &= &{\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^{k + 2}}) - {\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) \\ & &+ \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}})), \end{matrix}$

It follows that

$\begin{matrix} && {\left\| {{B^{\rm T}}({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2} \\ &=&\left\| {{\nabla _y}g(x_1^k,x_2^k,{y^k}) - {\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}})} \right. + {\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^{k + 2}})+ {\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^k}) \\ &&{\left. { + \beta {B^{\rm T}}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}))} \right\|^2} \\ &\le & 4{\left[ {\left\| {{\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - {\nabla _y}g(x_1^k,x_2^k,{y^k})} \right\|} \right.^2} \\ &&+ {\left\| {{\nabla _\phi }({y^{k + 2}}) - {\nabla _\phi }({y^{k + 1}})} \right\|^2} + {\left\| {{\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^k})} \right\|^2} \\ &&+ {\beta ^2}{{\left\| {{B^{\rm T}}} \right\|}^2}\left. {{{\left\| {({\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})} \right\|}^2}} \right] \\ &\le & 4l_g^2{\left\| {(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - (x_1^k,x_2^k,{y^k})} \right\|^2} + 4l_\phi ^2{\left\| {({y^{k + 2}} - {y^{k + 1}})} \right\|^2} + 4l_\phi ^2{\left\| {({y^{k + 1}} - {y^k})} \right\|^2} \\ && + 4{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}l_\psi ^2{\left\| {({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}) - ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})} \right\|^2}. \end{matrix}$

The last inequality follows from the Lipschitz continuity of ${\nabla _g},{\nabla _\phi },{\nabla _\psi }$. Substituting the above into (3.2), we obtain

$\begin{matrix} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} &\le & \mu {\left\| {B({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2} \\ &\le & 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ && + 4l_g^2\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + (4l_\phi ^2\mu + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} \\ &&+ 4l_g^2\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + (4l_g^2\mu + 4l_\phi ^2\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$

The conclusion of the lemma follows immediately. The proof is complete.

Remark 3.1 By the triangle inequality for norms, $\left\| {A + B} \right\| \le \left\| A \right\| + \left\| B \right\|$; squaring both sides gives

${\left\| {A + B} \right\|^2} \le {\left\| A \right\|^2} + {\left\| B \right\|^2} + 2\left\| A \right\|\left\| B \right\|. $

Combining this with the elementary inequality $2\left\| A \right\|\left\| B \right\| \le {\left\| A \right\|^2} + {\left\| B \right\|^2}$, we obtain

${\left\| {A + B} \right\|^2} \le 2({\left\| A \right\|^2} + {\left\| B \right\|^2}). $
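For a small numerical sanity check of this inequality (added for illustration only):

```python
# Numerical check of ||A + B||^2 <= 2(||A||^2 + ||B||^2) for random vectors.
import numpy as np

rng = np.random.default_rng(3)
A, B = rng.standard_normal(6), rng.standard_normal(6)
lhs = np.linalg.norm(A + B)**2
rhs = 2 * (np.linalg.norm(A)**2 + np.linalg.norm(B)**2)
print(lhs <= rhs)                             # True
```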

$\left\{ {{w^k} = {{(x_1^k,x_2^k,{y^k},{\lambda ^k})}^{\rm T}}} \right\}$是L-G-BADMM算法产生的点列, 为了证明${\hat L_\beta }({w^k})$ 的单调性, 下面构造函数$\hat L({\hat w^k})$

$\hat L({x_1},{x_2},y,\lambda,{\hat x_1},{\hat x_2},\hat y) = {\hat L_\beta }({x_1},{x_2},y,\lambda ) + {\sigma _1}{\left\| {\hat x_1^{} - x_1^{}} \right\|^2} +{\sigma _2}{\left\| {\hat x_2^{} - x_2^{}} \right\|^2} + {\sigma _3}{\left\| {\hat y - y} \right\|^2}, $

where

${\sigma _1} = - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu, $
$ {\sigma _2} = - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu, $
$ {\sigma _3} = - \left( {\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta{{\left\| {{B^{\rm T}}} \right\|}^2}{{\left\| B \right\|}^2}\mu } \right). $

We now prove the monotonicity of the sequence $ {\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$.

引理3.2$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的序列且 $\{ {u^k} = (x_1^k,x_2^k,$${y^k}) \}$是有界序列, 序列对任意的$k \in N$, 下面有

$\begin{equation} \hat L({\hat w^{k + 1}}) \le \hat L({\hat w^k}) - {\sigma _4}\left( {{{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2}} \right), \end{equation} $

where

$\begin{equation} {\sigma _4} = \min \left\{ \begin{array}{c} \frac{{{\delta _{{\varphi _1}}} - l_g^{}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu,\frac{{{\delta _{{\varphi _2}}} - l_g^{}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu,\\ \frac{{{\delta _\phi } - l_g^{}}}{2} - \frac{{4l_g^2\mu + 8l_\phi ^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu. \end{array} \right\}. \end{equation} $

Proof Clearly, by the optimality of $x_1^{k+1}$ in the ${x_1}$-subproblem of (1.4), we obtain

$\begin{matrix} &&{f_1}(x_1^{k + 1}) + {(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k}, {A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k} - c \rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k},c) \\ & \le & {f_1}(x_1^k) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c\rangle+ \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) \\ && - {\Delta _{{\varphi _1}}}(x_1^{k + 1},x_1^k), \end{matrix}$

By Assumption 1(iv) and Lemma 2.2(iii), we have ${\Delta _{{\varphi _1}}}(x_1^{k + 1},x_1^k) \ge \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}$, and hence

$\begin{matrix} &&{f_1}(x_1^{k + 1}) + {(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k} - c \rangle \\ & &+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k},c) \\ &\le &{f_1}(x_1^k) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c \rangle + \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) \\ &&- \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}. \end{matrix}$

Similarly, analogous inequalities hold for the ${x_2}$- and ${y}$-subproblems in (1.4). Adding them and rearranging, we obtain

$\begin{matrix} &&{f_1}(x_1^{k + 1}) + {f_2}(x_2^{k + 1}) + g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c\rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}},c) \\ & \le & {f_1}(x_1^k) + {f_2}(x_2^k) + g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c\rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) + [g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - g(x_1^k,x_2^k,{y^k}) \\ &&- ({(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + {(x_2^{k + 1} - x_2^k)^{\rm T}}{\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) \\ & &+ {({y^{k + 1}} - {y^k})^{\rm T}}{\nabla _y}g(x_1^k,x_2^k,{y^k}))] - \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} - \frac{{{\delta _{{\varphi _2}}}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ & &- \frac{{{\delta _\phi }}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2}, \end{matrix}$

Combining (3.11) with Lemma 2.3, we obtain

$\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) \\ &\le &{{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) + \frac{{{l_g}}}{2}{\left\| {(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - (x_1^k,x_2^k,{y^k})} \right\|^2} \\ && - \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} - \frac{{{\delta _{{\varphi _2}}}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} - \frac{{{\delta _\phi }}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$

Moreover, by the multiplier update in (1.4),

$\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) \\ &\le & {{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) + \langle{\lambda ^{k + 1}} - {\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c\rangle \\ &=& {{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) + \frac{1}{\beta }{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2}, \end{matrix}$

Adding (3.12) and (3.13) and rearranging, we obtain

$\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) \\ &\le & {{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) - \frac{{{\delta _{{\varphi _1}}} - {l_g}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}- \frac{{{\delta _{{\varphi _2}}} - {l_g}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ &&- \frac{{{\delta _\phi } - {l_g}}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2} + \frac{1}{\beta }{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2}. \end{matrix}$

Substituting (3.1) into (3.14) yields

$\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} \\ &&- 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2}- (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} \\ &\le &{{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ && - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} - (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ & &- (\frac{{{\delta _{{\varphi _1}}} - {l_g}}}{2} - \frac{{4l_g ^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu ){\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ & &- (\frac{{{\delta _{{\varphi _2}}} - {l_g}}}{2} - \frac{{4l_g ^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu ){\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ & &- (\frac{{{\delta _\phi } - {l_g}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu - \frac{{8l_\phi ^2\mu }}{\beta }){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$

This proves the lemma.

Remark 3.2 From the above proof we obtain the auxiliary function

$\begin{matrix} \hat L({{\hat w}^{k + 1}}) &= &\hat L(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}},\hat x_1^{k + 1},\hat x_2^{k + 1},{{\hat y}^{k + 1}}) \\ & = &{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} \\ & &-4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ && - (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2}. \end{matrix}$

引理3.3$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的序列且 $\big\{ {u^k} = (x_1^k,x_2^k,$${y^k}) \big\}$是有界序列, 则

$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{w^{k + 1}} - {w^k}} \right\|}^2}} < + \infty, $

and hence $\left\| {{w^{k + 1}} - {w^k}} \right\| \to 0$ as $k \to \infty $.

Proof First, we show that the sequence $\left\{ {{{\hat w}^k}} \right\}$ is bounded. From (3.3) we obtain

$\begin{matrix} {\left\| {{B^{\rm T}}{\lambda ^k}} \right\|^2}&= &\left\| { - {\nabla _y}} \right.g(x_1^k,x_2^k,{y^k}) + {\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}}) \\ & &{\left. { + \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}))} \right\|^2}. \end{matrix}$

By Assumption 1(ii) and (3.17),

$\begin{matrix} {\left\| {{\lambda ^k}} \right\|^2}&\le& \mu {\left\| {{B^{\rm T}}{\lambda ^k}} \right\|^2} \\ &\le &3\mu {\left\| {{\nabla _y}g(x_1^k,x_2^k,{y^k})} \right\|^2}+ 3\mu l_\phi ^2{\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ &&+ 3\mu l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c} \right\|^2}, \end{matrix}$

Therefore, the boundedness of $\left\{ {{u^k}} \right\}$ implies that $\left\{ {{\lambda ^k}} \right\}$ is bounded, and hence the sequences $\left\{ {{w^k}} \right\}$ and $\left\{ {{{\hat w}^k}} \right\}$ are bounded.

Next, we show that the sequence ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ converges. Since $\left\{ {{{\hat w}^k}} \right\}$ is bounded, it has a convergent subsequence $\left\{ {{{\hat w}^{{k_j}}}} \right\}$; denote its limit point by ${\hat w^*}$. By the lower semicontinuity of $\hat L( \cdot )$, $\mathop {\lim }\limits_{j \to \infty } \inf \hat L({\hat w^{{k_j}}}) \ge \hat L({\hat w^*})$, so $\hat L({\hat w^{{k_j}}})$ is bounded from below. Since ${\left\{ {\hat L({{\hat w}^k})} \right\}}$ is monotonically nonincreasing (by Lemma 3.2) and bounded from below, it converges; in particular $\hat L({\hat w^{{k_j}}})$ converges, and hence ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ converges. Finally, summing both sides of (3.7) over $k$ from 1 to $t$ gives

${\sigma _4}\sum\limits_{k = 1}^t {({{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2})} \le \hat L({\hat w^1}) - \hat L({\hat w^{t + 1}}) < + \infty.$

${\sigma _4} > 0$, 以及$t$的任意性可知

$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{u^{k + 1}} - {u^k}} \right\|}^2}} < + \infty, $

Combining this with (3.1) gives

$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|}^2}} < + \infty. $

Therefore we have

$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{w^{k + 1}} - {w^k}} \right\|}^2}} < + \infty, $

and hence $\left\| {{w^{k + 1}} - {w^k}} \right\| \to 0$ as $k \to \infty $. This proves the lemma.

引理3.4$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的有界序列, 存在$\gamma > 0$, 对于任意$k \in N$, 使得

$\begin{matrix} d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) &\le &\gamma (\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\| + \left\| {x_2^{k + 2} - x_2^{k + 1}} \right\| + \left\| {{y^{k + 2}} - {y^{k + 1}}} \right\| \\ & &+ \left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|). \end{matrix}$

Proof From the definition of $\hat L\left( {\hat w} \right)$ above, we obtain

$\begin{equation} \left\{ \begin{array}{ll} {\nabla _{{x_1}}}\hat L({{\hat w}^{k + 1}}) = \nabla {f_1}(x_1^{k + 1}) + {\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + A_1^{\rm T}{\lambda ^{k + 1}}\\ + \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)) - 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ {\nabla _{{x_2}}}\hat L({{\hat w}^{k + 1}}) = \nabla {f_2}(x_2^{k + 1}) + {\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) + A_2^{\rm T}{\lambda ^{k + 1}}\\ + \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)) - 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ {\nabla _y}\hat L({{\hat w}^{k + 1}}) = {\nabla _y}g(x_1^k,x_2^k,{y^k}) + B_{}^{\rm T}{\lambda ^{k + 1}} - 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}})\\ \ + \beta B_{}^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)),\\ {\nabla _{{{\hat x}_1}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ {\nabla _{{{\hat x}_2}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ {\nabla _{{{\hat y}_{}}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}}),\\ {\nabla _{{\lambda _{}}}}\hat L({{\hat w}^{k + 1}}) = \frac{1}{\beta }({\lambda ^{k + 1}} - {\lambda ^k}). \end{array} \right. \end{equation}$

From the optimality conditions of the subproblems of the L-G-BADMM algorithm, we obtain

$\begin{equation} \left\{ \begin{array}{ll} \nabla {f_1}(x_1^{k + 1}) = - {\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) - A_1^{\rm T}{\lambda ^k} + {\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})\\ - \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k}) - {\nabla _\psi }(c)),\\ \nabla {f_2}(x_2^{k + 1}) = - {\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) - A_2^{\rm T}{\lambda ^k} + {\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})\\ - \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k}) - {\nabla _\psi }(c)),\\ {\nabla _y}g(x_1^k,x_2^k,{y^k}) = - B_{}^{\rm T}{\lambda ^k} + {\nabla _\phi }(y_{}^k) - {\nabla _\phi }(y_{}^{k + 1})\\ - \beta B_{}^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)),\\ {\lambda ^{k + 1}} = {\lambda ^k} + \beta ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c). \end{array} \right. \end{equation}$

Substituting (3.21) into (3.20) and simplifying, we obtain

$\begin{equation} \left\{ \begin{array}{ll} \rho _1^k = A_1^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}) + {\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})\\ + \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k})),\\ \rho _2^k = A_2^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}) + {\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})\\ + \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k})),\\ \rho _3^k = B_{}^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _3}(y_{}^{k + 2} - y_{}^{k + 1}) + {\nabla _\phi }(y_{}^k) - {\nabla _\phi }(y_{}^{k + 1}),\\ \rho _4^k = 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ \rho _5^k = 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ \rho _6^k = 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}}),\\ \rho _7^k = \frac{1}{\beta }({\lambda ^{k + 1}} - {\lambda ^k}). \end{array} \right. \end{equation}$

Therefore $(\rho _1^k,\rho _2^k,\rho _3^k,\rho _4^k,\rho _5^k,\rho _6^k,\rho _7^k) \in \partial \hat L({\hat w^{k+1}})$, and hence

$d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) \le \left\| {{{(\rho _1^k,\rho _2^k,\rho _3^k,\rho _4^k,\rho _5^k,\rho _6^k,\rho _7^k)}^{\rm T}}} \right\|. $

Since $\left\{ {{u^k}} \right\}$ is bounded, we have

$\begin{matrix} &&d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) \\ &\le&{\gamma _1}(\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\| + \left\| {x_2^{k + 2} - x_2^{k + 1}} \right\| + \left\| {{y^{k + 2}} - {y^{k + 1}}} \right\| + \left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|) \\ & &+\beta \left\| {A_1^{\rm T}} \right\|\left\| {{\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k})} \right\| \\ &&+ \beta\left\| {A_2^{\rm T}} \right\|\left\| {{\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k})} \right\| \\ & &+\left\| {{\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})} \right\| + \left\| {{\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})} \right\| + \left\| {{\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}})} \right\|, \end{matrix}$

where ${\gamma _1} > 0$. Since ${\nabla _\psi },{\nabla _{{\varphi _1}}},{\nabla _{{\varphi _2}}},{\nabla _\phi },{\nabla _g}$ are Lipschitz continuous, it follows together with Lemma 3.1 that there exists $\gamma > 0$ such that (3.19) holds. The proof is complete.

定理3.1$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的有界序列, 以及$S({\hat w^0})$ 是序列$\left\{ {{{\hat w}^k}} \right\}$的全体极限点集,

1) $\mathop {\lim }\limits_{k \to \infty } d({\hat w^k},S({\hat w^0})) = 0$;

2) if ${\hat w^*} = (x_1^*,x_2^*,{y^*},{\lambda ^*},\hat x_1^*,\hat x_2^*,{\hat y^*}) \in S({\hat w^0})$, then ${\hat w^*}$ is a stationary point of $\hat L( \cdot )$;

3) $\hat L( \cdot )$ is constant on the set $S({\hat w^0})$ and $\mathop {\inf }\limits_{k \in N} \hat L({\hat w^k}) = \mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k})$.

Proof 1) This follows directly from the definition of $S({\hat w^0})$.

2) Let ${\hat w^*} \in S({\hat w^0})$. Then there exists a subsequence $\left\{ {{{\hat w}^{{k_j}}}} \right\}$ of $\left\{ {{{\hat w}^k}} \right\}$ converging to ${\hat w^*}$. From the optimality condition for $x_1^{k + 1}$ we obtain

$\begin{equation} \hat L(x_1^{k + 1},x_2^k,{y^k},{\lambda ^k},x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) \le \hat L(x_1^*,x_2^k,{y^k},{\lambda ^k},x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}), \end{equation} $

$\mathop {\lim }\limits_{j \to \infty } {\hat w^{{k_j}}} = {\hat w^*}$, 可得

$\begin{matrix} & &\mathop {\lim }\limits_{j \to \infty } \sup \hat L(x_1^{{k_j} + 1},x_2^{{k_j}},{y^{{k_j}}},{\lambda ^{{k_j}}},x_1^{{k_j} + 1},x_2^{{k_j} + 1},{y^{{k_j} + 1}}) \\ & \le& \mathop {\lim }\limits_{j \to \infty } \sup \hat L(x_1^{{k_j} + 1},x_2^{{k_j} + 1},{y^{{k_j} + 1}},{\lambda ^{{k_j} + 1}},x_1^{{k_j} + 2},x_2^{{k_j} + 2},{y^{{k_j} + 2}}) \\ &\le &\hat L({{\hat w}^*}). \end{matrix}$

Moreover, by the lower semicontinuity of $\hat L( \cdot )$,

$\begin{equation} \mathop {\lim }\limits_{j \to \infty } \inf \hat L({\hat w^{{k_j} + 1}}) \ge \hat L({\hat w^*}), \end{equation}$

Combining (3.25) and (3.26), we obtain

$\begin{equation} \mathop {\lim }\limits_{j \to \infty } \hat L({\hat w^{{k_j} + 1}}) = \hat L({\hat w^*}). \end{equation}$

Moreover, by the continuity of ${\nabla _g}$ and the closedness of $\partial f$, taking limits along the sequence $\left\{ {{{\hat w}^{{k_j} + 1}}} \right\}$ in (3.21) yields

$\begin{equation} \left\{ \begin{array}{ll} \nabla {f_1}(x_1^*) = - {\nabla _{{x_1}}}g(x_1^*,x_2^*,{y^*}) - A_1^{\rm T}{\lambda ^*},\\ \nabla {f_2}(x_2^*) = - {\nabla _{{x_2}}}g(x_1^*,x_2^*,{y^*}) - A_2^{\rm T}{\lambda ^*},\\ {\nabla _y}g(x_1^*,x_2^*,{y^*}) = - B_{}^{\rm T}{\lambda ^*},\\ {A_1}x_1^* + {A_2}x_2^* + B{y^*} = c. \end{array} \right. \end{equation}$

${\hat w^*}$$\hat L( \cdot )$的稳定点.

3) Combining (3.25) with the monotonic decrease of ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ gives $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k}) = \hat L({\hat w^*})$; moreover, $\hat L( \cdot )$ is constant on the nonempty compact set $S({\hat w^0})$, and $\mathop {\inf }\limits_{k \in N} \hat L({\hat w^k}) = \mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k})$ holds. The proof is complete.

We now prove the convergence of the L-G-BADMM algorithm.

定理3.2$\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$是L-G-BADMM算法产生的有界序列, 假设1成立且$\hat L\left( {\hat w} \right)$ 满足K-L性质, 则

(1) $\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty $;

(2) the sequence $\left\{ {{w^k}} \right\}$ converges to a stationary point of problem (1.3).

Proof By Theorem 3.1, for any ${\hat w^*} \in S({\hat w^0})$ we have $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^{k + 1}}) = \hat L({\hat w^*})$. We prove Theorem 3.2(1) by distinguishing two cases.

1a) Suppose there exists an integer ${k_0}$ such that $\hat L({\hat w^{{k_0}}}) = \hat L({\hat w^*})$. Then, by Lemma 3.2, for any $k > {k_0}$,

$\begin{eqnarray*} &&{\sigma _4}\left( {{{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2}} \right) \\ & \le& \hat L({{\hat w}^k}) - \hat L({{\hat w}^{k + 1}}) \le \hat L({{\hat w}^{{k_0}}}) - \hat L({{\hat w}^*}) = 0 \end{eqnarray*} $

It follows that $x_1^{k + 1} = x_1^k,\ x_2^{k + 1} = x_2^k,\ {y^{k + 1}} = {y^k}$ for any $k > {k_0}$. Then for any $k > {k_0} + 1$ we have ${\hat w^{k + 1}} = {\hat w^k}$, and therefore $\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty. $

1b) Suppose $\hat L({\hat w^{{k}}}) > \hat L({\hat w^*})$ for all $k$. Since $\mathop {\lim }\limits_{k \to \infty } d({\hat w^k},S({\hat w^0})) = 0$, for any given $\varepsilon > 0$ there exists ${k_1} > 0$ such that $d({\hat w^k},S({\hat w^0})) < \varepsilon $ for all $k > {k_1}$. Since $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k}) = \hat L({\hat w^*})$, for any given $\eta > 0$ there exists ${k_2} > 0$ such that $\hat L({\hat w^k}) < \hat L({\hat w^*}) + \eta $ for all $k > {k_2}$. Hence, for $k > \tilde k = \max \left\{ {{k_1},{k_2}} \right\}$, we have $d({\hat w^k},S({\hat w^0})) < \varepsilon$ and $\hat L({\hat w^*}) < \hat L({\hat w^k}) < \hat L({\hat w^*}) + \eta $. Since $S({\hat w^0})$ is nonempty and compact and $\hat L( \cdot )$ is constant on $S({\hat w^0})$, the uniformized K-L property yields $\varphi '(\hat L({\hat w^k}) - \hat L({\hat w^*}))\,d(0,\partial \hat L({\hat w^k})) \ge 1$ for all $k > \tilde k$. Combining this with Lemma 3.4, we obtain

$\begin{matrix} \frac{1}{{\varphi '(\hat L({{\hat w}^k}) - \hat L({{\hat w}^*}))}}&\le& d(0,\partial \hat L({{\hat w}^k})) \\ & \le &\gamma (\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ && + \left\| {x_1^k - x_1^{k - 1}} \right\|+ \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|). \end{matrix}$

By the concavity of the function $\varphi $, we have

$\begin{matrix} &&\varphi '(\hat L({\hat w^k}) - \hat L({\hat w^*}))(\hat L({\hat w^k}) - \hat L({\hat w^{k + 1}})) \\ &\le& \varphi (\hat L({\hat w^k}) - \hat L({\hat w^*})) - \varphi (\hat L({\hat w^{k + 1}}) - \hat L({\hat w^*})). \end{matrix}$

Combining this with Lemma 3.2, we obtain

$\begin{matrix} {\sigma _4}{\left\| {{u^{k + 1}} - {u^k}} \right\|^2} &\le &\hat L({{\hat w}^k}) - \hat L({{\hat w}^{k + 1}}) \\ &\le& \frac{{\varphi (\hat L({{\hat w}^k}) - \hat L({{\hat w}^*})) - \varphi (\hat L({{\hat w}^{k + 1}}) - \hat L({{\hat w}^*}))}}{{\varphi '(\hat L({{\hat w}^k}) - \hat L({{\hat w}^*}))}} \\ &\le& \gamma (\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ &&+ \left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|){\Delta _{k,k + 1}}. \end{matrix}$

where ${\Delta _{k,k + 1}} = \varphi (\hat L({\hat w^k}) - \hat L({\hat w^*})) - \varphi (\hat L({\hat w^{k + 1}}) - \hat L({\hat w^*}))$. The above inequality is equivalent to

$\begin{matrix} &&{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + {\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ & \le& \frac{\gamma }{{{\sigma _4}}}(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| + \left\| {x_1^k - x_1^{k - 1}} \right\| \\ && + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|){\Delta _{k,k + 1}}. \end{matrix}$

Moreover, by ${\left( {a + b + c} \right)^2} \le 3\left( {{a^2} + {b^2} + {c^2}} \right)$, we obtain

$\begin{eqnarray*} && 3(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|) \\ & \le& 3\sqrt 3 {({\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + {\left\| {{y^{k + 1}} - {y^k}} \right\|^2})^{\frac{1}{2}}} \\ &\le& 2(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ &&+\left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|{)^{\frac{1}{2}}}{(\frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}})^{\frac{1}{2}}} \\ & \le& \left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ & &+\left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}, \end{eqnarray*}$

where the last two steps follow from (3.32) and the elementary inequality $2\sqrt {ab} \le a + b$ $(a,b > 0)$. Taking $k = \tilde k + 1,\cdots,m$ in the above inequality and summing, we obtain

$\begin{eqnarray*} && \sum\limits_{k = \tilde k + 1}^m {3(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|)} \\ & \le& \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|} \\ &&+ \left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}), \end{eqnarray*}$

Since $\varphi (\hat L({\hat w^{m + 1}}) - \hat L({\hat w^*})) > 0$, rearranging terms and letting $m \to \infty $, we obtain

$\begin{matrix} &&\sum\limits_{k = \tilde k + 1}^m {\left\| {x_1^{k + 1} - x_1^k} \right\| + } \sum\limits_{k = \tilde k + 1}^m {\left\| {x_2^{k + 1} - x_2^k} \right\|} + \sum\limits_{k = \tilde k + 1}^m {\left\| {{y^{k + 1}} - {y^k}} \right\|} \\ & \le& \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_1^k - x_1^{k - 1}} \right\| - \left\| {x_1^{k + 1} - x_1^k} \right\|)} + \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_2^k - x_2^{k - 1}} \right\|} - \left\| {x_2^{k + 1} - x_2^k} \right\|) \\ & & + \sum\limits_{k = \tilde k + 1}^m {(\left\| {{y^k} - {y^{k - 1}}} \right\| - \left\| {{y^{k + 1}} - {y^k}} \right\|)} + \sum\limits_{k = \tilde k + 1}^m {\frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}} \\ &\le& \left\| {x_1^{\tilde k + 1} - x_1^{\tilde k}} \right\| - \left\| {x_1^{m + 1} - x_1^m} \right\| + \left\| {x_2^{\tilde k + 1} - x_2^{\tilde k}} \right\| - \left\| {x_2^{m + 1} - x_2^m} \right\| \\ & & + \left\| {{y^{\tilde k + 1}} - {y^{\tilde k}}} \right\| - \left\| {{y^{m + 1}} - {y^m}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}( \varphi (\hat L({{\hat w}^{\tilde k + 1}}) - \hat L({{\hat w}^*})) \\ & \le& \left\| {x_1^{\tilde k + 1} - x_1^{\tilde k}} \right\| + \left\| {x_2^{\tilde k + 1} - x_2^{\tilde k}} \right\| + \left\| {{y^{\tilde k + 1}} - {y^{\tilde k}}} \right\|+ \frac{{27\gamma }}{{4{\sigma _4}}}\varphi (\hat L({{\hat w}^{\tilde k + 1}}) - \hat L({{\hat w}^*})). \end{matrix}$

Therefore, we have

$\sum\limits_{k = \tilde k + 1}^m {(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|)} < + \infty. $

Combining this with Lemma 3.1, we have

$\sum\limits_{k = \tilde k + 1}^m {(\left\| {w_{}^{k + 1} - w_{}^k} \right\|)} < + \infty, $

and by the arbitrariness of $\tilde k$,

$\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty. $

(2) By the above argument, $\left\{ {{w^k}} \right\}$ is a Cauchy sequence and hence converges; together with Theorem 3.1(2), the conclusion follows. This completes the proof of Theorem 3.2.

References

[1] Banerjee A, Merugu S, Dhillon I, Ghosh J. Clustering with Bregman divergences. Journal of Machine Learning Research, 2005, 6: 1705-1749

[2] Bolte J, Daniilidis A, Ley O, et al. Characterizations of Łojasiewicz inequalities and applications: subgradient flows, talweg, convexity. Transactions of the American Mathematical Society, 2010, 362(6): 3319-3363

[3] Boyd S, Parikh N, Chu E, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2010, 3(1): 1-122

[4] Bregman L M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 1967, 7(3): 200-217

[5] Chao M T, Cheng C Z, Zhang H B. A linearized alternating direction method of multipliers with substitution procedure. Asia-Pacific Journal of Operational Research, 2015, 32(3): 1550011

[6] Chao M T, Deng Z, Jian J B. Convergence of linear Bregman ADMM for nonconvex and nonsmooth problems with nonseparable structure. Complexity, 2020, 2020: Article ID 6237942

[7] Chao M T, Zhang Y, Jian J B. An inertial proximal alternating direction method of multipliers for nonconvex optimization. International Journal of Computer Mathematics, 2021, 98(6): 1199-1217

[8] Chen C H, He B S, Ye Y Y, et al. The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Mathematical Programming, 2016, 155: 57-79

[9] Deng W, Yin W T. On the global and linear convergence of the generalized alternating direction method of multipliers. Journal of Scientific Computing, 2016, 66(3): 889-916

[10] Feng J K, Zhang H B, Cheng C Z, et al. Convergence analysis of L-ADMM for multi-block linear-constrained separable convex minimization problem. Journal of the Operations Research Society of China, 2015, 3(4): 563-579

[11] Gabay D, Mercier B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Computers and Mathematics with Applications, 1976, 2(1): 17-40

[12] Glowinski R, Marroco A. Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires. Revue Française d'Automatique, Informatique, Recherche Opérationnelle. Analyse Numérique, 1975, 9(R2): 41-76

[13] Guo K, Han D R, Wang D Z W, et al. Convergence of ADMM for multi-block nonconvex separable optimization models. Frontiers of Mathematics in China, 2017, 12(5): 1139-1162

[14] Guo K, Han D R, Wu T T. Convergence of alternating direction method for minimizing sum of two nonconvex functions with linear constraints. International Journal of Computer Mathematics, 2017, 94(8): 1653-1669

[15] Guo K, Wang X. Convergence of generalized alternating direction method of multipliers for nonseparable nonconvex objective with linear constraints. Journal of Mathematical Research with Applications, 2018, 38(5): 523-540

[16] Hong M Y, Luo Z Q, Razaviyayn M. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM Journal on Optimization, 2016, 26(1): 337-364

[17] Li M, Sun D F, Toh K C. A convergent 3-block semi-proximal ADMM for convex minimization problems with one strongly convex block. Asia-Pacific Journal of Operational Research, 2015, 32(4): 1550024

[18] Rockafellar R T, Wets R J B. Variational Analysis. Berlin: Springer Science and Business Media, 2009

[19] Wang F H, Cao W F, Xu Z B. Convergence of multi-block Bregman ADMM for nonconvex composite problems. Science China Information Sciences, 2018, 61(12): 1-12

[20] Wang F H, Xu Z B, Xu H K. Convergence of Bregman alternating direction method with multipliers for nonconvex composite problems. arXiv:1410.8625

[21] Wang H H, Banerjee A. Bregman alternating direction method of multipliers//Ghahramani Z, Welling M, Cortes C, et al. Advances in Neural Information Processing Systems. 2014: 2816-2824

[22] Xu Z B, Chang X Y, Xu F M, Zhang H. $L_\frac{1}{2}$ regularization: a thresholding representation theory and a fast solver. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23: 1013-1027

[23] Yang W H, Han D. Linear convergence of the alternating direction method of multipliers for a class of convex optimization problems. SIAM Journal on Numerical Analysis, 2016, 54(2): 625-640

[24] Yashtini M. Multi-block nonconvex nonsmooth proximal ADMM: convergence and rates under Kurdyka-Łojasiewicz property. Journal of Optimization Theory and Applications, 2021, 190(3): 966-998

[25] Zeng J S, Fang J, Xu Z B. Sparse SAR imaging based on $L_\frac{1}{2}$ regularization. Science China Information Sciences, 2012, 55: 1755-1775

[26] Zeng J S, Lin S B, Wang Y, et al. $L_\frac{1}{2}$ regularization: convergence of iterative half thresholding algorithm. IEEE Transactions on Signal Processing, 2014, 62(9): 2317-2329

[27] Zeng J S, Xu Z B, Zhang B, et al. Accelerated $L_\frac{1}{2}$ regularization based SAR imaging via BCR and reduced Newton skills. Signal Processing, 2013, 93: 1831-184