1 Introduction
Glowinski and Marroco [12] and Gabay and Mercier [11] proposed the ADMM algorithm for solving two-block convex optimization problems, and its convergence theory is by now relatively mature; see, e.g., [7,23]. Under suitable assumptions, the convergence of ADMM for convex optimization problems with a three-block separable structure was analyzed in [8,10,17]. In recent years, the nonconvex setting has become a major research direction. Convergence of ADMM in the nonconvex case is difficult to obtain; the main difficulty is that the Fejér monotonicity of the iterates generated by the algorithm no longer holds without convexity. To deal with such problems, Wang et al. [20] considered the two-block separable nonconvex optimization problem
(1.1) $\begin{equation} \min \limits_{x \in X,y \in Y} f(x) + g(y)~~{\rm s.t.~}Ax + By = c, \end{equation}$
where $A \in {\Bbb R^{m \times {n_1}}},~B \in {\Bbb R^{m \times {n_2}}},~f:{\Bbb R^{{n_1}}} \to \Bbb R\bigcup {\left\{ { + \infty } \right\}},~g: {\Bbb R^{{n_2}}} \to \Bbb R,~X \subset {\Bbb R^{{n_1}}}, ~Y \subset {\Bbb R^{{n_2}}}$ and $c \in {\Bbb R^m}$, and proposed the Bregman ADMM algorithm (BADMM for short)
(1.2) $\begin{equation} \left\{ \begin{array}{ll} {x_{k + 1}} = \mathop {\arg \min }\limits_{x \in X} {L_\beta }(x,{y_k},{\lambda _k}) + {\Delta_\phi }(x,{x_k}), \\ {y_{k + 1}} = \mathop {\arg \min }\limits_{y \in Y} {L_\beta }({x_{k + 1}},y,{\lambda _k}) + {\Delta_\psi }(y,{y_k}), \\ {\lambda _{k + 1}} = {\lambda _k} + \beta (A{x_{k + 1}} + B{y_{k + 1}} - c). \\ \end{array} \right. \end{equation}$
where ${L_\beta }(x,y,\lambda ) = f(x) + g(y) + \langle \lambda,Ax + By - c \rangle + \frac{\beta }{2}\left\| {Ax + By - c} \right\|_2^2$ is the augmented Lagrangian function, $\lambda $ is the multiplier, $\beta > 0$ is the penalty parameter, and ${\Delta _\psi },{\Delta _\phi }$ are the Bregman divergences induced by the functions $\psi $ and $\phi $, respectively. It was shown in [7] that, as long as $\beta $ is sufficiently large, algorithm (1.2) converges to the set of stationary solutions, and that once the constructed merit function satisfies the Kurdyka-Łojasiewicz (K-L) property, the sequence generated by (1.2) converges to a stationary point of the augmented Lagrangian function. This is an effective approach to nonconvex problems and has been developed considerably [6-7,13-15,19,24]. In this paper we consider the following three-block nonconvex optimization problem with a nonseparable coupling term
(1.3) $\begin{equation} \mathop {\min }\limits_{{x_i} \in X_i,y \in Y} {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y)~~{\rm s.t.~}{A_1}{x_1} + {A_2}{x_2} + By = c, \end{equation}$
where ${A_i} \in {\Bbb R^{m \times {n_i}}},~B \in {\Bbb R^{m \times n}},~{X_i} \subset {\Bbb R^{{n_i}}},~Y \subset {\Bbb R^n},~c \in {\Bbb R^m},~{f_i}:{\Bbb R^{{n_i}}} \to \Bbb R\bigcup {\left\{ { + \infty } \right\}}$ $ (i = 1,2)$ are nonconvex differentiable functions and $g: {\Bbb R^{{n_1} + {n_2} + n}} \to \Bbb R$ is a differentiable function. The augmented Lagrangian function of problem (1.3) is
$\begin{eqnarray*} {L_\beta }({x_1},{x_2},y,\lambda ) &=& {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y) + \langle \lambda,{A_1}{x_1} + {A_2}{x_2} + By - c \rangle \\ &&+ \frac{\beta }{2}\left\| {{A_1}{x_1} + {A_2}{x_2} + By - c} \right\|_2^2. \end{eqnarray*}$
Wang et al. [21] proposed replacing the quadratic penalty term of the augmented Lagrangian function by a Bregman divergence, which yields the generalized BADMM algorithm. Motivated by [21], for problem (1.3) we adopt the following new augmented Lagrangian function
$\begin{eqnarray*}{\hat L_\beta }({x_1},{x_2},y,\lambda )& =& {f_1}({x_1}) + {f_2}({x_2}) + g({x_1},{x_2},y) + \langle \lambda,{A_1}{x_1} + {A_2}{x_2} + By - c \rangle \\ && + \beta {\Delta _\psi }({A_1}{x_1} + {A_2}{x_2} + By,c). \end{eqnarray*}$
Feng et al. [10] introduced a technique that linearizes the quadratic penalty term of the subproblems in the algorithm, and [21] illustrated by examples that convergence can still be obtained after linearizing different parts of the algorithm. Inexact ADMM [3] and generalized ADMM [9] address the convergence issue by linearizing functions and adding further quadratic terms. Chao et al. [6] proposed a linearized BADMM algorithm for three-block nonseparable nonconvex optimization problems.
In this paper, for problem (1.3) we propose a generalized BADMM algorithm with a linearization technique (L-G-BADMM):
(1.4) $\begin{equation} \left\{ \begin{array}{ll} x_1^{k + 1} =& \mathop {\arg \min }\limits_{{x_1} \in {X_1}} {f_1}({x_1}) + {({x_1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}{x_1} + {A_2}x_2^k + B{y^k} - c \rangle\\ & + \beta {\Delta _\psi }({A_1}{x_1} + {A_2}x_2^k + B{y^k},c) + {\Delta _{{\varphi _1}}}({x_1},x_1^k), \\ x_2^{k + 1} =&\mathop {\arg \min }\limits_{{x_2} \in {X_2}} {f_2}({x_2}) + {({x_2} - x_2^k)^{\rm T}}{\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) + \langle {\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{} + B{y^k} - c \rangle \\ & + \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{} + B{y^k},c) + {\Delta _{{\varphi _2}}}({x_2},x_2^k) \\ {y^{k + 1}} =&\mathop {\arg \min }\limits_{y \in Y} {(y - {y^k})^{\rm T}}{\nabla _y}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + By - c \rangle \\ & + \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + By,c) + {\Delta _\phi }(y,{y^k}) \\ {\lambda ^{k + 1}}= & {\lambda ^k} + \beta ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c). \end{array} \right. \end{equation} $
where ${\Delta_\psi },~{\Delta_{{\varphi _1}}},~{\Delta_{{\varphi _2}}},~{\Delta_\phi }$ are the Bregman divergences induced by the functions $\psi,{\varphi _1},{\varphi _2},\phi $, respectively. The L-BADMM algorithm proposed in [6] uses the squared 2-norm as the quadratic penalty term, whereas the penalty term of our algorithm is a Bregman divergence. By Definition 2.4, the squared 2-norm is a special case of the Bregman divergence, so this setting is of broader significance for the classical ADMM. To date, in the case of a nonconvex objective function, there is no convergence theory for penalty terms other than the squared 2-norm; filling this gap is the goal of the present paper.
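The following Python sketch is added for illustration only and is not part of the original algorithm description. It shows one way the iteration (1.4) could be organized under the simplifying assumption of quadratic Bregman kernels $\psi(u)=\frac{1}{2}\|u\|^2$, $\varphi_i(x)=\frac{a_i}{2}\|x\|^2$, $\phi(y)=\frac{a_y}{2}\|y\|^2$ (so every divergence reduces to a scaled squared distance and $\beta\Delta_\psi$ becomes the classical quadratic penalty); the helper names, the use of scipy.optimize.minimize for the subproblems, and all parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def breg_quad(u, v, alpha=1.0):
    # Bregman divergence of the quadratic kernel (alpha/2)*||.||^2:
    # Delta(u, v) = (alpha/2)*||u - v||^2 (a special case of Definition 2.4).
    return 0.5 * alpha * np.sum((u - v) ** 2)

def lgbadmm(f1, f2, g_grads, A1, A2, B, c, beta=10.0,
            a1=5.0, a2=5.0, ay=5.0, iters=100):
    """Structural sketch of iteration (1.4); g_grads(x1, x2, y) is assumed
    to return the three partial gradients of g at (x1, x2, y)."""
    x1 = np.zeros(A1.shape[1])
    x2 = np.zeros(A2.shape[1])
    y = np.zeros(B.shape[1])
    lam = np.zeros(c.shape[0])

    for _ in range(iters):
        g1, g2, gy = g_grads(x1, x2, y)      # gradients at the previous iterate

        def sub_x1(x):                       # x1-subproblem of (1.4)
            z = A1 @ x + A2 @ x2 + B @ y
            return (f1(x) + (x - x1) @ g1 + lam @ (z - c)
                    + beta * breg_quad(z, c) + breg_quad(x, x1, a1))
        x1_new = minimize(sub_x1, x1, method="BFGS").x

        def sub_x2(x):                       # x2-subproblem of (1.4)
            z = A1 @ x1_new + A2 @ x + B @ y
            return (f2(x) + (x - x2) @ g2 + lam @ (z - c)
                    + beta * breg_quad(z, c) + breg_quad(x, x2, a2))
        x2_new = minimize(sub_x2, x2, method="BFGS").x

        def sub_y(v):                        # y-subproblem of (1.4)
            z = A1 @ x1_new + A2 @ x2_new + B @ v
            return ((v - y) @ gy + lam @ (z - c)
                    + beta * breg_quad(z, c) + breg_quad(v, y, ay))
        y_new = minimize(sub_y, y, method="BFGS").x

        # multiplier update of (1.4)
        lam = lam + beta * (A1 @ x1_new + A2 @ x2_new + B @ y_new - c)
        x1, x2, y = x1_new, x2_new, y_new

    return x1, x2, y, lam
```

In the paper itself $\psi,\varphi_1,\varphi_2,\phi$ are only required to be strongly convex with Lipschitz continuous gradients (Assumption 1), and the subproblems may admit closed-form solutions; the sketch above is only meant to make the order of the four updates in (1.4) explicit.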
2 Preliminaries
Definition 2.1[20] Let $f:{\Bbb R^n} \to \Bbb R\bigcup{\left\{ + \infty \right\} }$ be a lower semicontinuous function.
(1) Given $x \in {\rm dom}f$, the Fréchet subdifferential of $f$ at $x$, denoted $\hat \partial f(x)$, is the set
$\hat \partial f(x) =\bigg \{ \mu \in {\Bbb R^n}:\liminf\limits_{y \to x,\, y \ne x}\frac{{f(y) - f(x) - \langle \mu,y - x\rangle }}{{\parallel y - x\parallel }} \ge 0\bigg\}. $
(2) The limiting subdifferential (or simply the subdifferential) of $f$ at $x$, denoted $\partial f(x)$, is defined by
$\partial f(x) = \{ \mu \in {\Bbb R^n}:\exists\, {x^k} \to x,\ f({x^k}) \to f(x),\ {\mu ^k} \in \hat \partial f({x^k}),\ {\mu ^k} \to \mu,\ k \to \infty \}. $
(3) A point ${x_0}$ is called a stationary point (or critical point) of $f$ if $0 \in \partial f({x_0})$.
Definition 2.2[20] (Saddle point theorem) A point ${w^ * } = ({x^ * }, {y^ * }, {\lambda ^ * })$ is a stationary point of the Lagrangian function $L$ of problem (1.1) if it satisfies
$\left\{ \begin{array}{ll} A{x^*} + B{y^*} = c,\\ - \nabla f({x^*}) = {A^{\rm T}}{\lambda ^*},\\ - {B^{\rm T}}{\lambda ^*} \in \partial g({y^*}). \end{array} \right.$
Definition 2.3[2] Let the function $f: {\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be proper and lower semicontinuous. If there exist $\eta > 0, \delta > 0, \varphi \in {A_n}$ such that for all
$x \in O({x_0},\delta ) \cap \left\{ {x:f({x_0}) < f(x) < f({x_0}) + \eta } \right\},$
$\varphi '(f(x) - f({x_0})){\rm dist}(0,\partial f(x)) \ge 1,$
then $f$ is said to satisfy the Kurdyka-Łojasiewicz property (K-L property) at ${x_0}$. Here $\varphi: \left[ {0,\eta } \right] \to {\Bbb R^ + }$, ${\rm dist}(0,\partial f(x)) = \inf \left\{ {\left\| y \right\|:y \in \partial f(x)} \right\}$, and $\varphi $ satisfies
(a) $\varphi $ is continuous on $\left[ {0,\eta } \right]$;
(b) $\varphi $ is smooth and concave on $\left[ {0,\eta } \right]$;
(c) $\varphi (0) = 0$ and $\varphi '(x) > 0$ for all $x \in (0,\eta )$.
Lemma 2.1[6] (Uniform K-L property) Let $\Omega$ be a compact set and let $f:{\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be proper and lower semicontinuous. If $f$ is constant on $\Omega$, equal to $f({x_0})$, and satisfies the K-L property at every point of $\Omega $, then there exist $\eta > 0, \delta > 0, \varphi \in {A_n}$ such that for all
$x \in O({x_0},\delta ) \cap \left\{ {x:f({x_0}) < f(x) < f({x_0}) + \eta } \right\},$
$\varphi '(f(x) - f({x_0}))\,{\rm dist}(0,\partial f(x)) \ge 1.$
Definition 2.4[4] For a convex differentiable function $\Phi $, the Bregman divergence is defined by
${\Delta _\Phi }(x,y) = \Phi (x) - \Phi (y) - \left\langle {\nabla \Phi (y),x - y} \right\rangle.$
In particular, if $\Phi (x) = \left\| x \right\|_Q^2 = {x^{\rm T}}Qx$, where $Q$ is a symmetric positive semidefinite matrix, then ${\Delta _\Phi }(x,y) = \left\| {x - y} \right\|_Q^2$.
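For completeness, this special case can be checked directly from Definition 2.4 (the short verification below is added here for illustration): since $Q$ is symmetric, $\nabla \Phi (y) = 2Qy$, and therefore
$\begin{eqnarray*} {\Delta _\Phi }(x,y) &=& {x^{\rm T}}Qx - {y^{\rm T}}Qy - \left\langle {2Qy,x - y} \right\rangle \\ &=& {x^{\rm T}}Qx - 2{y^{\rm T}}Qx + {y^{\rm T}}Qy = {(x - y)^{\rm T}}Q(x - y) = \left\| {x - y} \right\|_Q^2. \end{eqnarray*}$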
Definition 2.5 (Strong convexity) Let $f:{\Bbb R^n} \to \Bbb R\cup \left\{ { + \infty } \right\}$ be a proper function. If there exists $\delta > 0$ such that, for all $x, y \in {\Bbb R^n}$,
$f(\lambda x + (1 - \lambda )y) \le \lambda f(x) + (1 - \lambda )f(y) - \frac{{\delta \lambda (1 - \lambda )}}{2}\left\| {x - y} \right\|^2,\quad \forall \lambda \in (0,1),$
then $f$ is said to be strongly convex with modulus $\delta $.
Lemma 2.2[1] Let $\Phi $ be a differentiable convex function. Then the Bregman divergence ${\Delta _\Phi }$ has the following properties:
(i) (Nonnegativity) ${\Delta _\Phi }(x,y) \ge 0$ and ${\Delta _\Phi }(x,x) = 0$;
(ii) (Convexity) ${\Delta _\Phi }(x,y)$ is convex in $x$;
(iii) (Strong convexity) if $\Phi $ is strongly convex with modulus $\delta $, then ${\Delta _\Phi }(x,y) \ge \frac{\delta }{2}{\left\| {x - y} \right\|^2}$.
Lemma 2.3[6] (Descent lemma) Let $g:{\Bbb R^n} \to \Bbb R$ be differentiable with an ${l_g}$-Lipschitz continuous gradient. Then for all $x,y \in {\Bbb R^n}$,
$\left| {g(y) - g(x) - \left\langle {\nabla g(x),y - x} \right\rangle } \right| \le \frac{{{l_g}}}{2}{\left\| {y - x} \right\|^2}.$
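As a simple illustration (added here, not part of the cited lemma), take the quadratic function $g(x) = \frac{1}{2}{x^{\rm T}}Hx$ with $H$ symmetric: then $\nabla g(x) = Hx$ is Lipschitz continuous with ${l_g} = \left\| H \right\|$, the left-hand side above equals $\frac{1}{2}\left| {{{(y - x)}^{\rm T}}H(y - x)} \right|$, and the bound is attained whenever $y - x$ is an eigenvector of $H$ corresponding to an eigenvalue of largest absolute value.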
Assumption 1 (i) $\nabla g$, $\nabla {f_1}$, $\nabla {f_2}$ are Lipschitz continuous with moduli ${l_g}$, ${l_{{f_1}}}$, ${l_{{f_2}}}$, respectively;
(ii) there exists $\mu > 0$ such that $B{B^{\rm T}} \succ \frac{1}{\mu }I$;
(iii) ${\nabla _\psi }, {\nabla _{{\varphi _1}}},{\nabla _{{\varphi _2}}},{\nabla _\phi }$ are Lipschitz continuous with moduli ${l_\psi },{l_{{\varphi _1}}},{l_{{\varphi _2}}},{l_\phi }$, respectively;
(iv) $\psi, {\varphi _1}, {\varphi _2},\phi $ are strongly convex functions with moduli ${\delta _\psi }, {\delta _{{\varphi _1}}},{ \delta _{{\varphi _2}}}, {\delta _\phi }$, respectively, and ${\delta _{{\varphi _1}}},{\delta _{{\varphi _2}}},{\delta _\phi }$ are all larger than ${l_g}$;
(v) ${A_1},{A_2}$ have orthonormal columns, i.e., ${A_1}^{\rm T}{A_1} = I$ and ${A_2}^{\rm T}{A_2} = I$;
(vi) the penalty parameter $\beta $ and the above constants satisfy
$ \frac{{{\delta _{{\varphi _1}}} - l_g}}{2} > \frac{{4l_g^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu,\qquad \frac{{{\delta _{{\varphi _2}}} - l_g}}{2} > \frac{{4l_g^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu, $
$ \frac{{{\delta _\phi } - l_g}}{2} > \frac{{4l_g^2\mu + 8l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu. $
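As a purely illustrative aid (not part of the paper), the following routine checks numerically whether given constants satisfy the three inequalities of Assumption 1(vi); it uses the spectral norm, for which $\|B^{\rm T}\| = \|B\|$, and all names are hypothetical. The three quantities it computes are exactly those whose minimum is the constant $\sigma_4$ appearing in (3.8) below.

```python
def check_assumption_vi(delta1, delta2, delta_phi, l_g, l_psi, l_phi,
                        beta, mu, nB, nA1, nA2):
    """Return (ok, (c1, c2, c3)): ok is True when Assumption 1(vi) holds."""
    c1 = (delta1 - l_g) / 2 - 4 * l_g**2 * mu / beta \
         - 4 * l_psi**2 * beta * nB**2 * nA1**2 * mu
    c2 = (delta2 - l_g) / 2 - 4 * l_g**2 * mu / beta \
         - 4 * l_psi**2 * beta * nB**2 * nA2**2 * mu
    # nB**2 * nB**2 stands for ||B^T||^2 * ||B||^2 under the spectral norm
    c3 = (delta_phi - l_g) / 2 - (4 * l_g**2 + 8 * l_phi**2) * mu / beta \
         - 4 * l_psi**2 * beta * nB**2 * nB**2 * mu
    # c1, c2, c3 are the three terms whose minimum defines sigma_4 in (3.8);
    # Assumption 1(vi) requires all of them to be strictly positive.
    return min(c1, c2, c3) > 0, (c1, c2, c3)
```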
3 Convergence Analysis
Lemma 3.1 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be the sequence generated by the L-G-BADMM algorithm and suppose that Assumption 1 holds. Then for any $k \in N$ we have
(3.1) $\begin{matrix} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} & \le&4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ & &+ (4l_\phi ^2\mu + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} + 4l_g^2\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ &&+4l_g^2\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + (4l_g^2\mu + 4l_\phi ^2\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$
Proof By Assumption 1(ii) we have
(3.2) $\begin{equation} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} \le \mu {\left\| {{B^{\rm T}}({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2}. \end{equation}$
The optimality condition of the $y$-subproblem in (1.4), written at the $k$-th and the $(k+1)$-th iterations, gives
(3.3) $\begin{matrix} {B^{\rm T}}{\lambda ^k}&=& {\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}}) - {\nabla _y}g(x_1^k,x_2^k,{y^k}) \\ &&+ \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})), \end{matrix}$
(3.4) $\begin{matrix} {B^{\rm T}}{\lambda ^{k + 1}} &= &{\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^{k + 2}}) - {\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) \\ & &+ \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}})), \end{matrix}$
Subtracting (3.3) from (3.4), taking norms, and applying Remark 3.1 twice, we obtain
(3.5) $\begin{matrix} && {\left\| {{B^{\rm T}}({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2} \\ &=&\left\| {{\nabla _y}g(x_1^k,x_2^k,{y^k}) - {\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}})} \right. + {\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^{k + 2}})+ {\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^k}) \\ &&{\left. { + \beta {B^{\rm T}}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}))} \right\|^2} \\ &\le & 4{\left[ {\left\| {{\nabla _y}g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - {\nabla _y}g(x_1^k,x_2^k,{y^k})} \right\|} \right.^2} \\ &&+ {\left\| {{\nabla _\phi }({y^{k + 2}}) - {\nabla _\phi }({y^{k + 1}})} \right\|^2} + {\left\| {{\nabla _\phi }({y^{k + 1}}) - {\nabla _\phi }({y^k})} \right\|^2} \\ &&+ {\beta ^2}{{\left\| {{B^{\rm T}}} \right\|}^2}\left. {{{\left\| {({\nabla _\psi }({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})} \right\|}^2}} \right] \\ &\le & 4l_g^2{\left\| {(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - (x_1^k,x_2^k,{y^k})} \right\|^2} + 4l_\phi ^2{\left\| {({y^{k + 2}} - {y^{k + 1}})} \right\|^2} + 4l_\phi ^2{\left\| {({y^{k + 1}} - {y^k})} \right\|^2} \\ && + 4{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}l_\psi ^2{\left\| {({A_1}x_1^{k + 2} + {A_2}x_2^{k + 2} + B{y^{k + 2}}) - ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}})} \right\|^2}. \end{matrix}$
where the last inequality follows from the Lipschitz continuity of ${\nabla _g},{\nabla _\phi },{\nabla _\psi }$. Substituting (3.5) into (3.2), we obtain
(3.6) $\begin{matrix} {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2} &\le & \mu {\left\| {{B^{\rm T}}({\lambda ^{k + 1}} - {\lambda ^k})} \right\|^2} \\ &\le & 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ && + 4l_g^2\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + (4l_\phi ^2\mu + 4l_\psi ^2{\beta ^2}{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} \\ &&+ 4l_g^2\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + (4l_g^2\mu + 4l_\phi ^2\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}, \end{matrix}$
which is exactly (3.1). The proof is complete.
Remark 3.1 By the triangle inequality $\left\| {A + B} \right\| \le \left\| A \right\| + \left\| B \right\|$, squaring both sides gives
${\left\| {A + B} \right\|^2} \le {\left\| A \right\|^2} + {\left\| B \right\|^2} + 2\left\| A \right\|\left\| B \right\|. $
Since, moreover, $2\left\| A \right\|\left\| B \right\| \le {\left\| A \right\|^2} + {\left\| B \right\|^2}$, we conclude that
${\left\| {A + B} \right\|^2} \le 2({\left\| A \right\|^2} + {\left\| B \right\|^2}). $
Let $\left\{ {{w^k} = {{(x_1^k,x_2^k,{y^k},{\lambda ^k})}^{\rm T}}} \right\}$ be the sequence generated by the L-G-BADMM algorithm. In order to establish a descent property of ${\hat L_\beta }({w^k})$, we construct the function $\hat L({\hat w^k})$, where ${\hat w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k},x_1^{k + 1},x_2^{k + 1},{y^{k + 1}})$:
$\hat L({x_1},{x_2},y,\lambda,{\hat x_1},{\hat x_2},\hat y) = {\hat L_\beta }({x_1},{x_2},y,\lambda ) + {\sigma _1}{\left\| {\hat x_1^{} - x_1^{}} \right\|^2} +{\sigma _2}{\left\| {\hat x_2^{} - x_2^{}} \right\|^2} + {\sigma _3}{\left\| {\hat y - y} \right\|^2}, $
in which
${\sigma _1} = - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu, \qquad {\sigma _2} = - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu, $
$ {\sigma _3} = - \Big( {\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {{\left\| {{B^{\rm T}}} \right\|}^2}{{\left\| B \right\|}^2}\mu } \Big), $
so that $\hat L({\hat w^{k + 1}})$ coincides with the left-hand side of (3.15) below.
We now prove that the sequence $ {\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ is monotonically nonincreasing.
Lemma 3.2 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be the sequence generated by the L-G-BADMM algorithm and suppose that $\{ {u^k} = (x_1^k,x_2^k,$ ${y^k}) \}$ is bounded. Then for any $k \in N$,
(3.7) $\begin{equation} \hat L({\hat w^{k + 1}}) \le \hat L({\hat w^k}) - {\sigma _4}\left( {{{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2}} \right), \end{equation} $
(3.8) $\begin{equation} {\sigma _4} = \min \left\{ \begin{array}{c} \frac{{{\delta _{{\varphi _1}}} - l_g^{}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu,\frac{{{\delta _{{\varphi _2}}} - l_g^{}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu,\\ \frac{{{\delta _\phi } - l_g^{}}}{2} - \frac{{4l_g^2\mu + 8l_\phi ^2\mu }}{\beta } - 4l_\psi ^2\beta{\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu. \end{array} \right\}. \end{equation} $
Proof Since $x_1^{k+1}$ is a minimizer of the ${x_1}$-subproblem in (1.4), comparing its objective value with the value at $x_1^k$ and rearranging gives
(3.9) $\begin{matrix} &&{f_1}(x_1^{k + 1}) + {(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k}, {A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k} - c \rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k},c) \\ & \le & {f_1}(x_1^k) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c\rangle+ \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) \\ && - {\Delta _{{\varphi _1}}}(x_1^{k + 1},x_1^k), \end{matrix}$
By Assumption 1(iv) and Lemma 2.2(iii), ${\Delta _{{\varphi _1}}}(x_1^{k + 1},x_1^k) \ge \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}$, and hence
(3.10) $\begin{matrix} &&{f_1}(x_1^{k + 1}) + {(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k} - c \rangle \\ & &+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k},c) \\ &\le &{f_1}(x_1^k) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c \rangle + \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) \\ &&- \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}. \end{matrix}$
Analogous inequalities hold for the ${x_2}$- and ${y}$-subproblems in (1.4). Adding the three inequalities and rearranging, we obtain
(3.11) $\begin{matrix} &&{f_1}(x_1^{k + 1}) + {f_2}(x_2^{k + 1}) + g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) + \langle{\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c\rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}},c) \\ & \le & {f_1}(x_1^k) + {f_2}(x_2^k) + g(x_1^k,x_2^k,{y^k}) + \langle{\lambda ^k},{A_1}x_1^k + {A_2}x_2^k + B{y^k} - c\rangle \\ &&+ \beta {\Delta _\psi }({A_1}x_1^k + {A_2}x_2^k + B{y^k},c) + [g(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - g(x_1^k,x_2^k,{y^k}) \\ &&- ({(x_1^{k + 1} - x_1^k)^{\rm T}}{\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + {(x_2^{k + 1} - x_2^k)^{\rm T}}{\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) \\ & &+ {({y^{k + 1}} - {y^k})^{\rm T}}{\nabla _y}g(x_1^k,x_2^k,{y^k}))] - \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} - \frac{{{\delta _{{\varphi _2}}}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ & &- \frac{{{\delta _\phi }}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$
Applying Lemma 2.3 to $g$ and invoking the definition of ${\hat L_\beta }$, the inequality above yields
(3.12) $\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) \\ &\le &{{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) + \frac{{{l_g}}}{2}{\left\| {(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) - (x_1^k,x_2^k,{y^k})} \right\|^2} \\ && - \frac{{{\delta _{{\varphi _1}}}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} - \frac{{{\delta _{{\varphi _2}}}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} - \frac{{{\delta _\phi }}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$
Moreover, by the definition of ${\hat L_\beta }$ and the multiplier update in (1.4),
(3.13) $\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) \\ &\le & {{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) + \langle{\lambda ^{k + 1}} - {\lambda ^k},{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c\rangle \\ &=& {{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^k}) + \frac{1}{\beta }{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2}. \end{matrix}$
Combining (3.12) and (3.13), we obtain
(3.14) $\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) \\ &\le & {{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) - \frac{{{\delta _{{\varphi _1}}} - {l_g}}}{2}{\left\| {x_1^{k + 1} - x_1^k} \right\|^2}- \frac{{{\delta _{{\varphi _2}}} - {l_g}}}{2}{\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ &&- \frac{{{\delta _\phi } - {l_g}}}{2}{\left\| {{y^{k + 1}} - {y^k}} \right\|^2} + \frac{1}{\beta }{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|^2}. \end{matrix}$
Dividing (3.6) by $\beta $, substituting it into (3.14), and moving the terms involving the $(k+2)$-th iterates to the left-hand side, we get
(3.15) $\begin{matrix} &&{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} \\ &&- 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2}- (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2} \\ &\le &{{\hat L}_\beta }(x_1^k,x_2^k,{y^k},{\lambda ^k}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ && - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} - (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ & &- (\frac{{{\delta _{{\varphi _1}}} - {l_g}}}{2} - \frac{{4l_g ^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu ){\left\| {x_1^{k + 1} - x_1^k} \right\|^2} \\ & &- (\frac{{{\delta _{{\varphi _2}}} - {l_g}}}{2} - \frac{{4l_g ^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu ){\left\| {x_2^{k + 1} - x_2^k} \right\|^2} \\ & &- (\frac{{{\delta _\phi } - {l_g}}}{2} - \frac{{4l_g^2\mu }}{\beta } - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu - \frac{{8l_\phi ^2\mu }}{\beta }){\left\| {{y^{k + 1}} - {y^k}} \right\|^2}. \end{matrix}$
By the definition of $\hat L$ and of ${\sigma _1},{\sigma _2},{\sigma _3}$, the left-hand side of (3.15) is exactly
(3.16) $\begin{matrix} \hat L({{\hat w}^{k + 1}}) &= &\hat L(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}},\hat x_1^{k + 1},\hat x_2^{k + 1},{{\hat y}^{k + 1}}) \\ & = &{{\hat L}_\beta }(x_1^{k + 1},x_2^{k + 1},{y^{k + 1}},{\lambda ^{k + 1}}) - 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}} \right\|^2}\mu {\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\|^2} \\ & &-4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_2}} \right\|^2}\mu {\left\| {x_2^{k + 2} - x_2^{k + 1}} \right\|^2} \\ && - (\frac{{4l_\phi ^2\mu }}{\beta } + 4l_\psi ^2\beta {\left\| {{B^{\rm T}}} \right\|^2}{\left\| B \right\|^2}\mu ){\left\| {{y^{k + 2}} - {y^{k + 1}}} \right\|^2}, \end{matrix}$
while the first four terms on the right-hand side of (3.15) are exactly $\hat L({\hat w^k})$. Hence (3.7) holds with ${\sigma _4}$ given by (3.8). The proof is complete.
Lemma 3.3 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be the sequence generated by the L-G-BADMM algorithm and suppose that $\big\{ {u^k} = (x_1^k,x_2^k,$ ${y^k}) \big\}$ is bounded. Then
$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{w^{k + 1}} - {w^k}} \right\|}^2}} < + \infty, $
and consequently $\left\| {{w^{k + 1}} - {w^k}} \right\| \to 0$ as $k \to \infty $.
Proof First, we show that the sequence $\left\{ {{{\hat w}^k}} \right\}$ is bounded. From (3.3) we have
(3.17) $\begin{matrix} {\left\| {{B^{\rm T}}{\lambda ^k}} \right\|^2}&= &\left\| { - {\nabla _y}} \right.g(x_1^k,x_2^k,{y^k}) + {\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}}) \\ & &{\left. { + \beta {B^{\rm T}}({\nabla _\psi }(c) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}))} \right\|^2}. \end{matrix}$
(3.18) $\begin{matrix} {\left\| {{\lambda ^k}} \right\|^2}&\le& \mu {\left\| {{B^{\rm T}}{\lambda ^k}} \right\|^2} \\ &\le &3\mu {\left\| {{\nabla _y}g(x_1^k,x_2^k,{y^k})} \right\|^2}+ 3\mu l_\phi ^2{\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ &&+ 3\mu l_\psi ^2{\beta ^2} {\left\| {{B^{\rm T}}} \right\|^2}{\left\| {{A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c} \right\|^2}, \end{matrix}$
Therefore, since $\left\{ {{u^k}} \right\}$ is bounded, $\left\{ {{\lambda ^k}} \right\}$ is bounded, and consequently the sequences $\left\{ {{w^k}} \right\}$ and $\left\{ {{{\hat w}^k}} \right\}$ are bounded.
Next, we show that the sequence ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ converges. Since $\left\{ {{{\hat w}^k}} \right\}$ is bounded, it has a convergent subsequence $\left\{ {{{\hat w}^{{k_j}}}} \right\}$, whose limit we denote by ${\hat w^*}$. By the lower semicontinuity of $\hat L( \cdot )$, $\mathop {\lim }\limits_{j \to \infty } \inf \hat L({\hat w^{{k_j}}}) \ge \hat L({\hat w^*})$, so $\{\hat L({\hat w^{{k_j}}})\}$ is bounded below; since the whole sequence is monotonically nonincreasing by Lemma 3.2, it is bounded below as well, and hence ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ converges. Finally, summing (3.7) over $k$ from 1 to $t$ gives
${\sigma _4}\sum\limits_{k = 1}^t {({{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2})} \le \hat L({\hat w^1}) - \hat L({\hat w^{t + 1}}) < + \infty.$
Since ${\sigma _4} > 0$ by Assumption 1(vi) and $t$ is arbitrary, we conclude that
$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{u^{k + 1}} - {u^k}} \right\|}^2}} < + \infty, $
Combining this with (3.6), we also have
$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|}^2}} < + \infty, $
and therefore
$\sum\limits_{k = 0}^{ + \infty } {{{\left\| {{w^{k + 1}} - {w^k}} \right\|}^2}} < + \infty, $
It then follows that $\left\| {{w^{k + 1}} - {w^k}} \right\| \to 0$ as $k \to \infty $. This proves the lemma.
Lemma 3.4 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be a bounded sequence generated by the L-G-BADMM algorithm. Then there exists $\gamma > 0$ such that for any $k \in N$,
(3.19) $\begin{matrix} d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) &\le &\gamma (\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\| + \left\| {x_2^{k + 2} - x_2^{k + 1}} \right\| + \left\| {{y^{k + 2}} - {y^{k + 1}}} \right\| \\ & &+ \left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|). \end{matrix}$
Proof From the definition of $\hat L\left( {\hat w} \right)$ given above, we have
(3.20) $\begin{equation} \left\{ \begin{array}{ll} {\nabla _{{x_1}}}\hat L({{\hat w}^{k + 1}}) = \nabla {f_1}(x_1^{k + 1}) + {\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) + A_1^{\rm T}{\lambda ^{k + 1}}\\ + \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)) - 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ {\nabla _{{x_2}}}\hat L({{\hat w}^{k + 1}}) = \nabla {f_2}(x_2^{k + 1}) + {\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) + A_2^{\rm T}{\lambda ^{k + 1}}\\ + \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)) - 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ {\nabla _y}\hat L({{\hat w}^{k + 1}}) = {\nabla _y}g(x_1^k,x_2^k,{y^k}) + B_{}^{\rm T}{\lambda ^{k + 1}} - 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}})\\ \ + \beta B_{}^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)),\\ {\nabla _{{{\hat x}_1}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ {\nabla _{{{\hat x}_2}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ {\nabla _{{{\hat y}_{}}}}\hat L({{\hat w}^{k + 1}}) = 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}}),\\ {\nabla _{{\lambda _{}}}}\hat L({{\hat w}^{k + 1}}) = \frac{1}{\beta }({\lambda ^{k + 1}} - {\lambda ^k}). \end{array} \right. \end{equation}$
From the optimality conditions of the subproblems of the L-G-BADMM algorithm, we obtain
(3.21) $\begin{equation} \left\{ \begin{array}{ll} \nabla {f_1}(x_1^{k + 1}) = - {\nabla _{{x_1}}}g(x_1^k,x_2^k,{y^k}) - A_1^{\rm T}{\lambda ^k} + {\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})\\ - \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k}) - {\nabla _\psi }(c)),\\ \nabla {f_2}(x_2^{k + 1}) = - {\nabla _{{x_2}}}g(x_1^k,x_2^k,{y^k}) - A_2^{\rm T}{\lambda ^k} + {\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})\\ - \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k}) - {\nabla _\psi }(c)),\\ {\nabla _y}g(x_1^k,x_2^k,{y^k}) = - B_{}^{\rm T}{\lambda ^k} + {\nabla _\phi }(y_{}^k) - {\nabla _\phi }(y_{}^{k + 1})\\ - \beta B_{}^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }(c)),\\ {\lambda ^{k + 1}} = {\lambda ^k} + \beta ({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}} - c). \end{array} \right. \end{equation}$
Substituting (3.21) into (3.20) and simplifying, we get
(3.22) $\begin{equation} \left\{ \begin{array}{ll} \rho _1^k = A_1^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}) + {\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})\\ + \beta A_1^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k})),\\ \rho _2^k = A_2^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}) + {\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})\\ + \beta A_2^{\rm T}({\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k})),\\ \rho _3^k = B_{}^{\rm T}({\lambda ^{k + 1}} - {\lambda ^k}) - 2{\sigma _3}(y_{}^{k + 2} - y_{}^{k + 1}) + {\nabla _\phi }(y_{}^k) - {\nabla _\phi }(y_{}^{k + 1}),\\ \rho _4^k = 2{\sigma _1}(x_1^{k + 2} - x_1^{k + 1}),\\ \rho _5^k = 2{\sigma _2}(x_2^{k + 2} - x_2^{k + 1}),\\ \rho _6^k = 2{\sigma _3}({y^{k + 2}} - {y^{k + 1}}),\\ \rho _7^k = \frac{1}{\beta }({\lambda ^{k + 1}} - {\lambda ^k}). \end{array} \right. \end{equation}$
Therefore $(\rho _1^k,\rho _2^k,\rho _3^k,\rho _4^k,\rho _5^k,\rho _6^k,\rho _7^k) \in \partial \hat L({\hat w^{k+1}})$, and hence
$d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) \le \left\| {{{(\rho _1^k,\rho _2^k,\rho _3^k,\rho _4^k,\rho _5^k,\rho _6^k,\rho _7^k)}^{\rm T}}} \right\|. $
Since $\left\{ {{u^k}} \right\}$ is bounded, it follows that
(3.23) $\begin{matrix} &&d\left( {0,\partial \hat L\left( {{{\hat w}^{k + 1}}} \right)} \right) \\ &\le&{\gamma _1}(\left\| {x_1^{k + 2} - x_1^{k + 1}} \right\| + \left\| {x_2^{k + 2} - x_2^{k + 1}} \right\| + \left\| {{y^{k + 2}} - {y^{k + 1}}} \right\| + \left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|) \\ & &+\beta \left\| {A_1^{\rm T}} \right\|\left\| {{\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^k + B{y^k})} \right\| \\ &&+ \beta\left\| {A_2^{\rm T}} \right\|\left\| {{\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^{k + 1}}) - {\nabla _\psi }({A_1}x_1^{k + 1} + {A_2}x_2^{k + 1} + B{y^k})} \right\| \\ & &+\left\| {{\nabla _{{\varphi _1}}}(x_1^k) - {\nabla _{{\varphi _1}}}(x_1^{k + 1})} \right\| + \left\| {{\nabla _{{\varphi _2}}}(x_2^k) - {\nabla _{{\varphi _2}}}(x_2^{k + 1})} \right\| + \left\| {{\nabla _\phi }({y^k}) - {\nabla _\phi }({y^{k + 1}})} \right\|, \end{matrix}$
where ${\gamma _1} > 0$. Moreover, since ${\nabla _\psi },{\nabla _{{\varphi _1}}},{\nabla _{{\varphi _2}}},{\nabla _\phi },{\nabla _g}$ are Lipschitz continuous, and by Lemma 3.1, there exists $\gamma > 0$ such that (3.19) holds. The proof is complete.
Theorem 3.1 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be a bounded sequence generated by the L-G-BADMM algorithm and let $S({\hat w^0})$ denote the set of all limit points of the sequence $\left\{ {{{\hat w}^k}} \right\}$. Then
1) $\mathop {\lim }\limits_{k \to \infty } d({\hat w^k},S({\hat w^0})) = 0$;
2) if ${\hat w^*} = (x_1^*,x_2^*,{y^*},{\lambda ^*},\hat x_1^*,\hat x_2^*,{\hat y^*}) \in S({\hat w^0})$, then ${\hat w^*}$ is a stationary point of $\hat L( \cdot )$;
3) $\hat L( \cdot )$ is constant on the set $S({\hat w^0})$ and $\mathop {\inf }\limits_{k \in N} \hat L({\hat w^k}) = \mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k})$.
Proof 1) This follows directly from the definition of $S({\hat w^0})$ as the set of limit points of the bounded sequence $\left\{ {{{\hat w}^k}} \right\}$: $\mathop {\lim }\limits_{k \to \infty } d({\hat w^k},S({\hat w^0})) = 0.$
2) Let ${\hat w^*} \in S({\hat w^0})$. Then there exists a subsequence $\left\{ {{{\hat w}^{{k_j}}}} \right\}$ of $\left\{ {{{\hat w}^k}} \right\}$ converging to ${\hat w^*}$. From the optimality of $x_1^{k + 1}$ in its subproblem we get
(3.24) $\begin{equation} \hat L(x_1^{k + 1},x_2^k,{y^k},{\lambda ^k},x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}) \le \hat L(x_1^*,x_2^k,{y^k},{\lambda ^k},x_1^{k + 1},x_2^{k + 1},{y^{k + 1}}), \end{equation} $
Since $\mathop {\lim }\limits_{j \to \infty } {\hat w^{{k_j}}} = {\hat w^*}$, we obtain
(3.25) $\begin{matrix} & &\mathop {\lim }\limits_{j \to \infty } \sup \hat L(x_1^{{k_j} + 1},x_2^{{k_j}},{y^{{k_j}}},{\lambda ^{{k_j}}},x_1^{{k_j} + 1},x_2^{{k_j} + 1},{y^{{k_j} + 1}}) \\ & \le& \mathop {\lim }\limits_{j \to \infty } \sup \hat L(x_1^{{k_j} + 1},x_2^{{k_j} + 1},{y^{{k_j} + 1}},{\lambda ^{{k_j} + 1}},x_1^{{k_j} + 2},x_2^{{k_j} + 2},{y^{{k_j} + 2}}) \\ &\le &\hat L({{\hat w}^*}). \end{matrix}$
On the other hand, by the lower semicontinuity of $\hat L( \cdot )$,
(3.26) $\begin{equation} \mathop {\lim }\limits_{j \to \infty } \inf \hat L({\hat w^{{k_j} + 1}}) \ge \hat L({\hat w^*}), \end{equation}$
Combining (3.25) and (3.26) yields
(3.27) $\begin{equation} \mathop {\lim }\limits_{j \to \infty } \hat L({\hat w^{{k_j} + 1}}) = \hat L({\hat w^*}). \end{equation}$
Moreover, by the continuity of ${\nabla _g}$ and the closedness of $\partial f$, taking the limit in (3.21) along the subsequence $\left\{ {{{\hat w}^{{k_j} + 1}}} \right\}$ gives
(3.28) $\begin{equation} \left\{ \begin{array}{ll} \nabla {f_1}(x_1^*) = - {\nabla _{{x_1}}}g(x_1^*,x_2^*,{y^*}) - A_1^{\rm T}{\lambda ^*},\\ \nabla {f_2}(x_2^*) = - {\nabla _{{x_2}}}g(x_1^*,x_2^*,{y^*}) - A_2^{\rm T}{\lambda ^*},\\ {\nabla _y}g(x_1^*,x_2^*,{y^*}) = - B_{}^{\rm T}{\lambda ^*},\\ {A_1}x_1^* + {A_2}x_2^* + B{y^*} = c. \end{array} \right. \end{equation}$
Hence ${\hat w^*}$ is a stationary point of $\hat L( \cdot )$.
3) Combining (3.27) with the fact that ${\left\{ {\hat L({{\hat w}^k})} \right\}_{k \in N}}$ is monotonically nonincreasing gives $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k}) = \hat L({\hat w^*})$. Consequently, $\hat L( \cdot )$ is constant on the nonempty compact set $S({\hat w^0})$, and $\mathop {\inf }\limits_{k \in N} \hat L({\hat w^k}) = \mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k})$ holds. The proof is complete.
Theorem 3.2 Let $\left\{ {{w^k} = (x_1^k,x_2^k,{y^k},{\lambda ^k})} \right\}$ be a bounded sequence generated by the L-G-BADMM algorithm. Suppose that Assumption 1 holds and that $\hat L\left( {\hat w} \right)$ satisfies the K-L property. Then
(1) $\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty $;
(2) the sequence $\left\{ {{w^k}} \right\}$ converges to a stationary point of problem (1.3).
Proof By Theorem 3.1, for any ${\hat w^*} \in S({\hat w^0})$ we have $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^{k + 1}}) = \hat L({\hat w^*})$. We prove part (1) of Theorem 3.2 by distinguishing two cases.
1a) Suppose that there exists an integer ${k_0}$ such that $\hat L({\hat w^{{k_0}}}) = \hat L({\hat w^*})$. Then, by Lemma 3.2 and the monotonicity of $\{\hat L({\hat w^k})\}$, for any $k > {k_0}$,
$\begin{eqnarray*} &&{\sigma _4}\left( {{{\left\| {x_1^{k + 1} - x_1^k} \right\|}^2} + {{\left\| {x_2^{k + 1} - x_2^k} \right\|}^2} + {{\left\| {{y^{k + 1}} - {y^k}} \right\|}^2}} \right) \\ & \le& \hat L({{\hat w}^k}) - \hat L({{\hat w}^{k + 1}}) \le \hat L({{\hat w}^{{k_0}}}) - \hat L({{\hat w}^*}) = 0 \end{eqnarray*} $
holds. Hence, for any $k > {k_0}$, $x_1^{k + 1} = x_1^k$, $x_2^{k + 1} = x_2^k$ and ${y^{k + 1}} = {y^k}$; consequently ${\hat w^{k + 1}} = {\hat w^k}$ for any $k > {k_0} + 1$, and therefore $\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty. $
1b) Suppose now that $\hat L({\hat w^k}) > \hat L({\hat w^*})$ for all $k$. Since $\mathop {\lim }\limits_{k \to \infty } d({\hat w^k},S({\hat w^0})) = 0$, for any given $\varepsilon > 0$ there exists ${k_1} > 0$ such that $d({\hat w^k},S({\hat w^0})) < \varepsilon $ for all $k > {k_1}$. Since $\mathop {\lim }\limits_{k \to \infty } \hat L({\hat w^k}) = \hat L({\hat w^*})$, for any given $\eta > 0$ there exists ${k_2} > 0$ such that $\hat L({\hat w^k}) < \hat L({\hat w^*}) + \eta $ for all $k > {k_2}$. Hence, for all $k > \tilde k = \max \left\{ {{k_1},{k_2}} \right\}$, we have $d({\hat w^k},S({\hat w^0})) < \varepsilon$ and $\hat L({\hat w^*}) < \hat L({\hat w^k}) < \hat L({\hat w^*}) + \eta $. Since $S({\hat w^0})$ is nonempty and compact and $\hat L( \cdot )$ is constant on $S({\hat w^0})$, the uniform K-L property (Lemma 2.1) gives $\varphi '(\hat L({\hat w^k}) - \hat L({\hat w^*}))\,d(0,\partial \hat L({\hat w^k})) \ge 1$ for all $k > \tilde k$. Together with Lemma 3.4, this yields
(3.29) $\begin{matrix} \frac{1}{{\varphi '(\hat L({{\hat w}^k}) - \hat L({{\hat w}^*}))}}&\le& d(0,\partial \hat L({{\hat w}^k})) \\ & \le &\gamma (\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ && + \left\| {x_1^k - x_1^{k - 1}} \right\|+ \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|). \end{matrix}$
By the concavity of $\varphi $ we also have
(3.30) $\begin{matrix} &&\varphi '(\hat L({\hat w^k}) - \hat L({\hat w^*}))(\hat L({\hat w^k}) - \hat L({\hat w^{k + 1}})) \\ &\le& \varphi (\hat L({\hat w^k}) - \hat L({\hat w^*})) - \varphi (\hat L({\hat w^{k + 1}}) - \hat L({\hat w^*})). \end{matrix}$
Combining (3.7), (3.29) and (3.30), we obtain
(3.31) $\begin{matrix} {\sigma _4}{\left\| {{u^{k + 1}} - {u^k}} \right\|^2} &\le &\hat L({{\hat w}^k}) - \hat L({{\hat w}^{k + 1}}) \\ &\le& \frac{{\varphi (\hat L({{\hat w}^k}) - \hat L({{\hat w}^*})) - \varphi (\hat L({{\hat w}^{k + 1}}) - \hat L({{\hat w}^*}))}}{{\varphi '(\hat L({{\hat w}^k}) - \hat L({{\hat w}^*}))}} \\ &\le& \gamma (\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ &&+ \left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|){\Delta _{k,k + 1}}. \end{matrix}$
where ${\Delta _{k,k + 1}} = \varphi (\hat L({\hat w^k}) - \hat L({\hat w^*})) - \varphi (\hat L({\hat w^{k + 1}}) - \hat L({\hat w^*}))$. The above inequality is equivalent to
(3.32) $\begin{matrix} &&{\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + {\left\| {{y^{k + 1}} - {y^k}} \right\|^2} \\ & \le& \frac{\gamma }{{{\sigma _4}}}(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| + \left\| {x_1^k - x_1^{k - 1}} \right\| \\ && + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|){\Delta _{k,k + 1}}. \end{matrix}$
Moreover, by the inequality ${\left( {a + b + c} \right)^2} \le 3\left( {{a^2} + {b^2} + {c^2}} \right)$, we have
$\begin{eqnarray*} && 3(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|) \\ & \le& 3\sqrt 3 {({\left\| {x_1^{k + 1} - x_1^k} \right\|^2} + {\left\| {x_2^{k + 1} - x_2^k} \right\|^2} + {\left\| {{y^{k + 1}} - {y^k}} \right\|^2})^{\frac{1}{2}}} \\ &\le& 2(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ &&+\left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\|{)^{\frac{1}{2}}}{(\frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}})^{\frac{1}{2}}} \\ & \le& \left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\| \\ & &+\left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}, \end{eqnarray*}$
where the last two inequalities follow from (3.32) and the elementary inequality $2\sqrt {ab} \le a + b$ $(a,b > 0)$. Taking $k = \tilde k + 1,\cdots,m$ in the above inequality and summing, we obtain
$\begin{eqnarray*} && \sum\limits_{k = \tilde k + 1}^m {3(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|)} \\ & \le& \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|} \\ &&+ \left\| {x_1^k - x_1^{k - 1}} \right\| + \left\| {x_2^k - x_2^{k - 1}} \right\| + \left\| {{y^k} - {y^{k - 1}}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}), \end{eqnarray*}$
Since $\varphi (\hat L({\hat w^{m + 1}}) - \hat L({\hat w^*})) > 0$, rearranging terms yields
(3.33) $\begin{matrix} &&\sum\limits_{k = \tilde k + 1}^m {\left\| {x_1^{k + 1} - x_1^k} \right\| + } \sum\limits_{k = \tilde k + 1}^m {\left\| {x_2^{k + 1} - x_2^k} \right\|} + \sum\limits_{k = \tilde k + 1}^m {\left\| {{y^{k + 1}} - {y^k}} \right\|} \\ & \le& \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_1^k - x_1^{k - 1}} \right\| - \left\| {x_1^{k + 1} - x_1^k} \right\|)} + \sum\limits_{k = \tilde k + 1}^m {(\left\| {x_2^k - x_2^{k - 1}} \right\|} - \left\| {x_2^{k + 1} - x_2^k} \right\|) \\ & & + \sum\limits_{k = \tilde k + 1}^m {(\left\| {{y^k} - {y^{k - 1}}} \right\| - \left\| {{y^{k + 1}} - {y^k}} \right\|)} + \sum\limits_{k = \tilde k + 1}^m {\frac{{27\gamma }}{{4{\sigma _4}}}{\Delta _{k,k + 1}}} \\ &\le& \left\| {x_1^{\tilde k + 1} - x_1^{\tilde k}} \right\| - \left\| {x_1^{m + 1} - x_1^m} \right\| + \left\| {x_2^{\tilde k + 1} - x_2^{\tilde k}} \right\| - \left\| {x_2^{m + 1} - x_2^m} \right\| \\ & & + \left\| {{y^{\tilde k + 1}} - {y^{\tilde k}}} \right\| - \left\| {{y^{m + 1}} - {y^m}} \right\| + \frac{{27\gamma }}{{4{\sigma _4}}}\varphi (\hat L({{\hat w}^{\tilde k + 1}}) - \hat L({{\hat w}^*})) \\ & \le& \left\| {x_1^{\tilde k + 1} - x_1^{\tilde k}} \right\| + \left\| {x_2^{\tilde k + 1} - x_2^{\tilde k}} \right\| + \left\| {{y^{\tilde k + 1}} - {y^{\tilde k}}} \right\|+ \frac{{27\gamma }}{{4{\sigma _4}}}\varphi (\hat L({{\hat w}^{\tilde k + 1}}) - \hat L({{\hat w}^*})). \end{matrix}$
Since the right-hand side of (3.33) does not depend on $m$, letting $m \to \infty $ gives
$\sum\limits_{k = \tilde k + 1}^{ + \infty } {(\left\| {x_1^{k + 1} - x_1^k} \right\| + \left\| {x_2^{k + 1} - x_2^k} \right\| + \left\| {{y^{k + 1}} - {y^k}} \right\|)} < + \infty. $
Combining this with (3.6), we also have $\sum\limits_{k = \tilde k + 1}^{ + \infty } {\left\| {{\lambda ^{k + 1}} - {\lambda ^k}} \right\|} < + \infty $, and hence
$\sum\limits_{k = \tilde k + 1}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty, $
that is,
$\sum\limits_{k = 0}^{ + \infty } {\left\| {{w^{k + 1}} - {w^k}} \right\|} < + \infty. $
(2) By the above argument, $\left\{ {{w^k}} \right\}$ is a Cauchy sequence and therefore converges. Combining this with Theorem 3.1(2), the conclusion follows immediately. This completes the proof of Theorem 3.2.
References
[1] Banerjee A, Merugu S, Dhillon I, Ghosh J. Clustering with Bregman divergences. Journal of Machine Learning Research, 2005, 6: 1705-1749
[2] Bolte J, Daniilidis A, Ley O, et al. Characterizations of Łojasiewicz inequalities and applications: subgradient flows, talweg, convexity. Transactions of the American Mathematical Society, 2010, 362(6): 3319-3363
[3] Boyd S, Parikh N, Chu E, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2010, 3(1): 1-122
[4] Bregman L M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 1967, 7(3): 200-217
[5] Chao M T, Cheng C Z, Zhang H B. A linearized alternating direction method of multipliers with substitution procedure. Asia-Pacific Journal of Operational Research, 2015, 32(3): 1550011
[6] Chao M T, Deng Z, Jian J B. Convergence of linear Bregman ADMM for nonconvex and nonsmooth problems with nonseparable structure. Complexity, 2020, 2020: Art ID 6237942
[7] Chao M T, Zhang Y, Jian J B. An inertial proximal alternating direction method of multipliers for nonconvex optimization. International Journal of Computer Mathematics, 2021, 98(6): 1199-1217
[8] Chen C H, He B S, Ye Y Y, et al. The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Mathematical Programming, 2016, 155: 57-79
[9] Deng W, Yin W T. On the global and linear convergence of the generalized alternating direction method of multipliers. Journal of Scientific Computing, 2016, 66(3): 889-916
[10] Feng J K, Zhang H B, Cheng C Z, et al. Convergence analysis of L-ADMM for multi-block linear-constrained separable convex minimization problem. Journal of the Operations Research Society of China, 2015, 3(4): 563-579
[11] Gabay D, Mercier B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Computers and Mathematics with Applications, 1976, 2(1): 17-40
[12] Glowinski R, Marroco A. Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires. Revue Française d'Automatique, Informatique, Recherche Opérationnelle. Analyse Numérique, 1975, 9(R2): 41-76
[13] Guo K, Han D R, Wang D Z W, et al. Convergence of ADMM for multi-block nonconvex separable optimization models. Frontiers of Mathematics in China, 2017, 12(5): 1139-1162
[14] Guo K, Han D R, Wu T T. Convergence of alternating direction method for minimizing sum of two nonconvex functions with linear constraints. International Journal of Computer Mathematics, 2017, 94(8): 1653-1669
[15] Guo K, Wang X. Convergence of generalized alternating direction method of multipliers for nonseparable nonconvex objective with linear constraints. Journal of Mathematical Research with Applications, 2018, 38(5): 523-540
[16] Hong M Y, Luo Z Q, Razaviyayn M. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM Journal on Optimization, 2016, 26(1): 337-364
[17] Li M, Sun D F, Toh K C. A convergent 3-block semi-proximal ADMM for convex minimization problems with one strongly convex block. Asia-Pacific Journal of Operational Research, 2015, 32(4): 1550024
[18] Rockafellar R T, Wets R J B. Variational Analysis. Berlin: Springer Science and Business Media, 2009
[19] Wang F H, Cao W F, Xu Z B. Convergence of multi-block Bregman ADMM for nonconvex composite problems. Science China Information Sciences, 2018, 61(12): 1-12
[20] Wang F H, Xu Z B, Xu H K. Convergence of Bregman alternating direction method with multipliers for nonconvex composite problems. arXiv:1410.8625
[21] Wang H H, Banerjee A. Bregman alternating direction method of multipliers//Ghahramani Z, Welling M, Cortes C, et al. Advances in Neural Information Processing Systems. 2014: 2816-2824
[22] Xu Z B, Chang X Y, Xu F M, Zhang H. $L_\frac{1}{2}$ regularization: a thresholding representation theory and a fast solver. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23: 1013-1027
[23] Yang W H, Han D. Linear convergence of the alternating direction method of multipliers for a class of convex optimization problems. SIAM Journal on Numerical Analysis, 2016, 54(2): 625-640
[24] Yashtini M. Multi-block nonconvex nonsmooth proximal ADMM: convergence and rates under Kurdyka-Łojasiewicz property. Journal of Optimization Theory and Applications, 2021, 190(3): 966-998
[25] Zeng J S, Fang J, Xu Z B. Sparse SAR imaging based on $L_\frac{1}{2}$ regularization. Science China Information Sciences, 2012, 55: 1755-1775
[26] Zeng J S, Lin S B, Wang Y, et al. $L_\frac{1}{2}$ regularization: convergence of iterative half thresholding algorithm. IEEE Transactions on Signal Processing, 2014, 62(9): 2317-2329
[27] Zeng J S, Xu Z B, Zhang B, et al. Accelerated $L_\frac{1}{2}$ regularization based SAR imaging via BCR and reduced Newton skills. Signal Processing, 2013, 93: 1831-1844