## The Greedy Simplex Algorithm for Double Sparsity Constrained Optimization Problems

Pan Tingwei,, He Suxiang,

Department of Mathematics, School of Science, Wuhan University of Technology, Wuhan 430070

 基金资助: 国家自然科学基金.  11871153

 Fund supported: the NSFC.  11871153

Abstract

In view of the shortcomings of the alternating minimization method in which it needs to calculate the Lipschitz constant of the objective function gradient and the Lipschitz condition is needed to construct the L-stable point of the problem, this paper proposes a greedy simplex algorithm to solve the optimization problems with double sparse constraints. The CW optimality condition for double sparse constrained optimization problems is described. Based on the CW optimality condition, the iterative steps of the algorithm are designed, and the global convergence of the sequence of iterative points generated by the algorithm to the optimal solution of the problem is proved under the weaker assumptions.

Keywords： Dual sparse constrained optimization problems ; CW optimality condition ; Greedy simplex algorithm ; Global convergence

Pan Tingwei, He Suxiang. The Greedy Simplex Algorithm for Double Sparsity Constrained Optimization Problems. Acta Mathematica Scientia[J], 2022, 42(3): 920-933 doi:

## 1 引言

$\begin{eqnarray} &{ } \min\limits_{x\in {{\Bbb R}} ^n } {f(x)}\\ &\mbox{s.t.}\ G(x)\in K, \left \| x \right \| _{0} \le s, \end{eqnarray}$

## 2 CW最优性条件

$\begin{eqnarray} &\min f(x)\\ &\mbox{s.t.} \ x\in C_{s}, \; y\in C_{t}, \end{eqnarray}$

$\forall x\in {{\Bbb R}} ^n , \; y\in {{\Bbb R}} ^m, \; z\in {{\Bbb R}} ^{n+m}$, 定义支撑集如下

$(1) $$\left \| x^{*} \right \| _{0}< s, \; \left \| y^{*} \right \| _{0}< t 时, 对任意 i\in \left \{ 1, \cdots , n+m \right \} , 有 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} ) ; (2)$$ \left \| x^{*} \right \| _{0}= s, \; \left \| y^{*} \right \| _{0}= t$时, 对任意$i\in I_{1} (x^{*} ) $$j\in \left \{ 1, \cdots , n\right \}$$ i\in I_{1} (y^{*} ) $$j\in \left \{ n+1, \cdots , n+m\right \} , 有 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{*} -z_{i}^{*} e_{i}+ae_{j} ) ; (3)$$ \left \| x^{*} \right \| _{0}= s, \; \left \| y^{*} \right \| _{0}< t$时, 对任意$i\in I_{1} (x^{*} ) $$j\in \left \{ 1, \cdots , n\right \} , 有 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{*} -z_{i}^{*} e_{i}+ae_{j} ) , 或对任意 i\in \left \{ n+1, \cdots , n+m \right \} , 有 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} ) ; (4)$$ \left \| x^{*} \right \| _{0}< s, \; \left \| y^{*} \right \| _{0}= t$时, 对任意$i\in I_{1} (y^{*} ) $$j\in \left \{ n+1, \cdots , n+m\right \} , 有 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{*} -z_{i}^{*} e_{i}+ae_{j} ) , 或对任意 i\in \left \{ 1, \cdots , n \right \} , 有 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} ) ; 则称 z^{*} 具有CW最优性, 或称 z^{*} 为问题 (2.1) 的CW最优解. 同时称条件(1)–(4)为问题 (2.1)$$ z^{*}$处的CW最优性条件.

(a) 当$j\in I_{1}(x^{*} ) $$j=i 时, 由于 z^{*} 为问题(2.1)的最优解, 故有 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +(a-z_{i}^{*} )e_{i} ) . (b) 当 j\in I_{1}(x^{*} )$$ j\ne i$时, 由于$\left \| x^{*}-x_{i}^{*}e_{i}+ae_{j} \right \| _{0} =s-1$, 故有$f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(x^{*}-x_{i}^{*}e_{i} +ae_{j} , y^{*} )$.

(c) 当$j\notin I_{1}(x^{*} )$时, 由于$\left \| x^{*}-x_{i}^{*}e_{i}+ae_{j} \right \| _{0} =s$, 故有$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(x^{*}-x_{i}^{*}e_{i} +ae_{j} , y^{*} )$.

$i\in I_{1} (y^{*} ) $$j\in \left \{ n+1, \cdots , n+m\right \} 时, 根据 j 的不同取值可分为如下三种情形进行证明. (a) 当 j\in I_{1}(y^{*} )$$ j=i$时, 由于$z^{*}$为问题(2.1)的最优解, 故有$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +(a-z_{i}^{*} )e_{i} )$.

(b) 当$j\in I_{1}(y^{*} ) $$j\ne i 时, 由于 \left \| y^{*}-y_{i}^{*}e_{i}+ae_{j} \right \| _{0} =t-1 , 故有 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(x^{*}, y^{*}-y_{i}^{*}e_{i} +ae_{j} ) . (c) 当 j\notin I_{1}(y^{*} ) 时, 由于 \left \| y^{*}-y_{i}^{*} e_{i}+ae_{j} \right \| _{0} =t , 故有 f(z^{*} )= \min\limits_{a\in{{\Bbb R}} } f(x^{*}, y^{*}-y_{i}^{*}e_{i} +ae_{j} ) . (ⅲ)若 \left \| x^{*} \right \| _{0}= s, \left \| y^{*} \right \| _{0}< t , 则进行如下分析. i\in I_{1} (x^{*} )$$ j\in \left \{ 1, \cdots , n\right \}$时, 根据$j$的不同取值可分为如下三种情形进行证明.

(a) 当$j\in I_{1}(x^{*} ) $$j=i 时, 由于 z^{*} 为问题(2.1)的最优解, 故有 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +(a-z_{i}^{*} )e_{i} ) . (b) 当 j\in I_{1}(x^{*} )$$ j\ne i$时, 由于$\left \| x^{*}-x_{i}^{*}e_{i}+ae_{j} \right \| _{0} =s-1$, 故有$f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(x^{*}-e_{i} x_{i}^{*} +ae_{j} , y^{*} )$.

(c) 当$j\notin I_{1}(x^{*} )$时, 由于$\left \| x^{*}-e_{i}x_{i}^{*}+ae_{j} \right \| _{0} =s$, 故有$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(x^{*}-z_{i}^{*}e_{i} +ae_{j} , y^{*})$.

$i\in \left \{ n+1, \cdots , n+m\right \}$时, 则令$g(a)=f(z^{*} +ae_{i} )$, 其中$a\in{{\Bbb R}}$. 由于$z^{*}$为问题(2.1)的最优解, 故有$f(z^{*} )=g(0)\le g(a)=f(z^{*} +ae_{i} )$, 反之, 则存在一个非零的常数$a_{0} \in{{\Bbb R}}$, 使得$f(z^{*} )=g(0)> g(a_{0} )=f(z^{*} +a_{0} e_{i} )$, 这与$z^{*}$为问题(2.1)的最优解相矛盾. 故$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} )$.

(ⅳ) 若$\left \| x^{*} \right \| _{0}< s, \left \| y^{*} \right \| _{0}= t$, 则进行如下分析.

$i\in I_{1} (y^{*} ) $$j\in \left \{ n+1, \cdots , n+m\right \} 时, 根据 j 的不同取值可分为如下三种情形进行证明. (a) 当 j\in I_{1}(y^{*} )$$ j=i$时, 由于$z^{*}$为问题(2.1)的最优解, 故有$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +(a-z_{i}^{*} )e_{i} )$.

(b) 当$j\in I_{1}(y^{*} ) $$j\ne i 时, 由于 \left \| y^{*}-y_{i}^{*}e_{i}+ae_{j} \right \| _{0} =t-1 , 故有 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(x^{*}, y^{*}-y_{i}^{*}e_{i} +ae_{j} ) . (c) 当 j\notin I_{1}(y^{*} ) 时, 由于 \left \| y^{*}-y_{i}^{*}e_{i}+ae_{j} \right \| _{0} =t , 故有 f(z^{*} )= \min\limits_{a\in{{\Bbb R}} } f(x^{*}, y^{*}-y_{i}^{*}e_{i} +ae_{j} ) . i\in \left \{ 1, \cdots , n\right \} 时, 则令 g(a)=f(z^{*} +ae_{i} ) , 其中 a\in{{\Bbb R}} . 由于 z^{*} 为问题(2.1)的最优解, 故有 f(z^{*} )=g(0)\le g(a)=f(z^{*} +ae_{i} ) , 反之, 则存在一个非零的常数 a_{0} \in{{\Bbb R}} , 使得 f(z^{*} )=g(0)> g(a_{0})=f(z^{*} +a_{0}e_{i} ) , 这与 z^{*} 为问题(2.1)的最优解相矛盾. 故 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} ) . 注2.2 由定理 2.1 可知: CW最优性条件可作为问题 (2.1) 最优解的一种刻画形式. 由于后文需要借助基本可行性条件研究CW最优解与文献[17]中 L -稳定点之间的关系, 下面给出问题(2.1)的基本可行性条件的定义. 定义2.2 若 f(z)$$ z$处可微且问题$(2.1)$的可行解$z^{*}$满足如下条件之一$:$

$(1) $$\left \| x^{*} \right \| _{0}=s, \; \left \| y^{*} \right \| _{0}= t 时, 对任意 i\in I_{1}(z^{*} ) , 有 \bigtriangledown _{i} f(z)=0 ; (2)$$ \left \| x^{*} \right \| _{0}< s, \; \left \| y^{*} \right \| _{0}< t$时, 有$\bigtriangledown f(z)=0$;

$(3) $$\left \| x^{*} \right \| _{0}< s, \; \left \| y^{*} \right \| _{0}= t 时, 对任意 i\in I_{1}(y^{*} ) \cup \left \{ 1, \cdots , n \right \} , 有 \bigtriangledown _{i} f(z)=0 ; (4)$$ \left \| x^{*} \right \| _{0}= s, \; \left \| y^{*} \right \| _{0}< t$时, 对任意$i\in I_{1}(x^{*} ) \cup \left \{n+ 1, \cdots , n +m\right \}$, 有$\bigtriangledown _{i} f(z)=0$; 则称$z^{*}$具有基本可行性, 或称$z^{*}$为问题$(2.1)$的基本可行性向量. 同时称条件(1)–(4)为问题$(2.1) $$z^{*} 处的基本可行性条件. 下面研究基本可行性向量与最优解的关系. 定理2.2 若 z^{*} 为问题 (2.1) 的最优解且 f(z^{*})$$ z^{*}$处可微, 则$z^{*}$满足基本可行性条件.

若$z^{*}$为问题(2.1)的最优解, 则令$g(a)=f(z^{*} +ae_{i} )$, 其中$i\in \left \{ 1, \cdots , n+m\right \}$, $a\in{{\Bbb R}}$. 显然$0\in \arg\min g(a)$, 否则一定存在一个非零常数$a_{0}$, 使得

根据定义$2.1$和定义$2.2$对CW最优性条件和基本可行性条件的表述, 这里对$z^{*}$分为四种情形进行证明.

(ⅰ)若$\left \| x^{*} \right \| _{0}= s, \left \| y^{*} \right \| _{0}= t$, 则当$i\in I_{1}(z^{*} ) $$j=i 时, 可得 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{*} -z_{i}^{*} e_{i}+ae_{j} ) . 由于 f(z^{*} )= f(z^{*} -z_{i}^{*} e_{i}+z_{i}^{*}e_{i} ) , 上述不等式等价为 $$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} -z_{i}^{*} e_{i}+ae_{i} ).$$ u=a-z_{i}^{*} , 则(2.3)式可写为 f(z^{*} )=\min\limits_{u\in{{\Bbb R}} } f(z^{*} +ue_{i} ) , 故有 \bigtriangledown _{i} f(z^{*} )=0. (ⅱ)若 \left \| x^{*} \right \| _{0}< s, \left \| y^{*} \right \| _{0}< t , 则对任意 i\in \left \{ 1, \cdots , n+m\right \}$$ f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{*} +ae_{i} )$, 故有$\bigtriangledown _{i} f(z^{*} )=0$.

(ⅲ)若$\left \| x^{*} \right \| _{0}< s, \left \| y^{*} \right \| _{0}= t$, 则进行如下分析.

$i\in I_{1}(y^{*} ) $$j=i 时, 由情形(ⅰ)可得 \bigtriangledown _{i} f(z^{*} )=0 . i\in \left \{ 1, \cdots , n \right \} 时, 由情形(ⅱ)可得 \bigtriangledown _{i} f(z^{*} )=0 . 因此, 当 i\in I_{1}(y^{*} ) \cup \left \{ 1, \cdots , n \right \} 时, 有 \bigtriangledown _{i} f(z^{*} ) =0. (ⅳ)若 \left \| x^{*} \right \| _{0}= s, \left \| y^{*} \right \| _{0}< t , 则进行如下分析. i\in I_{1}(x^{*} )$$ j=i$时, 由情形(ⅰ)可得$\bigtriangledown _{i} f(z^{*} )=0$.

$i\in \left \{n+ 1, \cdots , n +m\right \}$时, 由情形(ⅱ)可得$\bigtriangledown _{i} f(z^{*} )=0$. 因此, 当$i\in I_{1}(x^{*} ) \cup \left \{n+ 1, \cdots , n +m\right \}$时, 有$\bigtriangledown _{i} f(z^{*} )=0$.

$$$f(x^{*}+d_{1} , y^{*} )\le f(z^{*} )+\left \langle \bigtriangledown _{x}f(z^{*} ) , d_{1} \right \rangle +\frac{L^{1} }{2}\left \| d_{1} \right \| , \; d_{1}\in {{\Bbb R}} ^n,$$$

$$$f(x^{*} , y^{*}+d_{2} )\le f(z^{*} )+\left \langle \bigtriangledown _{y}f(z^{*} ) , d_{2} \right \rangle +\frac{L^{2} }{2}\left \| d_{2} \right \| , \; d_{2}\in {{\Bbb R}} ^m.$$$

$$$\left | \bigtriangledown _{i}f(z^{*} ) \right | \le LM_{s} (x^{*} );$$$

$$$\left | \bigtriangledown _{i}f(z^{*} ) \right | \le LM_{t} (y^{*} );$$$

$$$\left | \bigtriangledown _{i}f(z^{*} ) \right | =0.$$$

(ⅰ) 若$\left \| x^{*} \right \| _{0}< s, \left \| y^{*} \right \| _{0}< t$, 则由$M_{s} (x^{*} )=M_{t} (y^{*} )=0$与(2.16)式、(2.17)式和(2.18)式可得$\bigtriangledown f(z^{*} )=0$, 进一步可得$\left | x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) \right | =x^{*} $$\left | y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) \right | =y^{*} , 从而可得 P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) )=P_{C_{s} }(x^{*})=\left \{ x^{*} \right \}$$ P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) )=P_{C_{t} }(y^{*})=\left \{ y^{*} \right \}$. 因此, 可得

(ⅱ) 若$\left \| x^{*} \right \| _{0}= s, \left \| y^{*} \right \| _{0}= t$, 则由$M_{s} (x^{*} )\ne 0 $$I_{1} (x^{*} )=s 、(2.16)式和(2.18)式可得对任意 i\in I_{0} (x^{*} )$$ \left | x^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | \le M_{s} (x^{*})$, 或对任意$i\in I_{1} (x^{*} ) $$\left | x^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | =\left |x_{i}^{*} \right | . M_{t}(y^{*} ) \ne 0$$ I_{1} (y^{*} )=t$、(2.17)式和(2.18) 式可得对任意$i\in I_{0} (y^{*} ) $$\left | y^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | \le M_{t} (y^{*}) , 或对任意 i\in I_{1} (y^{*} )$$ \left | y^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | =\left |y_{i}^{*} \right |$. 由于$P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) )$中的向量由$x^{*}$中最大绝对值分量与其它都小于或等于的$M_{s}(x^{*} )$分量组成, 因此, $x^{*} \in P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) )$进而$0\in \left \{ x^{*} - P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) ) \right \}$. 同理可得$0\in \left \{ y^{*} - P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) ) \right \}$. 因此, 可得

(ⅲ) 若$\left \| x^{*} \right \| _{0}< s, \left \| y^{*} \right \| _{0}= t$, 则进行如下分析.

$i\in \left \{ 1, \cdots , n \right \}$时, 由$M_{s} (x^{*} )=0$, (2.16) 式和(2.18)式可得$\bigtriangledown _{i} f(z^{*} )=0$, 进一步可得$\left | x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) \right |=x^{*}$, 从而可得$P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) )=P_{C_{s} }(x^{*})=\left \{ x^{*} \right \}$, 故$0\in \left \{ x^{*} - P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) ) \right \}$.

$\left \| y^{*} \right \| _{0} =t$, 从而可得$M_{t}(y^{*} ) \ne 0 $$I_{1} (y^{*} )=t . 由(2.17)式和(2.18)式可得对任意 i\in I_{0} (y^{*} )$$ \left | y^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | \le M_{t} (y^{*})$, 或对任意$i\in I_{1} (y^{*} ) $$\left | y^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | =\left |y_{i}^{*} \right | . 由于 P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) ) 中的向量由 y^{*} 中最大绝对值分量与其它都小于或等于 M_{t}(y^{*} ) 的分量组成, 所以 0\in \left \{ y^{*} - P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) ) \right \} . 因此 (ⅳ) 若 \left \| x^{*} \right \| _{0}= s, \left \| y^{*} \right \| _{0}< t , 则进行如下分析. i\in \left \{ n+1, \cdots , n+m \right \} 时, 由 M_{s} (y^{*} ) , (2.17)式及(2.18)式可得 \bigtriangledown _{i} f(z^{*} ) , 进一步可得 \left | y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) \right | =y^{*} , 从而可得 P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) )=P_{C_{t} }(y^{*})=\left \{ y^{*} \right \} . 0\in \left \{ y^{*} - P_{C_{t} } (y^{*}-\frac{1}{L} \bigtriangledown _{y}f(z^{*} ) ) \right \} . \left \| x^{*} \right \| _{0} =s , 可得 M_{s}(x^{*} ) \ne 0$$ I_{1} (x^{*} )=s$. 由(2.16)式和(2.18)式可得对任意$i\in I_{0} (x^{*} ) $$\left | x^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | \le M_{s} (x^{*}) , 或对任意 i\in I_{1} (x^{*} )$$ \left | x^{*}-\frac{1}{L} \bigtriangledown _{i}f(z^{*} ) \right | =\left |x_{i}^{*} \right |$. 由于$P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) )$中向量由$x^{*}$中最大绝对值分量与其它都小于或等于$M_{s}(x^{*} )$的分量组成, 所以$0\in \left \{ x^{*} - P_{C_{s} } (x^{*}-\frac{1}{L} \bigtriangledown _{x}f(z^{*} ) ) \right \}$. 因此, 可得

## 3 贪婪单纯形算法及其收敛性

(a) 若$f(z^{l+\frac{1}{2} } )<f(z^{l} )$, 则$z^{l+1}=z^{l+\frac{1}{2} } =z^{l}-z_{i_{l} }^{l} e _{i_{l} } +a_{i_{l} , j_{l} } e_{j_{l} }$, 其中$i_{l}\in I_{1} (x^{l} )$, $j_{l}\in \left \{ 1, \cdots , n \right \}$.$a_{i_{l} , j_{l} }=0$, 则$\left \| x^{l+1} \right \| _{0} <s, \; \left \| y^{l+1} \right \| =t$, 反之, 则$\left \| x^{l+1} \right \| _{0} \le s, \; \left \| y^{l+1} \right \| =t$, 即$z^{l+1}\in C_{s}\times C_{t}$.

(b) 若$f(z^{l+\frac{1}{2} } )\ge f(z^{l} )$, 则停止迭代, 且$z^{l}=z^{l+1}$.

(ⅲ) 若$\left \| x ^{l}\right \| _{0}= s, \; \left \| y ^{l}\right \| _{0}< t$, 则进行如下分析.

$i_{l}\in I_{1} (x^{l} ) $$j_{l}\in \left \{ 1, \cdots , n \right \} 时, 若 f_{i_{l} , j_{l} } <f(z^{l+\frac{1}{2} } ) , 则 z^{l+1}=z^{l}-z_{i_{l} }^{l} e _{i_{l} } +a_{i_{l} , j_{l} } e_{j_{l} } . a_{i_{l} , j_{l} }=0 , 则 \left \| x^{l+1} \right \|<s, \; \left \| y^{l+1} \right \|<t , 即 z^{l+1}\in C_{s}\times C_{t} . a_{i_{l} , j_{l} }\ne 0 , 则 \left \| x^{l+1} \right \|\le s,$$ \left \| y^{l+1} \right \|<t$, 即$z^{l+1}\in C_{s}\times C_{t}$.$f_{i_{l} , j_{l} }\ge f(z^{l+\frac{1}{2} } )$, 则根据$f(z^{l+\frac{1}{2} } )<f(z^{l} )$能否成立, 可分为如下两种情形进行证明.

(a) 若$f(z^{l+\frac{1}{2} } )<f(z^{l} )$, 则$z^{l+1}=z^{l+\frac{1}{2} } =z^{l} +a_{i_{l} } e_{i_{l} }$, 其中$i_{l}\in \left \{ n+1, \cdots , n+m \right \}$.$a_{i_{l} } =0$, 则$f(z^{l+\frac{1}{2} } )=f(z^{l} )$, 这与$f(z^{l+\frac{1}{2} } )<f(z^{l} )$相矛盾, 因此, $a_{i_{l} } \ne 0$, 则$\left \| x^{l+1} \right \|=s, $$\left \| y^{l+1} \right \|\le t , 即 z^{l+1}\in C_{s}\times C_{t} . (b) 若 f(z^{l+\frac{1}{2} } )\ge f(z^{l} ) , 则停止迭代, 且 z^{l}=z^{l+1} . (ⅳ) 若 \left \| x ^{l}\right \| _{0}< s, \; \left \| y ^{l}\right \| _{0}= t , 则进行如下分析. i_{l}\in I_{1} (y^{l} )$$ j_{l}\in \left \{ n+1, \cdots , n+m \right \}$时, 若$f_{i_{l} , j_{l} } <f(z^{l+\frac{1}{2} } )$, 则$z^{l+1}=z^{l}-z_{i_{l} }^{l} e _{i_{l} } +a_{i_{l} , j_{l} } e_{j_{l} }$.$a_{i_{l} , j_{l} }=0$, 则$\left \| x^{l+1} \right \|<s, \left \| y^{l+1} \right \|<t$, 即$z^{l+1}\in C_{s}\times C_{t}$.$a_{i_{l} , j_{l} }\ne 0$, 则$\left \| x^{l+1} \right \|<s, $$\left \| y^{l+1} \right \|\le t , 即 z^{l+1}\in C_{s}\times C_{t} . f_{i_{l} , j_{l} }\ge f(z^{l+\frac{1}{2} } ) , 则根据 f(z^{l+\frac{1}{2} } )<f(z^{l} ) 能否成立, 可分为如下两种情形进行证明. (a) 若 f(z^{l+\frac{1}{2} } )<f(z^{l} ) , 则 z^{l+1}=z^{l+\frac{1}{2} } =z^{l} +a_{i_{l} } e_{i_{l} } , 其中 i_{l}\in \left \{ 1, \cdots , n\right \} . a_{i_{l} } =0 , 则 f(z^{l+\frac{1}{2} } )=f(z^{l} ) , 这与 f(z^{l+\frac{1}{2} } )<f(z^{l} ) 相矛盾, 因此, a_{i_{l} } \ne 0,$$ \left \| x^{l+1} \right \|\le s, $$\left \| y^{l+1} \right \|\ =t , 即 z^{l+1}\in C_{s}\times C_{t} . (b) 若 f(z^{l+\frac{1}{2} } )\ge f(z^{l} ) , 则停止迭代, 且 z^{l}=z^{l+1} . 注3.1 定理 3.1 说明了由算法 3.1 迭代产生的点列是可行的, 这是收敛性分析中所必需的前提条件. 下面根据算法3.1的设计思路分为有限步迭代终止与无限步迭代终止两种情况探讨其收敛性. 定理3.2 若 \left \{ z^{k} \right \} 是由算法 3.1 生成的迭代点列, 则 f(z^{k+1} )\le f(z^{k} ) . 其中 f(z^{k+1} )= f(z^{k} ) 当且仅当 z^{k+1}= z^{k} , 并且 z^{k} 满足CW最优性条件. 由定理 3.1 可知: 由算法 3.1 生成的迭代点 z^{k}\in C_{s}\times C_{t} . 由算法 3.1 的迭代定义可得函数列是单调非增的, 即 f(z^{k+1} )\le f(z^{k} ) . 充分性显然, 下面证明必要性. 若 f(z^{k+1} )= f(z^{k} ) , 则算法迭代停止且 z^{k+1}= z^{k} . 根据定义2.1对CW最优性条件的表述, 这里分如下四种情形进行证明. (ⅰ) 若 \left \| x ^{k} \right \| _{0}< s, \; \left \| y^{k} \right \| _{0}< t , 则对任意 i\in \left \{ 1, \cdots , n+m \right \} , 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )\le f(z^{k} ) . 又由于 f(z^{k+1} )= f(z^{k} ) , 故有 \min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )=f(z^{k} ) . (ⅱ) 若 \left \| x ^{k} \right \| _{0}= s, \; \left \| y^{k} \right \| _{0}= t , 则对任意, 或对任意 i\in I(x^{k} ), j\in \left \{ 1, \cdot \cdot \cdot , n \right \} , 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k}-z_{i}^{k} e_{i} +ae_{j} )\le f(z^{k} ) . 又由 f(z^{k+1} )= f(z^{k} ) 可得 \min\limits_{a\in{{\Bbb R}} } f(z^{k} -z_{i}^{k} e_{j} +ae_{i} )=f(z^{k} ) . (ⅲ) 若 \left \| x ^{k} \right \| _{0}= s, \; \left \| y^{k} \right \| _{0}< t , 则进行如下分析. 对任意 i\in I_{1} (x^{k} ), j\in \left \{ 1, \cdots , n \right \} , 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k}-z_{i}^{k} e_{i} +ae_{j} )\le f(z^{k} ) . 对任意 i\in \left \{ n+1, \cdots , n+m \right \} , 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )\le f(z^{k} ) . 又由于 f(z^{k+1} )= f(z^{k} ) , 故有 \min\limits_{a\in{{\Bbb R}} } f(z^{k} -z_{i}^{k} e_{j} +ae_{i} )=f(z^{k} ) ( i\in I(x^{k} ), j\in \left \{ 1, \cdots , n \right \} ) \min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )=f(z^{k} ) ( i\in \left \{ n+1, \cdots , n+m \right \} ). (ⅳ)若 \left \| x ^{k} \right \| _{0}< s, \; \left \| y^{k} \right \| _{0}=t , 则进行如下分析. 对任意 i\in I(y^{k} ), j\in \left \{ n+1, \cdots , n+m \right \} , 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k}-z_{i}^{k} e_{i} +ae_{j} )\le f(z^{k} ) . 对任意 i\in \left \{ 1, \cdots , n \right \} 时, 由算法迭代定义可得 f(z^{k+1} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )\le f(z^{k} ) . 又由于 f(z^{k+1} )= f(z^{k} ) , 故有 \min\limits_{a\in{{\Bbb R}} } f(z^{k} -z_{i}^{k} e_{j} +ae_{i} )=f(z^{k} ) ( i\in I(y^{k} ), j\in \{ n+1, \cdots ,$$ n+m \}$)$\min\limits_{a\in{{\Bbb R}} } f(z^{k} +ae_{i} )=f(z^{k} )$ ($i\in \left \{ 1, \cdots , n \right \}$).

由定理$3.1$可知: 由算法3.1生成的迭代点$z^{k} \in C_{s} \times C_{t}$. 由定理$3.2$可知: $\left \{ f(z^{k}) \right \}$为非增的函数列, 且$f(z)$有下界, 因此$\left \{ f(z^{k}) \right \}$收敛. 令$z^{*} $$\left \{ z^{k} \right \} 中的一个聚点, 则存在 \left \{ z^{k} \right \} 的子列 \left \{ z^{p_{n} } \right \} 收敛到 z^{*} . 根据定义 2.1 对CW最优性条件的表述, 这里对 z^{*} 分四种情形进行证明. 然而, 由于 \left \| x^{*} \right \| _{0}< s, \left \| y ^{*} \right \| _{0}= t$$ \left \| x^{*} \right \| _{0}= s, \left \| y ^{*} \right \| _{0}< t$证明过程相似, 所以下面只需分三种情况证明聚点$z^{*}$满足CW最优性条件.

(ⅰ)若$\left \| x^{*} \right \| _{0}= s, \; \left \| y ^{*} \right \| _{0}= t$, 则进行如下分析. 由于$\left \{ z^{p_{n} } \right \}$收敛到$z^{*}$, 故存在$\left \{ z^{p_{n} } \right \}$的子序列$\left \{ z^{k_{n} } \right \}$, $z^{k_{n}} \rightarrow z^{*} $$I_{1} (z^{k_{n} } )=I_{1} (z^{*} ) . i\in I_{1} (x^{*} )$$ j\in \left \{ 1, \cdots , n \right \} $$i\in I_{1} (y^{*} )$$ j\in \left \{ n+1, \cdots , n+m \right \}$时, 由算法3.1迭代可得

$$$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{i}^{*} e_{i} +ae _{j} ),$$$

(ⅱ)若$\left \| x^{*} \right \| _{0}< s, \; \left \| y ^{*} \right \| _{0}< t$, 则进行如下分析.

$$$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} } +ae _{i} ),$$$

$i\in I_{0} (z^{*} )$, 则根据$z^{k_{n} }$的四种取值情形进行证明, 但是由于$\left \| x^{k_{n} } \right \| _{0}< s, \; \left \| y ^{k_{n} } \right \| _{0}= t $$\left \| x^{k_{n} } \right \| _{0}= s, \; \left \| y ^{k_{n} } \right \| _{0}< t 证明方式相似, 因此只对 \left \| x^{k_{n} } \right \| _{0}= s, \; \left \| y ^{k_{n} } \right \| _{0}< t 进行证明, 即对如下三种情形证明. (ⅱ-1)若 \left \| x^{k_{n} } \right \| _{0}< s, \; \left \| y ^{k_{n} } \right \| _{0}< t , 则当 i\in I_{0} (z^{*} ) 时, 由算法3.1的迭代可得 $$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} } +ae _{i} ),$$ 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 由(3.3)式可得 f(z^{*} )\le f(z^{* } +ae _{i} ) , 即 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . (ⅱ-2)若 \left \| x^{k_{n} } \right \| _{0}= s, \; \left \| y ^{k_{n} } \right \| _{0}< t , 则根据的不同取值又分为如下两种情形进行证明. (a) 当 i\in I_{0} (x^{*} ) 时, 由于 I_{1} (x^{k_{n} } )\setminus I_{1} (x^{*} ) 非空, 故令 j_{n} \in I_{1} (x^{k_{n} } )\setminus I_{1} (x^{*} ) , 且由算法3.1迭代定义可得 $$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{j_{n} }^{*} e_{j_{n} } +ae _{i} ),$$ 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 有 z_{j_{n} }^{*} \rightarrow 0 , 且由(3.4)式可得 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . 由于对任意的 a\in{{\Bbb R}} 不等式恒成立, 所以 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . (b) 当 i\in I_{0} (y^{*} ) 时, 由算法3.1迭代定义可以得到 f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} } +ae _{i} ) , 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 有 f(z^{*} )\le f(z^{* } +ae _{i} ) , 由于对任意 a\in{{\Bbb R}} 不等式恒成立, 所以 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . (ⅱ-3)若 \left \| x^{k_{n} } \right \| _{0}= s, \; \left \| y ^{k_{n} } \right \| _{0}= t , 则进行如下证明. i\in I_{0} (x^{*} ) 时, 由于 I_{1} (x^{k_{n} } )\setminus I_{1} (x^{*} ) 非空, 故令 j_{n} \in I_{1} (x^{k_{n} } )\setminus I_{1} (x^{*} ) , 且由算法3.1迭代定义可得 f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{j_{n} }^{*} e_{j_{n} } +ae _{i} ) , 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 有 z_{j_{n} }^{*} \rightarrow 0 , 故 f(z^{*} )\le f(z^{* } +ae _{i} ) , 由于对任意 a\in{{\Bbb R}} 不等式一直成立, 所以 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . i\in I_{0} (y^{*} ) 时, 由于 I_{1} (y^{k_{n} } )\setminus I_{1} (y^{*} ) 非空, 故令 j_{n} \in I_{1} (y^{k_{n} } )\setminus I_{1} (y^{*} ) , 且由算法3.1迭代定义可得 f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{j_{n} }^{*} e_{j_{n} } +ae _{i} ) , 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 有 z_{j_{n} }^{*} \rightarrow 0 , 故 f(z^{*} )\le f(z^{* } +ae _{i} ) , 由于对任意 a\in{{\Bbb R}} 不等式一直成立, 所以 f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} ) . 综上所述, 情形(ⅱ)中的 z^{* } 满足CW最优性条件中的条件(1). (ⅲ)若 \left \| x^{*} \right \| _{0}= s, \; \left \| y ^{*} \right \| _{0}< t , 则根据 i 的不同取值分为如下三种情形进行证明. (ⅲ-1)当 i\in I_{1} (x^{*} )$$ j=\left \{ 1, \cdots , n \right \}$时, 由于$\left \{ z^{p_{n} } \right \}$收敛到$z^{*}$, 故存在$\left \{ z^{p_{n} } \right \}$的子序列$\left \{ z^{k_{n} } \right \}$, $z^{k} \rightarrow z^{*} $$I_{1} (z^{k_{n} } )=I_{1} (z^{*} ) . 由算法3.1迭代定义可得 f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{i}^{*} e_{i} +ae _{j} ) , 其中 a\in{{\Bbb R}} . n\rightarrow \infty 时, 有 f(z^{*} )\le f(z^{* }-z_{i}^{*} e_{i} +ae _{j} ) , 即 f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{* }-z_{i}^{*} e_{i} +ae _{j} ) . (ⅲ-2)当 i\in I_{1} (y^{*} ) 时, 由于 \left \{ z^{p_{n} } \right \} 收敛到 z^{*} , 故存在子序列 \left \{ z^{k_{n} } \right \}$$ \left \{ z^{k_{n} } \right \}$, $z^{k} \rightarrow z^{*}$. 存在正整数$N$, 当$n>N$时, 可得$I_{1} (y^{*} )\subseteq I_{1} (y^{k_{n} } )$.$i\in I_{1} (y^{k_{n} } )$时, 由算法3.1迭代定义可得$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} } +ae _{i} )$, 其中$a\in{{\Bbb R}}$.

$n\rightarrow \infty$时, 则有$f(z^{*} )\le f(z^{* } +ae _{i} )$, 由于对任意的$a\in{{\Bbb R}}$不等式恒成立, 所以$f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} )$.

(ⅲ-3)当$i\in I_{0} (y^{*} )$时, 根据$y^{k_{n} }$的取值分为如下两种情形进行证明.

(a) 当$\left \| y^{k_{n} } \right \| _{0} <t$时, 由算法3.1的迭代定义可得$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} } +ae _{i} )$, 其中$a\in{{\Bbb R}}$.

$n\rightarrow \infty$时, $f(z^{*} )\le f(z^{* } +ae _{i} )$, 即$f(z^{*} )\le \min\limits_{a\in{{\Bbb R}} } f(z^{* }-z_{i}^{*} e_{i} +ae _{j} )$.

(b) 当$\left \| y^{k_{n} } \right \| _{0} =t$时, 由于$I_{1} (y^{k_{n} } )\setminus I_{1} (y^{*} )$非空, 故令$j_{n} \in I_{1} (y^{k_{n} } )\setminus I_{1} (y^{*} )$, 且由算法3.1迭代定义可得$f(z^{k_{n} } )-f(z^{k_{n+1} } )\ge f(z^{k_{n} } )-f(z^{k_{n} }-z_{j_{n} }^{*} e_{j_{n} } +ae _{i} )$, 其中$a\in{{\Bbb R}}$.

$n\rightarrow \infty$时, $z_{j_{n} }^{*} \rightarrow 0$, 故$f(z^{*} )\le f(z^{* } +ae _{i} )$, 由于对任意$a\in{{\Bbb R}}$不等式一直成立, 所以$f(z^{*} )=\min\limits_{a\in{{\Bbb R}} } f(z^{* } +ae _{i} )$.

## 参考文献 原文顺序 文献年度倒序 文中引用次数倒序 被引期刊影响因子

Elad M , Figueiredo M A T , Ma Y .

On the role of sparse, redundant representations in image processing

Proceedings of the IEEE, 2010, 98 (6): 972- 982

Fletcher R , Matthews S P J .

Stable modification of explicit LU factors for simplex updates

Mathematical Programming, 1984, 30 (3): 267- 284

Wang R , Xiu N , Zhang C .

Greedy projected gradient-newton method for sparse logistic regression

IEEE Transactions on Neural Networks and Learning Systems, 2019, 31 (2): 527- 538

Goodfellow I , Bengio Y , Courville A .

Machine learning basics

Deep Learning, 2016, 1, 98- 164

Jordan M I , Mitchell T M .

Machine learning: trends, perspectives, and prospects

Science, 2015, 349 (6245): 255- 260

Wright J , Ma Y , Mairal J , et al.

Sparse representation for computer vision and pattern recognition

Proceedings of the IEEE, 2010, 98 (6): 1031- 1044

Bienstock D .

Computational study of a family of mixed-integer quadratic programming problems

Mathematical Programming, 1996, 74 (2): 121- 140

Misra J , Schmitt W , Hwang D , et al.

Interactive exploration of microarray gene expression patterns in a reduced dimensional space

Genome Research, 2002, 12 (7): 1112- 1120

Zou H , Hastie T , Tibshirani R .

Sparse principal component analysis

Journal of Computational and Graphical Statistics, 2006, 15 (2): 265- 286

Bertsimas D , Shioda R .

Computational Optimization and Applications, 2009, 43 (1): 1- 22

Nikolova M .

Description of the minimizers of least squares regularized with l0-norm uniqueness of the global minimizer

SIAM Journal on Imaging Sciences, 2013, 6 (2): 904- 937

Pan L L , Xiu N H , Fan J .

Optimality conditions for sparse nonlinear programming

Science China Mathematics, 2017, 60 (5): 759- 776

Beck A , Eldar Y C .

Sparsity constrained nonlinear optimization: optimality conditions and algorithms

SIAM Journal on Optimization, 2013, 23 (3): 1480- 1509

Beck A , Hallak N .

On the minimization over sparse symmetric sets: projections, optimality conditions and algorithms

Mathematics of Operations Research, 2016, 41 (1): 196- 223

Beck A , Vaisbourd Y .

The sparse principal component analysis problem: optimality conditions and algorithms

Journal of Optimization Theory and Applications, 2016, 170 (1): 119- 143

Beck A , Hallak N .

Optimization problems involving group sparsity terms

Mathematical Programming, 2019, 178 (1): 39- 67

Gao H , Li Y , Zhang H .

The analysis of alternating minimization method for double sparsity constrained optimization problem

Asia-Pacific Journal of Operational Research, 2020, 37 (4): 2- 13

/

 〈 〉