## A Golden Ratio Primal-Dual Algorithm for a Class of Nonsmooth Saddle Point Problems

Nie Jialin,, Long Xianjun,*

School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing 400067

 基金资助: 国家自然科学基金(11471059)重庆市自然科学基金(cstc2021jcyj-msxmX0721)重庆市研究生导师团队建设项目(yds223010)重庆市研究生创新型科研项目(CYS240567)重庆工商大学团队项目(ZDPTTD201908)

 Fund supported: NSFC(11471059)NSF of Chongqing(cstc2021jcyj-msxmX0721)Team Building Project for Graduate Tutors in Chongqing(yds223010)Innovation Project for Graduate Students of Chongqing(CYS240567)Project of Chongqing Technology and Business University(ZDPTTD201908)

Abstract

In this paper, we present a new golden ratio primal-dual algorithm to solve the nonsmooth saddle point problems, which is full-splitting.Under some appropriate conditions, we prove the sequence generated by the algorithm iteration converges to the solution of the problem, as well as an $O(1/N)$ ergodic convergence rate result. Finally, with comparisons to Zhu, Liu and Tran-Ding's algorithms, we give some numerical experiments to show the less iterate numbers and CPU time of the proposed method.

Keywords： Saddle point problems; Golden ratio; Primal-dual algorithm; Iterate convergence; Ergodic convergence

Nie Jialin, Long Xianjun. A Golden Ratio Primal-Dual Algorithm for a Class of Nonsmooth Saddle Point Problems[J]. Acta Mathematica Scientia, 2024, 44(4): 1080-1091

$\mathbb{R}^{p} $$\mathbb{R}^{q} 是有限维欧式空间, 并且每个空间上的內积和诱导范数分别表示为 \langle \cdot,\cdot \rangle$$ \|\cdot\|=\sqrt{\langle \cdot,\cdot \rangle}.$ 假设 $h:\mathbb{R}^{q}\rightarrow \mathbb{R}$ 是凸函数, f:\mathbb{R}^{p}\rightarrow \mathbb{R}\cup\{+\infty\} g:\mathbb{R}^{q}\rightarrow \mathbb{R}\cup\{+\infty\} 是扩展实值下半连续凸函数, \Phi :\mathbb{R}^{q} \times \mathbb{R}^{p} \rightarrow \mathbb{R} 是连续可微函数, K:\mathbb{R}^{q} \rightarrow \mathbb{R}^{p} 是有界线性算子. 函数 f 的共轭函数定义为 f^*(y)=\sup\limits_{x\in\mathbb{R}^{p}}\{\langle{y,x}\rangle-f(x)\} . 考虑如下带有双线性耦合项的鞍点问题 $$\min_{x\in \mathbb{R}^{q}} \max_{y\in \mathbb{R}^{p}}\{ g(x) + \langle Kx,y \rangle - f^{*}(y)\},$$ 问题 (1.1) 在许多领域有着广泛的应用, 比如信号与图像处理、压缩感知、机器学习、博弈论等, 许多学者研究了问题 (1.1), 详情见文献 [1-3]. 求解问题 (1.1) 的相关算法已经有很多, 其中最经典的是 Arrow-Hurwicz 方法[4], 即原始对偶算法. 从初始点 (x_{0},y_{0})\in \mathbb{R}^{q}\times \mathbb{R}^{p} 开始, 具体迭代为 \begin{equation*} \begin{aligned} \left\{ \begin{array}{ll} x_{n}=\mbox{Prox}_{\tau g}(x_{n-1}-\tau K^{\top}y_{n-1}),\\ y_{n}=\mbox{Prox}_{\sigma f^{*}}(y_{n-1}+\sigma Kx_{n}), \end{array} \right. \end{aligned} \end{equation*} 其中 \tau>0, \sigma>0 是步长参数. 该方法在图像处理中也被称作原始对偶混合梯度算法. 若 f^{*} 的定义域有界且步长很小时, 可基于原始对偶间隙函数得到 O(\frac{1}{N}) 的遍历收敛率, 具体的收敛性分析[2,5]. 但是 Arrow-Hurwicz 方法在一般情况下不收敛[6]. 为了克服这一缺点, 许多学者提出了不同的改进方法[7-10]. 受文献 [9] 的启发, 周丹青等[11] 提出了一类黄金比率原始对偶算法求解如下鞍点问题 $$\min_{x\in \mathbb{R}^{q}} \max_{y\in \mathbb{R}^{p}}\{ h(x) + g(x) + \langle Kx,y \rangle - f^{*}(y)\},$$ 这里 h 是可微函数. 具体迭代为 \begin{aligned} \left\{ \begin{array}{ll} z_{n} = \frac{\psi-1}{\psi}x_{n-1} + \frac{1}{\psi}z_{n-1},\\ x_{n} = \mbox{Prox}_{\tau g}(z_{n} -\tau K^{\top}y_{n-1}- \tau \nabla h(x_{n-1})),\\ y_{n} = \mbox{Prox}_{\sigma f^{*}}(y_{n-1} + \sigma Kx_{n}), \end{array} \right. \end{aligned} 其中 \tau, \sigma>0 . \tau(\sigma\|K\|^{2}+2L_{h})\leq\psi , \psi\in(1,\phi] ( \phi 是黄金比率), 周丹青等证明了算法 (1.3) 的收敛性以及 O(\frac{1}{N}) 遍历收敛率. 注意到, 在问题 (1.1) 和 (1.2) 中涉及到一个双线性函数, 但是在实际应用中许多函数往往不满足双线性性. 因此, 如何研究带非线性二元函数的鞍点问题显得尤为重要. 最近, Zhu 等[12] 考虑了如下鞍点问题 $$\min_{x\in \mathbb{R}^{q}} \max_{y\in \mathbb{R}^{p}}\{\mathcal{L}(x,y)=h(x) + g(x) + \Phi(x,y) - f^{*}(y)\},$$ 这里 h 是可微函数. Zhu 等[12] 提出了如下原始对偶算法 \begin{aligned} \left\{ \begin{array}{ll} u_{n} = \mbox{Prox}_{\rho_{n-1}(f^{*}(\cdot)-\Phi(x_{n-1},\cdot))}(y_{n-1}),\\[2mm] s_{n} = \nabla_{y}\Phi(x_{n-1},u_{n})-\frac{1}{\rho_{n-1}}(u_{n}-y_{n-1}),\\[3mm] x_{n} = \mbox{Prox}_{\frac{g}{L_{n-1}}}(x_{n-1} -\frac{1}{L_{n-1}}[\nabla h(x_{n-1})+ \nabla_{x}\Phi(x_{n-1},u_{n})]),\\[3mm] y_{n} = y_{n-1} + \frac{\eta_{n-1}}{\rho_{n-1}}(\mbox{Prox}_{-\rho_{n-1} \Phi(x_{n},\cdot)}(y_{n-1} - \rho_{n-1} s_{n}) - y_{n-1}), \end{array} \right. \end{aligned} 其中 \rho_{n-1} > 0, \eta_{n-1} = \frac{\rho_{n-1}}{2} , L_{n-1} = L_{11} + L_{h} + L_{21}^{2}(2+\rho_{n-1}L_{22})\rho_{n-1} ( L_{11} L_{21} $$L_{22} 是各个 Lipschitz 常数, 见假设 2.2). 在函数 \nabla_{x}\Phi(x,y)$$ \nabla_{y}\Phi(x,y) 关于 $x $$y 是 Lipschitz 连续的假设下证明了算法 (1.5) 的收敛性以及 O(\frac{1}{N}) 遍历收敛率. 值得注意的是, 算法 (1.5) 对于邻近算子参数 L_{n} 的取值范围严重依赖于 Lipschitz 常数, 且单次循环迭代步骤较多, 这将严重影响算法的收敛速度. 此外, 函数 h 可微这一假设比较强. 事实上, 许多函数并不满足可微性. 因此, 如何在不可微情况下提出加速的算法求解问题 (1.4) 是本文讨论的重点. 本文受到文献 [11,12] 的启发, 我们提出了求解问题 (1.4) 的黄金比率原始对偶算法, 证明了由算法产生的序列收敛到问题 (1.4) 的解, 并用数值实验验证了算法的优势. 本文所得的结果改进和推广了文献 [11,12] 中相应结果. ## 2 预备知识 全文用 \mathbb{N} 表示正整数集, 用 \phi 表示黄金比率, 即 \phi=\frac{1+\sqrt{5}}{2} . w:\mathbb{R}^{q}\rightarrow \mathbb{R}\bigcup \{+\infty\} 为扩展实值下半连续凸函数, 其有效域记为 dom(w):=\{x\in \mathbb{R}^{q}:w(x)<+\infty\} . x\in \mathbb{R}^{q} , w 在点 x 的次微分记为 \partial w(x) , 即 \partial w(x):=\{\xi \in \mathbb{R}^{q}:w(y)\geq w(x)+\langle\xi,y-x\rangle, \forall y\in \mathbb{R}^{q}\} , w 在点 x 的邻近点映射定义为 \begin{equation*} \mbox{Prox}_{\lambda w}(x):=\arg\min_{y\in \mathbb{R}^{q}}\{w(y)+\frac{1}{2\lambda}\|y-x\|^{2}\}, x\in \mathbb{R}^{q}, \lambda >0. \end{equation*} 问题 (1.4) 的原始问题与对偶问题分别为 \begin{equation*} P^{*}:=\min_{x\in \mathbb{R}^{q}}\{h(x)+g(x)+p(x)\}, p(x):=\max_{y\in \mathbb{R}^{p}}\{\Phi(x,y) - f^{*}(y)\}, \end{equation*} \begin{equation*} D^{*}:=\max_{y\in \mathbb{R}^{p}}\{d(y)-f^{*}(y)\}, d(y):=\min_{x\in \mathbb{R}^{q}}\{h(x) + g(x) + \Phi(x,y)\}. \end{equation*} 为了证明本文的主要结果, 我们需要如下假设. 假设 2.1 对于问题 (1.4), 设存在 (\bar{x},\bar{y})\in ({\rm dom}(h)\cap {\rm dom}(g))\times {\rm dom}(f) 使得 \begin{equation*} \mathcal{L}(\bar{x},y)\leq \mathcal{L}(\bar{x},\bar{y})\leq \mathcal{L}(x,\bar{y}), \end{equation*} 并且 ({\rm dom}(h)\cap {\rm dom}(g))\times {\rm dom}(f)\subseteq {\rm dom}(\Phi) 以及 \mathcal{L}(\bar{x},\bar{y}) 是有界的. 注 2.1 假设 2.1 是凸凹极小极大问题中的标准假设. 它保证了强对偶性 (P^{*}=D^{*}=\mathcal{L}(\bar{x},\bar{y})) 以及问题 P^{*}$$ D^{*}$ 解的存在性[14].

\begin{equation*} \begin{aligned} \left\{ \begin{array}{ll} \|\nabla_{x}\Phi(x_{2},y_{2})-\nabla_{x}\Phi(x_{1},y_{1})\|\leq L_{11}\|x_{2}-x_{1}\|+L_{12}\|y_{2}-y_{1}\|,\\ \|\nabla_{y}\Phi(x_{1},y_{2})-\nabla_{y}\Phi(x_{1},y_{1})\|\leq L_{21}\|y_{2}-y_{1}\|. \end{array} \right. \end{aligned} \end{equation*}

$\begin{equation*} \|u_{n} - u_{n-1}\| \leq L_{h}\|x_{n} - x_{n-1}\|. \end{equation*}$

$\begin{equation*} S:=\{(\bar{x},\bar{y})\in \mathbb{R}^{q} \times \mathbb{R}^{p}:0\in \partial g(\bar{x})+\nabla_{x}\Phi(\bar{x},\bar{y})+\partial h(\bar{x}), 0\in \nabla_{y}\Phi(\bar{x},\bar{y})-\partial f^{*}(\bar{y})\}. \end{equation*}$

$\begin{equation*} p=\mbox{Prox}_{\lambda w}(x)\Leftrightarrow \langle p-x,y-p\rangle\geq \lambda(w(p)-w(y)), \forall y\in \mathbb{R}^{q}. \end{equation*}$

$\begin{equation*} \langle x-y,x-z\rangle=\frac{1}{2}\|x-y\|^{2}+\frac{1}{2}\|x-z\|^{2}-\frac{1}{2}\|y-z\|^{2}. \end{equation*}$
$\begin{equation*} \|\alpha x+(1-\alpha) y\|^{2}=\alpha\|x\|^{2}+(1-\alpha)\|y\|^{2}-\alpha(1-\alpha)\|x-y\|^{2}. \end{equation*}$

$(\bar{x},\bar{y})$ 是问题 (1.4) 的解, 则

$\begin{equation*} 0\in \partial g(\bar{x})+\nabla_{x}\Phi(\bar{x},\bar{y})+\partial h(\bar{x}), 0\in \nabla_{y}\Phi(\bar{x},\bar{y})-\partial f^{*}(\bar{y}). \end{equation*}$

$\bar{u} \in \partial h(\bar{x})$, 则上式等价于

\begin{equation*} \begin{aligned} \left\{ \begin{array}{ll} P(x):=g(x)-g(\bar{x})+\langle \nabla_{x}\Phi(\bar{x},\bar{y})+\bar{u},x-\bar{x}\rangle \geq 0, \forall x\in \mathbb{R}^{q},\\ D(y):=f^{*}(y)-f^{*}(\bar{y})-\langle \nabla_{y}\Phi(\bar{x},\bar{y}),y-\bar{y}\rangle \geq 0, \forall y\in \mathbb{R}^{p}. \end{array} \right. \end{aligned} \end{equation*}

$\begin{equation*} G(x,y):=P(x)+D(y), \forall (x,y)\in \mathbb{R}^{q}\times \mathbb{R}^{p}. \end{equation*}$

\begin{aligned} \tau G\left(x_{n}, y_{n}\right) \leq & \left\langle x_{n+1}-z_{n+1}, \bar{x}-x_{n+1}\right\rangle+\frac{\tau}{\sigma}\left\langle y_{n}-y_{n-1}, \bar{y}-y_{n}\right\rangle \\ & +\psi\left\langle x_{n}-z_{n+1}, x_{n+1}-x_{n}\right\rangle+\tau\left\langle u_{n}-u_{n-1}, x_{n}-x_{n+1}\right\rangle \\ & +\tau\left\langle\nabla_{x} \Phi\left(x_{n}, y_{n}\right)-\nabla_{x} \Phi\left(x_{n-1}, y_{n-1}\right), x_{n}-x_{n+1}\right\rangle \\ & +\tau\left\langle\nabla_{y} \Phi\left(x_{n}, y_{n}\right)-\nabla_{y} \Phi\left(x_{n}, y_{n-1}\right), y_{n-1}-y_{n}\right\rangle \end{aligned}

\begin{aligned} \left\{ \begin{array}{ll} \langle x_{n+1}-z_{n+1}+\tau\nabla_{x}\Phi(x_{n},y_{n})+\tau u_{n},\bar{x}-x_{n+1}\rangle\geq\tau(g(x_{n+1})-g(\bar{x})),\\ \langle \frac{\tau}{\sigma}(y_{n}-y_{n-1})-\tau\nabla_{y}\Phi(x_{n},y_{n-1}),\bar{y}-y_{n}\rangle\geq\tau(f^{*}(y_{n})-f^{*}(\bar{y})),\\ \langle x_{n}-z_{n}+\tau\nabla_{x}\Phi(x_{n-1},y_{n-1})+\tau u_{n-1},x_{n+1}-x_{n}\rangle\geq\tau(g(x_{n})-g(x_{n+1})). \end{array} \right. \end{aligned}

$G(x,y)$ 的定义知

\begin{align*} \tau G(x_{n},y_{n})=\ &\tau(P(x_{n})+D(y_{n}))\nonumber\\ =\ &\tau(g(x_{n})-g(\bar{x})+\langle\nabla_{x}\Phi(\bar{x},\bar{y})+ \bar{u},x_{n}-\bar{x}\rangle)\nonumber\\ &+\tau(f^{*}(y_{n})-f^{*}(\bar{y})-\langle\nabla_{y}\Phi(\bar{x},\bar{y}),y_{n}-\bar{y}\rangle). \end{align*}

\begin{align*} \tau G(x_{n},y_{n})\leq\ &\langle x_{n+1}-z_{n+1},\bar{x}-x_{n+1}\rangle+\langle x_{n}-z_{n},x_{n+1}-x_{n}\rangle+ \frac{\tau}{\sigma}\langle y_{n}-y_{n-1},\bar{y}-y_{n}\rangle\nonumber\\ &+\tau\langle\nabla_{x}\Phi(x_{n},y_{n}),\bar{x}-x_{n+1}\rangle+\tau\langle u_{n},\bar{x}-x_{n+1}\rangle\nonumber\\ &+\tau\langle\nabla_{x}\Phi(x_{n-1},y_{n-1}),x_{n+1}-x_{n}\rangle+\tau\langle u_{n-1},x_{n+1}-x_{n}\rangle\nonumber\\ &+\tau\langle\nabla_{y}\Phi(x_{n},y_{n-1}),y_{n}-\bar{y}\rangle+\tau\langle\nabla_{x}\Phi(\bar{x},\bar{y}),x_{n}-\bar{x}\rangle\nonumber\\ &+\tau\langle \bar{u},x_{n}-\bar{x}\rangle-\tau\langle\nabla_{y}\Phi(\bar{x},\bar{y}),y_{n}-\bar{y}\rangle. \end{align*}

\begin{align*} \tau G(x_{n},y_{n})\leq\ &\langle x_{n+1}-z_{n+1},\bar{x}-x_{n+1}\rangle+\psi\langle x_{n}-z_{n+1},x_{n+1}-x_{n}\rangle+ \frac{\tau}{\sigma}\langle y_{n}-y_{n-1},\bar{y}-y_{n}\rangle\nonumber\\ &+\tau\langle\nabla_{x}\Phi(x_{n},y_{n}),\bar{x}-x_{n}\rangle+\tau\langle\nabla_{x}\Phi(x_{n},y_{n})-\nabla_{x}\Phi(x_{n-1},y_{n-1}),x_{n}-x_{n+1}\rangle\nonumber\\ &+\tau\langle u_{n}-\bar{u},\bar{x}-x_{n}\rangle+\tau\langle u_{n}-u_{n-1},x_{n}-x_{n+1}\rangle\nonumber\\ &+\tau\langle\nabla_{y}\Phi(x_{n},y_{n-1}),y_{n}-y_{n-1}\rangle-\tau\langle\nabla_{y}\Phi(x_{n},y_{n-1}),\bar{y}-y_{n-1}\rangle\nonumber\\ &+\tau\langle\nabla_{x}\Phi(\bar{x},\bar{y}),x_{n}-\bar{x}\rangle-\tau\langle\nabla_{y}\Phi(\bar{x},\bar{y}),y_{n}-\bar{y}\rangle\nonumber\\ \leq\ &\langle x_{n+1}-z_{n+1},\bar{x}-x_{n+1}\rangle+\psi\langle x_{n}-z_{n+1},x_{n+1}-x_{n}\rangle+ \frac{\tau}{\sigma}\langle y_{n}-y_{n-1},\bar{y}-y_{n}\rangle\nonumber\\ &+\tau(\Phi(\bar{x},y_{n})-\Phi(x_{n},y_{n}))+\tau\langle\nabla_{x}\Phi(x_{n},y_{n})-\nabla_{x}\Phi(x_{n-1},y_{n-1}),x_{n}-x_{n+1}\rangle\nonumber\\ &+\tau\langle u_{n}-\bar{u},\bar{x}-x_{n}\rangle+\tau\langle u_{n}-u_{n-1},x_{n}-x_{n+1}\rangle\nonumber\\ &+\tau\langle\nabla_{y}\Phi(x_{n},y_{n-1}),y_{n}-y_{n-1}\rangle-\tau(\Phi(x_{n},\bar{y})-\Phi(x_{n},y_{n-1}))\nonumber\\ &+\tau(\Phi(x_{n},\bar{y})-\Phi(\bar{x},\bar{y}))-\tau(\Phi(\bar{x},y_{n})-\Phi(\bar{x},\bar{y}))\nonumber\\ \leq\ &\langle x_{n+1}-z_{n+1},\bar{x}-x_{n+1}\rangle+\psi\langle x_{n}-z_{n+1},x_{n+1}-x_{n}\rangle+ \frac{\tau}{\sigma}\langle y_{n}-y_{n-1},\bar{y}-y_{n}\rangle\nonumber\\ &+\tau\langle\nabla_{x}\Phi(x_{n},y_{n})-\nabla_{x}\Phi(x_{n-1},y_{n-1}),x_{n}-x_{n+1}\rangle\nonumber\\ &-\tau\langle u_{n}-\bar{u},x_{n} - \bar{x}\rangle+\tau\langle u_{n}-u_{n-1},x_{n}-x_{n+1}\rangle\nonumber\\ &+\tau\langle\nabla_{y}\Phi(x_{n},y_{n})-\nabla_{y}\Phi(x_{n},y_{n-1}),y_{n-1}-y_{n}\rangle. \end{align*}

\begin{aligned} & \left\|x_{n+1}-\bar{x}\right\|^{2}+\frac{\tau}{\sigma}\left\|y_{n}-\bar{y}\right\|^{2}+2 \tau G\left(x_{n}, y_{n}\right) \\ \leq & \left\|z_{n+1}-\bar{x}\right\|^{2}+\frac{\tau}{\sigma}\left\|y_{n-1}-\bar{y}\right\|^{2}-\frac{\tau}{\sigma}\left\|y_{n}-y_{n-1}\right\|^{2} \\ & -\psi\left\|x_{n}-x_{n+1}\right\|^{2}-\psi\left\|z_{n+1}-x_{n}\right\|^{2}+(\psi-1)\left\|x_{n+1}-z_{n+1}\right\|^{2} \\ & +2 \tau\left(L_{11}\left\|x_{n}-x_{n-1}\right\|+L_{12}\left\|y_{n}-y_{n-1}\right\|\right)\left\|x_{n}-x_{n+1}\right\| \\ & +2 \tau L_{h}\left\|x_{n}-x_{n-1}\right\|\left\|x_{n+1}-x_{n}\right\|+2 \tau L_{21}\left\|y_{n}-y_{n-1}\right\|^{2}. \end{aligned}

\begin{align*} &\frac{\psi}{\psi-1}\|z_{n+2}-\bar{x}\|^{2}+\frac{\tau}{\sigma}\|y_{n}-\bar{y}\|^{2}+2\tau G(x_{n},y_{n})+(\tau L_{h}+\tau L_{11})\|x_{n+1}-x_{n}\|^{2}\nonumber\\ \leq\ &\frac{\psi}{\psi-1}\|z_{n+1}-\bar{x}\|^{2}+\frac{\tau}{\sigma}\|y_{n-1}-\bar{y}\|^{2}-\psi\|z_{n+1}-x_{n}\|^{2}\nonumber\\ &+(\tau L_{h}+\tau L_{11})\|x_{n}-x_{n-1}\|^{2}-(\frac{\tau}{\sigma}-\tau L_{12}-2\tau L_{21})\|y_{n}-y_{n-1}\|^{2}. \end{align*}

\begin{aligned} \left\{ \begin{array}{ll} a_{n}=\frac{\psi}{\psi-1}\|z_{n+1}-\bar{x}\|^{2}+\frac{\tau}{\sigma}\|y_{n-1}-\bar{y}\|^{2}+(\tau L_{h}+ \tau L_{11})\|x_{n}-x_{n-1}\|^{2},\\[3mm] b_{n}=\psi\|z_{n+1}-x_{n}\|^{2}+(\frac{\tau}{\sigma}-\tau L_{12}-2\tau L_{21})\|y_{n}-y_{n-1}\|^{2}. \end{array} \right. \end{aligned}

$$$a_{n+1}\leq a_{n+1}+2\tau G(x_{n},y_{n})\leq a_{n}-b_{n}.$$$

$\begin{equation*} \lim\limits_{n\rightarrow\infty}\|y_{n}-y_{n-1}\|=\lim\limits_{n\rightarrow\infty}\|z_{n+1}-x_{n}\|=\lim\limits_{n\rightarrow\infty}\|z_{n}-x_{n}\|=0. \end{equation*}$

## 4 数值实验

\begin{align*} \min_{x\in \mathbb{R}^{q}}\{h(x):= &\frac{1}{2}x^{T}A_{0}x+b_{0}^{T}x s.t. \|x\| \leq D,\nonumber\\ &\frac{1}{2}x^{T}A_{i}x+b_{i}^{T}x+c_{i} \leq 0, i=1,\cdots,p\}, \end{align*}

### 图4

(i) 算法 1 与文献 [12,算法 1] 的变体 1 都收敛到问题的解.

(ii) 从 CPU 运行时间来看 (针对例 4.1 与例 4.2), 在适当参数的选取下, 算法 1 优于文献 [12,算法 1], 见表1-4.

(iii) 从迭代次数来看, 在适当参数的选取下, 例 4.2 中算法 1 随着维度和精度的增加仍然表现出很强的稳定性, 较文献 [12,算法 1] 的变体 1 数值效果更佳, 见表3表4.

(iv) 例 4.3 中, 在 $err=10^{-10}$, $p=q=50$, $p=q=100$, $p=q=150$ 的情况下, 我们选取了多组步长, 得到 $\tau=0.8$, $\sigma=0.7$ 时的数值效果最佳, 见表5图1, 图2; 以及在 $p=q=100$, $err=10^{-10}$, $err=10^{-20}$, $err=10^{-30}$ 的情况下, 我们选取了多组步长, 同样得到 $\tau=0.8$, $\sigma=0.7$ 时的数值效果最佳, 见表6图3, 图4.

(v) 算法 1 在 $h$ 光滑与非光滑的条件下同样适用.

## 参考文献 原文顺序 文献年度倒序 文中引用次数倒序 被引期刊影响因子

Bauschke H H, Combettes P. Convex Analysis and Monotone Operators Theory in Hilbert Spaces. New York: Springer-Verlag, 2020

Chambolle A, Pock T.

A first-order primal-dual algorithm for convex problems with applications to imaging

J Math Imaging Vis, 2011, 40: 120-145

Combettes P, Pesquet J C. Proximal splitting methods in signal processing// Bauschke H, Burachik R, Combetles P, et al. Fixed-Point Algorithms for Inverse Problems in Science and Engineering. New York: Springer-Verlag, 2011: 185-212

Uzawa H. Iterative methods for concave programming// Arrow K J, Hurwicz L, Uzawa H, eds. Studies in Linear and Nonlinear Programming. Stanford, CA: Stanford University Press, 1958, 6: 154-165

Esser E, Zhang X, Chan T F.

A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science

SIAM J Imaging Sci, 2010, 3: 1015-1046

He B, You Y, Yuan X.

On the convergence of primal-dual hybrid gradient algorithm

SIAM J Imaging Sci, 2014, 7(4): 2526-2537

Chambolle A, Pock T.

On the ergodic convergence rates of a first-order primal-dual algorithm

Math Program, 2016, 159(1): 253-287

Condat L.

A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms

J Optim Theory Appl, 2013, 158(2): 460-479

Malitsky Y.

Golden ratio algorithms for variational inequalities

Math Program, 2020, 184(1): 383-410

Xu Y.

Iteration complexity of inexact augmented Lagrangian methods for constrained convex programming

Math Program, 2021, 185: 89-117

Zhou D Q, Chang X K, Yang J F.

A new Golden ratio primal-dual algorithm

Numerical Math A J Chinese Univ, 2022, 44: 97-106

Zhu Y, Liu D, Tran-Ding Q.

New primal-dual algorithms for a class of nonsmooth and nonlinear convex-concave minimax problems

SIAM J Optim, 2022, 32(4): 2580-2611

Hamedani E Y, Aybat N S.

A primal-dual algorithm with line search for general convex-concave saddle point problems

SIAM J Optim, 2021, 31: 1299-1329

Rockafellar R, Wets R. Variational Analysis. Berlin: Springer, 2004

/

 〈 〉