In this section, we propose the extragradient thresholding algorithm (ETA) for solving the \(\ell_{1}\) regularized projection minimization problem (5) and analyze its convergence.
We first recall some basic notions of monotonicity and some properties of the projection operator. Let \(P_{K}(\cdot)\) denote the projection operator from \(\mathbb{R}^{n}\) onto K, a nonempty closed convex subset of \(\mathbb{R}^{n}\). From the definition of the projection operator, it follows that
$$\begin{aligned} \bigl\langle y-P_{K}(x), P_{K}(x)-x \bigr\rangle \geq0, \quad \forall y\in K, x\in\mathbb{R}^{n}. \end{aligned}$$
(9)
Consequently, we have
$$\begin{aligned} & \bigl\langle P_{K}(x)-P_{K}(y), x-y \bigr\rangle \geq\bigl\| P_{K}(x)-P_{K}(y)\bigr\| ^{2}, \quad \forall x, y\in\mathbb{R}^{n}, \end{aligned}$$
(10)
$$\begin{aligned} &\bigl\| P_{K}(x)-P_{K}(y)\bigr\| \leq \|x-y\|,\quad \forall x, y\in\mathbb{R}^{n}, \end{aligned}$$
(11)
$$\begin{aligned} &\bigl\| P_{K}(x)-y\bigr\| ^{2}\leq\|x-y\|^{2}- \bigl\| P_{K}(x)-x\bigr\| ^{2},\quad \forall y\in K, x\in \mathbb{R}^{n}. \end{aligned}$$
(12)
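As a sanity check, properties (9)-(12) can be tested numerically. The sketch below assumes the special case \(K=\mathbb{R}^{n}_{+}\), for which the projection is the componentwise positive part; the helper names and the random sampling scheme are illustrative choices, not part of the original analysis.

```python
import random

def project_nonneg(x):
    """Euclidean projection onto K = R^n_+ (componentwise max with 0)."""
    return [max(v, 0.0) for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def sub(a, b):
    return [p - q for p, q in zip(a, b)]

def check_projection_properties(n=5, trials=200, seed=0, tol=1e-9):
    """Return True if (9)-(12) hold on random samples with K = R^n_+."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(n)]  # arbitrary point
        u = [rng.uniform(-3, 3) for _ in range(n)]  # second arbitrary point
        y = [rng.uniform(0, 3) for _ in range(n)]   # point of K
        px, pu = project_nonneg(x), project_nonneg(u)
        d = sub(px, pu)
        ok9 = dot(sub(y, px), sub(px, x)) >= -tol
        ok10 = dot(d, sub(x, u)) >= dot(d, d) - tol
        ok11 = dot(d, d) <= dot(sub(x, u), sub(x, u)) + tol
        ok12 = (dot(sub(px, y), sub(px, y))
                <= dot(sub(x, y), sub(x, y)) - dot(sub(px, x), sub(px, x)) + tol)
        if not (ok9 and ok10 and ok11 and ok12):
            return False
    return True
```

A random test of this kind cannot prove the inequalities, but it is a cheap way to catch sign errors when implementing the projection.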
Lemma 3.1 [18]
Define the residue function
$$\begin{aligned} e(x,\alpha)=x-P_{K}\bigl[x-\alpha F(x)\bigr], \quad \alpha\geq0. \end{aligned}$$
The following statements are valid.
- (a) \(\forall\alpha>0\), \(F(x)^{\top}e(x,\alpha)\geq\frac {\|e(x,\alpha)\|^{2}}{\alpha}\);
- (b) for any \(\alpha>0\), \(\frac{\|e(x,\alpha)\|}{\alpha}\) is non-increasing in α;
- (c) for any \(\alpha\geq0\), \(\|e(x,\alpha)\|\) is non-decreasing in α.
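The three statements of Lemma 3.1 can be illustrated numerically. The following sketch assumes \(K=\mathbb{R}^{n}_{+}\) and a point \(x\in K\); the affine map `F_example` is a hypothetical test function chosen only for illustration.

```python
def residual(F, x, alpha):
    """e(x, alpha) = x - [x - alpha*F(x)]_+, the residue for K = R^n_+."""
    Fx = F(x)
    return [xi - max(xi - alpha * fi, 0.0) for xi, fi in zip(x, Fx)]

def norm(v):
    return sum(t * t for t in v) ** 0.5

def F_example(x):
    """Hypothetical test map F(x) = Mx + q, M = [[2, 1], [1, 2]], q = (-1, 1)."""
    return [2.0 * x[0] + x[1] - 1.0, x[0] + 2.0 * x[1] + 1.0]

def lemma31_holds(F, x, alphas, tol=1e-9):
    """Check (a)-(c) at a point x in K over a list of positive step sizes."""
    alphas = sorted(alphas)
    Fx = F(x)
    res = [residual(F, x, a) for a in alphas]
    norms = [norm(e) for e in res]
    ratios = [nrm / a for nrm, a in zip(norms, alphas)]
    # (a): F(x)^T e(x, alpha) >= ||e(x, alpha)||^2 / alpha
    part_a = all(
        sum(f * t for f, t in zip(Fx, e)) >= nrm ** 2 / a - tol
        for e, nrm, a in zip(res, norms, alphas)
    )
    # (b): ||e(x, alpha)|| / alpha is non-increasing in alpha
    part_b = all(r1 >= r2 - tol for r1, r2 in zip(ratios, ratios[1:]))
    # (c): ||e(x, alpha)|| is non-decreasing in alpha
    part_c = all(n1 <= n2 + tol for n1, n2 in zip(norms, norms[1:]))
    return part_a and part_b and part_c
```

For example, `lemma31_holds(F_example, [0.5, 0.2], [0.1, 0.5, 1.0, 2.0, 5.0])` evaluates the three conditions at one point of the nonnegative orthant.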
In this paper, we suppose the mapping \(F:\mathbb{R}^{n}\rightarrow\mathbb {R}^{n}\) is co-coercive on a subset K of \(\mathbb{R}^{n}\). That is, there exists a constant \(c>0\) such that
$$\bigl\langle F(x)-F(y), x-y \bigr\rangle \geq c\bigl\| F(x)-F(y)\bigr\| ^{2}, \quad \forall x, y \in K. $$
It is clear that the co-coercive mapping is monotone, namely,
$$\begin{aligned} \bigl\langle F(x)-F(y), x-y \bigr\rangle \geq0,\quad \forall x, y \in K, \end{aligned}$$
but not necessarily strongly monotone; strong monotonicity requires a constant \(c>0\) such that
$$\begin{aligned} \bigl\langle F(x)-F(y), x-y \bigr\rangle \geq c\|x-y\| ^{2}, \quad \forall x, y \in K. \end{aligned}$$
Remark 3.1
Every affine monotone function that is also symmetric is co-coercive (on \(\mathbb{R}^{n}\)). The Euclidean projector \(P_{K}\) and \(I-P_{K}\) are both co-coercive functions [2, 19].
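To make the distinction between co-coercivity and strong monotonicity concrete, consider a sketch with the hypothetical affine map \(F(x)=Mx+q\), where \(M\) is the symmetric positive semidefinite matrix with all entries equal to 1 (eigenvalues 0 and 2). Under this assumption F is co-coercive with modulus \(c=1/2\), yet it is not strongly monotone, because \(\langle F(x)-F(y), x-y\rangle=0\) along the direction \((1,-1)\).

```python
import random

Q_VEC = [-1.0, 0.5]   # arbitrary offset q, chosen for illustration
C_MODULUS = 0.5       # 1 / (largest eigenvalue of M), with M = [[1, 1], [1, 1]]

def F(x):
    """Hypothetical affine map F(x) = Mx + q with M = [[1, 1], [1, 1]]."""
    s = x[0] + x[1]
    return [s + Q_VEC[0], s + Q_VEC[1]]

def inner_and_sq(x, y):
    """Return (<F(x)-F(y), x-y>, ||F(x)-F(y)||^2)."""
    dF = [a - b for a, b in zip(F(x), F(y))]
    d = [a - b for a, b in zip(x, y)]
    return sum(a * b for a, b in zip(dF, d)), sum(a * a for a in dF)

def is_cocoercive_on_samples(c, trials=500, seed=1, tol=1e-9):
    """Test <F(x)-F(y), x-y> >= c ||F(x)-F(y)||^2 on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(2)]
        y = [rng.uniform(-5, 5) for _ in range(2)]
        ip, sq = inner_and_sq(x, y)
        if ip < c * sq - tol:
            return False
    return True
```

Here `is_cocoercive_on_samples(C_MODULUS)` passes, while a larger candidate modulus such as `1.0` is rejected on the same samples.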
Lemma 3.2
Suppose that \(F(\cdot)\) is co-coercive on K with modulus \(c>0\). Then, for any positive real number α with \(c>\alpha/2\), the operator \(I-\alpha F\) is nonexpansive; that is, for any \(x, y \in K\),
$$\bigl\| (I-\alpha F) (x)-(I-\alpha F) (y)\bigr\| \leq\|x-y\|. $$
Proof
For any \(x, y \in K\), when \(c>\alpha/2\), using the co-coercivity of F, it follows that
$$\begin{aligned} &\bigl\| (I-\alpha F) (x)-(I-\alpha F) (y)\bigr\| ^{2} \\ &\quad= \bigl\| (x-y)-\alpha\bigl[F(x)-F(y)\bigr]\bigr\| ^{2} \\ &\quad=\|x-y\|^{2}-2\alpha \bigl\langle x-y, F(x)-F(y) \bigr\rangle + \alpha ^{2}\bigl\| F(x)-F(y)\bigr\| ^{2} \\ &\quad\leq \|x-y\|^{2}-\alpha(2c-\alpha) \bigl\| F(x)-F(y)\bigr\| ^{2} \\ &\quad\leq\|x-y\|^{2}, \end{aligned}$$
which shows \(I-\alpha F\) is nonexpansive. □
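The role of the condition \(c>\alpha/2\) can be seen in a small experiment. The sketch below assumes the hypothetical map \(F(x)=Mx+q\) with \(M\) the all-ones \(2\times2\) matrix, which is co-coercive with modulus \(c=1/2\); for this example \(I-\alpha F\) is nonexpansive when \(\alpha<1\) and fails to be so for larger steps such as \(\alpha=1.5\).

```python
import random

def F(x):
    """Hypothetical co-coercive map F(x) = Mx + q, M = [[1, 1], [1, 1]], c = 1/2."""
    s = x[0] + x[1]
    return [s - 1.0, s + 0.5]

def step(x, alpha):
    """Apply the operator (I - alpha*F) to x."""
    return [xi - alpha * fi for xi, fi in zip(x, F(x))]

def dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def is_nonexpansive_on_samples(alpha, trials=500, seed=2, tol=1e-9):
    """Test ||(I-aF)(x) - (I-aF)(y)|| <= ||x - y|| on random pairs in R^2_+."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(0, 5) for _ in range(2)]
        y = [rng.uniform(0, 5) for _ in range(2)]
        if dist(step(x, alpha), step(y, alpha)) > dist(x, y) + tol:
            return False
    return True
```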
For given \(z^{k}\in\mathbb{R}^{n}_{+}\) and \(\lambda_{k}>0\), we consider the unconstrained minimization subproblem
$$\begin{aligned} \min_{x\in\mathbb{R}^{n}} f_{\lambda_{k}} \bigl(x,z^{k}\bigr):=\bigl\| x-z^{k}\bigr\| ^{2} + \lambda_{k}\|x\|_{1}. \end{aligned}$$
(13)
Evidently, the minimizer \(x^{s}\) of the model (13) must satisfy the corresponding optimality condition
$$\begin{aligned} x^{s}=S_{\lambda_{k}}\bigl(z^{k}\bigr), \end{aligned}$$
(14)
where the shrinkage operator \(S_{\lambda}\) is defined by
$$\begin{aligned} \bigl(S_{\lambda}(z)\bigr)_{i}= \left \{ \begin{array}{@{}l@{\quad}l} z_{i}-\frac{\lambda}{2}, & z_{i}\geq\frac{\lambda}{2},\\ 0, & 0\leq z_{i}< \frac{\lambda}{2}. \end{array} \right . \end{aligned}$$
(15)
The shrinkage operator \(S_{\lambda}\) acts component-wise, i.e., \((S_{\lambda}(z))_{i}=S_{\lambda}(z_{i})\). Moreover, it is nonexpansive, i.e., \(\|S_{\lambda}(x)-S_{\lambda}(y)\|\leq\|x-y\|\) for any \(x, y\in \mathbb{R}_{+}^{n}\); see [20]. Thus the solution \(x\in\mathbb{R}^{n}\) of the subproblem (13) is given analytically by (14).
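A minimal sketch of the shrinkage step may help fix ideas. It implements (15) for inputs in \(\mathbb{R}^{n}_{+}\) and the objective of (13); the function names are illustrative and not taken from the original.

```python
def shrink(z, lam):
    """S_lambda of (15) for z in R^n_+: subtract lam/2 and clip at zero."""
    return [max(zi - lam / 2.0, 0.0) for zi in z]

def objective(x, z, lam):
    """f_lambda(x, z) = ||x - z||^2 + lam * ||x||_1 from (13)."""
    fit = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return fit + lam * sum(abs(xi) for xi in x)
```

For instance, with `z = [1.0, 0.2, 0.0]` and `lam = 1.0`, the threshold is \(\lambda/2=0.5\), so `shrink` returns `[0.5, 0.0, 0.0]`, and perturbing any coordinate of this point does not decrease `objective`.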
Based on this solution representation, we construct the following extragradient thresholding algorithm (ETA) to solve the \(\ell_{1}\) regularized projection minimization problem (5).
- Input: c, the co-coercivity modulus of F.
- Step 0: Choose \(0\ne z^{0}\in\mathbb{R}_{+}^{n}\), \(\lambda_{0},\beta>0\), \(\tau,\gamma,\mu\in(0,1)\), \(\beta\gamma<2c\), \(\epsilon>0\) and integers \(n_{\max}>K_{0}>0\). Set \(k=0\).
- Step 1: Compute
$$\begin{aligned}[b] &x^{k}=S_{\lambda_{k}} \bigl(z^{k} \bigr), \\ &y^{k}= \bigl[x^{k}-\alpha_{k}F \bigl(x^{k} \bigr) \bigr]_{+}, \end{aligned} $$
where \(\alpha_{k}=\beta\gamma^{m_{k}}\) with \(m_{k}\) being the smallest nonnegative integer satisfying
$$ \bigl\| F \bigl(x^{k} \bigr)-F \bigl(y^{k} \bigr) \bigr\| \leq\mu\frac{\|x^{k}-y^{k}\|}{\alpha_{k}}. $$
(16)
- Step 2: If \(\|x^{k}-z^{k}\|\le\epsilon\) or the number of iterations is greater than \(n_{\max}\), then return \(z^{k}\), \(x^{k}\), \(y^{k}\) and stop. Otherwise, compute
$$\begin{aligned} z^{k+1}= \bigl[x^{k}-\alpha_{k}F \bigl(y^{k} \bigr) \bigr]_{+} \end{aligned}$$
and update \(\lambda_{k+1}\) by
$$\begin{aligned} \lambda_{k+1}=\left \{ \begin{array}{l@{\quad}l} \tau\lambda_{k}, & \mbox{if } k+1 \mbox{ is a multiple of } K_{0}, \\ \lambda_{k}, & \mbox{otherwise}, \end{array} \right . \end{aligned}$$
and set \(k=k+1\); then go to Step 1.
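The steps above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: F is passed as a callable on lists, the default parameter values are arbitrary choices satisfying Step 0 for the test problem used below, and the cap `max_backtracks` on the line search is a safeguard that does not appear in the algorithm as stated.

```python
def eta(F, z0, lam0=0.1, beta=1.0, gamma=0.5, mu=0.5, tau=0.5,
        eps=1e-8, K0=5, n_max=1000, max_backtracks=60):
    """Sketch of ETA for NCP(F) on K = R^n_+; returns (z, x, y)."""
    def plus(v):
        return [max(t, 0.0) for t in v]

    def norm(v):
        return sum(t * t for t in v) ** 0.5

    z, lam, k = list(z0), lam0, 0
    while True:
        # Step 1: shrinkage (15), then a projected step with backtracking on (16).
        x = [max(zi - lam / 2.0, 0.0) for zi in z]
        Fx = F(x)
        alpha, tries = beta, 0
        while True:
            y = plus([xi - alpha * fi for xi, fi in zip(x, Fx)])
            Fy = F(y)
            lhs = norm([a - b for a, b in zip(Fx, Fy)])
            rhs = mu * norm([a - b for a, b in zip(x, y)]) / alpha
            if lhs <= rhs or tries >= max_backtracks:  # cap is an added safeguard
                break
            alpha *= gamma
            tries += 1
        # Step 2: stopping test, then the extragradient update and the lambda rule.
        if norm([a - b for a, b in zip(x, z)]) <= eps or k >= n_max:
            return z, x, y
        z = plus([xi - alpha * fi for xi, fi in zip(x, Fy)])
        if (k + 1) % K0 == 0:
            lam *= tau
        k += 1
```

As a usage example, take the hypothetical map \(F(x)=Mx+q\) with \(M=\bigl(\begin{smallmatrix}2&1\\1&2\end{smallmatrix}\bigr)\) and \(q=(-1,1)^{\top}\), which is co-coercive with modulus \(c=1/3\), so that \(\beta\gamma=0.5<2c\). The point \((0.5,0)\) satisfies \(x\geq0\), \(F(x)=(0,1.5)\geq0\) and \(x^{\top}F(x)=0\), and calling `eta` from \(z^{0}=(1,1)\) returns an \(x\) close to it.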
Before analyzing the convergence of ETA, we first present a key lemma concerning co-coercive mappings.
Lemma 3.3
Suppose that the mapping F is co-coercive and \(\operatorname{SOL}(F)\neq\emptyset \). If \(x^{k}\) generated by ETA is not a solution of \(\operatorname{NCP}(F)\), then for any \(\widehat{x}\in\operatorname{SOL}(F)\), we have
$$\begin{aligned} \bigl\langle F\bigl(y^{k}\bigr), x^{k}- \widehat{x}\bigr\rangle \geq\bigl\langle F\bigl(y^{k}\bigr), x^{k}-y^{k}\bigr\rangle \geq(1-\mu)\frac{\|x^{k}-y^{k}\| ^{2}}{\beta}. \end{aligned}$$
(17)
Proof
For any \(\widehat{x}\in\operatorname{SOL}(F)\), we have \(F(\widehat{x})\geq0\) and \(F(\widehat{x})^{\top }\widehat{x}=0\). Since \(y^{k}\in\mathbb{R}^{n}_{+}\), it follows that \(\langle F(\widehat{x}), y^{k}-\widehat{x}\rangle\geq0\). A co-coercive mapping is monotone and hence pseudo-monotone, that is,
$$\begin{aligned} \bigl\langle x-y, F(y) \bigr\rangle \geq0\quad\Rightarrow\quad \bigl\langle x-y, F(x) \bigr\rangle \geq0, \quad \forall x, y\in K \mbox{ and } x\neq y. \end{aligned}$$
By the definition of pseudo-monotonicity, it follows that \(\langle F(y^{k}), y^{k}-\widehat{x}\rangle\geq0\). Hence,
$$\begin{aligned} \bigl\langle F\bigl(y^{k}\bigr), x^{k}-\widehat{x}\bigr\rangle =&\bigl\langle F\bigl(y^{k}\bigr), x^{k}-y^{k}+y^{k}- \widehat{x}\bigr\rangle \\ \geq&\bigl\langle F\bigl(y^{k}\bigr), x^{k}-y^{k} \bigr\rangle \\ =&\bigl\langle F\bigl(x^{k}\bigr), x^{k}-y^{k} \bigr\rangle -\bigl\langle F\bigl(x^{k}\bigr)-F\bigl(y^{k} \bigr), x^{k}-y^{k}\bigr\rangle \\ \geq& \frac{1}{\alpha_{k}}\bigl\| x^{k}-y^{k}\bigr\| ^{2}- \frac{\mu}{\alpha_{k}}\bigl\| x^{k}-y^{k}\bigr\| ^{2} \\ \geq& \frac{1-\mu}{\beta}\bigl\| x^{k}-y^{k}\bigr\| ^{2}, \end{aligned}$$
where the second-to-last inequality follows from Lemma 3.1(a), the Cauchy-Schwarz inequality, and (16), and the last one from \(\alpha_{k}\leq\beta\). □
We now begin to analyze the convergence of the proposed ETA.
Theorem 3.1
Suppose that the mapping F is co-coercive with modulus \(c>\beta\gamma /2\) and \(\operatorname{SOL}(F)\neq\emptyset\). Let \(\{(z^{k}, x^{k}, y^{k})\}\) and \(\{\lambda_{k}\}\) be the sequences generated by ETA. Then
- (i) the sequences \(\{z^{k}\}\), \(\{x^{k}\}\), and \(\{y^{k}\}\) are all bounded;
- (ii) any cluster point of the sequence \(\{x^{k}\}\) is a solution of \(\operatorname{NCP}(F)\).
Proof
(i) Let \(\widehat{x}\in\operatorname{SOL}(F)\). By the definition (15) of operator \(S_{\lambda}\), we have
$$\begin{aligned} \bigl\| x^{k}-\widehat{x}\bigr\| =\bigl\| S_{\lambda_{k}} \bigl(z^{k}\bigr)-\widehat{x}\bigr\| \leq\bigl\| z^{k}-\widehat {x}\bigr\| + \sqrt{n}\lambda_{k}/{2}\leq\bigl\| z^{k}-\widehat{x}\bigr\| +\sqrt{n} \lambda_{0}/{2}. \end{aligned}$$
(18)
In view of \(\widehat{x}\in\operatorname{SOL}(F)\), we have \(\widehat{x}=[\widehat {x}-\alpha_{k}F(\widehat{x})]_{+}\). Since \(c>\beta\gamma/2>\alpha_{k}/2\), by Lemma 3.2, we see that \(I-\alpha_{k} F\) is nonexpansive. Together with the nonexpansive property of the projection operator, it follows that
$$\begin{aligned} \bigl\| y^{k}-\widehat{x}\bigr\| =&\bigl\| \bigl[x^{k}- \alpha_{k}F\bigl(x^{k}\bigr)\bigr]_{+} -\widehat{x}\bigr\| \\ =&\bigl\| \bigl[x^{k}-\alpha_{k}F\bigl(x^{k}\bigr) \bigr]_{+}-\bigl[\widehat{x}-\alpha_{k}F(\widehat {x})\bigr]_{+}\bigr\| \\ \leq&\bigl\| (I-\alpha_{k} F) \bigl(x^{k}\bigr)-(I-\alpha_{k} F) (\widehat{x})\bigr\| \\ \leq& \bigl\| x^{k}-\widehat{x}\bigr\| \\ \leq&\bigl\| z^{k}-\widehat{x}\bigr\| +\sqrt{n}\lambda_{k}/{2} \\ \leq&\bigl\| z^{k}-\widehat{x}\bigr\| +\sqrt{n}\lambda_{0}/{2}. \end{aligned}$$
(19)
From (12) and (17), we obtain
$$\begin{aligned} \bigl\| z^{k+1}-\widehat{x}\bigr\| ^{2} =&\bigl\| \bigl[x^{k}-\alpha_{k}F\bigl(y^{k}\bigr)\bigr]_{+}- \widehat{x}\bigr\| ^{2} \\ \leq&\bigl\| x^{k}-\alpha_{k}F\bigl(y^{k}\bigr)- \widehat{x}\bigr\| ^{2}-\bigl\| z^{k+1}-x^{k}+ \alpha_{k} F\bigl(y^{k}\bigr)\bigr\| ^{2} \\ =& \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-2\alpha_{k} \bigl\langle F\bigl(y^{k}\bigr), z^{k+1}-\widehat {x} \bigr\rangle - \bigl\| z^{k+1}-x^{k}\bigr\| ^{2} \\ \leq& \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-2\alpha_{k} \bigl\langle F\bigl(y^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle -\bigl\| z^{k+1}-x^{k}\bigr\| ^{2} \\ =& \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl\| z^{k+1}-y^{k} \bigr\| ^{2}-\|y^{k}-x^{k}\|^{2} \\ &{}+2\bigl\langle x^{k}-y^{k}-\alpha_{k} F \bigl(y^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle . \end{aligned}$$
(20)
By \(y^{k}=[x^{k}-\alpha_{k}F(x^{k})]_{+}\) and (9), it follows that
$$\begin{aligned} &2 \bigl\langle x^{k}-y^{k}- \alpha_{k} F\bigl(y^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle \\ &\quad\leq2 \bigl\langle x^{k}-y^{k}-\alpha_{k} F \bigl(y^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle +2 \bigl\langle y^{k}-x^{k}+\alpha_{k} F \bigl(x^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle \\ &\quad= 2\alpha_{k} \bigl\langle F\bigl(x^{k}\bigr)-F \bigl(y^{k}\bigr), z^{k+1}-y^{k} \bigr\rangle \\ &\quad\leq \alpha_{k}^{2}\bigl\| F\bigl(x^{k}\bigr)-F \bigl(y^{k}\bigr)\bigr\| ^{2}+\bigl\| z^{k+1}-y^{k} \bigr\| ^{2}. \end{aligned}$$
(21)
Substituting (21) into (20) and using (16) and (18), we deduce
$$\begin{aligned} &\bigl\| z^{k+1}-\widehat{x}\bigr\| ^{2} \\ &\quad\leq \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl\| y^{k}-x^{k} \bigr\| ^{2}+\alpha_{k}^{2}\bigl\| F\bigl(x^{k} \bigr)-F\bigl(y^{k}\bigr)\bigr\| ^{2} \\ &\quad\leq \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl\| y^{k}-x^{k} \bigr\| ^{2}+\mu^{2}\bigl\| x^{k}-y^{k} \bigr\| ^{2} \\ &\quad= \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl(1-\mu^{2} \bigr)\bigl\| y^{k}-x^{k}\bigr\| ^{2} \\ &\quad\leq\bigl\| x^{k}-\widehat{x}\bigr\| ^{2}. \end{aligned}$$
(22)
Hence, by definition of \(\lambda_{k}\), it follows that
$$\begin{aligned} \bigl\| z^{k+1}-\widehat{x}\bigr\| \leq& \bigl\| x^{k}- \widehat{x}\bigr\| \leq \bigl\| z^{k}-\widehat{x}\bigr\| +\frac{\sqrt{n}}{2}\lambda_{k} \leq \bigl\| x^{k-1}-\widehat{x}\bigr\| +\frac{\sqrt{n}}{2}\lambda_{k} \\ \leq& \bigl\| z^{k-1}-\widehat{x}\bigr\| +\frac{\sqrt{n}}{2}(\lambda_{k}+ \lambda _{k-1})\leq \cdots \\ \leq& \bigl\| z^{0}-\widehat{x}\bigr\| +\frac{\sqrt{n}}{2}\sum _{i=0}^{k}\lambda _{i} \\ \leq& \bigl\| z^{0}-\widehat{x}\bigr\| +\frac{\sqrt{n}}{2}\frac{\lambda _{0}K_{0}}{1-\tau}:=C, \end{aligned}$$
(23)
which shows \(\{z^{k}\}\) is bounded. Together with (18) and (19), we see that \(\{x^{k}\}\) and \(\{y^{k}\}\) are both bounded.
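The last bound in (23) relies on the fact that each value of \(\lambda_{k}\) is used for \(K_{0}\) consecutive iterations before being multiplied by τ, so the partial sums are dominated by the geometric series \(K_{0}\lambda_{0}(1+\tau+\tau^{2}+\cdots)=\lambda_{0}K_{0}/(1-\tau)\). A short sketch of the schedule, with an illustrative function name, makes this easy to check numerically.

```python
def lambda_schedule(lam0, tau, K0, num):
    """Return [lambda_0, ..., lambda_{num-1}] under the update rule of Step 2."""
    lams, lam = [], lam0
    for k in range(num):
        lams.append(lam)
        if (k + 1) % K0 == 0:  # lambda_{k+1} = tau * lambda_k
            lam *= tau
    return lams
```

For example, `lambda_schedule(1.0, 0.5, 2, 5)` gives `[1.0, 1.0, 0.5, 0.5, 0.25]`, and the sum of any finite prefix of the schedule stays below \(\lambda_{0}K_{0}/(1-\tau)\).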
(ii) Now we prove \(\lim_{k\rightarrow\infty}\|x^{k}-y^{k}\|=0\). By (22) and (18), it follows that
$$\begin{aligned} \bigl(1-\mu^{2}\bigr)\bigl\| y^{k}-x^{k} \bigr\| ^{2} \leq&\bigl\| x^{k}-\widehat{x}\bigr\| ^{2}- \bigl\| z^{k+1}-\widehat {x}\bigr\| ^{2} \\ \leq&\bigl\| x^{k}-\widehat{x}\bigr\| ^{2}- \bigl(\bigl\| x^{k+1}- \widehat{x}\bigr\| -\sqrt {n}\lambda_{k+1}/{2} \bigr)^{2} \\ =&\bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl\| x^{k+1}-\widehat{x} \bigr\| ^{2}+\sqrt{n}\lambda _{k+1}\bigl\| x^{k+1}-\widehat{x}\bigr\| -n \lambda_{k+1}^{2}/4 \\ \leq& \bigl\| x^{k}-\widehat{x}\bigr\| ^{2}-\bigl\| x^{k+1}- \widehat{x}\bigr\| ^{2}+\sqrt{n}\lambda _{k+1}\bigl\| x^{k+1}- \widehat{x}\bigr\| , \end{aligned}$$
which leads to the following inequality:
$$\begin{aligned} \bigl(1-\mu^{2}\bigr)\sum_{k=0}^{\infty} \bigl\| y^{k}-x^{k}\bigr\| ^{2} \leq& \sum _{k=0}^{\infty} \bigl(\bigl\| x^{k}-\widehat{x} \bigr\| ^{2}-\bigl\| x^{k+1}-\widehat{x}\bigr\| ^{2}+\sqrt{n} \lambda_{k+1}\bigl\| x^{k+1}-\widehat{x}\bigr\| \bigr) \\ \leq& \bigl\| x^{0}-\widehat{x}\bigr\| ^{2}+\sqrt{n} \sum _{k=0}^{\infty}\lambda _{k+1}\bigl\| x^{k+1}- \widehat{x}\bigr\| \\ \leq& \bigl\| x^{0}-\widehat{x}\bigr\| ^{2}+\sqrt{n}C \sum _{k=0}^{\infty}\lambda _{k+1} \\ \leq& \bigl\| x^{0}-\widehat{x}\bigr\| ^{2}+\sqrt{n}C \frac{\lambda_{0}K_{0}}{1-\tau}< + \infty, \end{aligned}$$
where the third inequality follows from (23). Since the series converges, we have \(\lim_{k\rightarrow\infty}\|x^{k}-y^{k}\|=0\).
Since \(\{x^{k}\}\) is bounded, it has at least one cluster point. Let \(x^{*}\) be a cluster point of \(\{x^{k}\}\) and let a subsequence \(\{ x^{k_{i}}\}\) converge to \(x^{*}\). Next we show that \(x^{*}\in\operatorname{SOL}(F)\).
If there is a positive lower bound \(\alpha_{\min}\) such that \(\alpha _{k_{i}}\geq\alpha_{\min} >0\), then from Lemma 3.1(b) and (c) we get
$$\begin{aligned} \min\{1, \alpha\}\bigl\| e(x,1)\bigr\| \leq\bigl\| e(x,\alpha)\bigr\| \leq\max\{1,\alpha \}\bigl\| e(x,1)\bigr\| , \end{aligned}$$
(24)
where \(e(x,\alpha)=x-[x-\alpha F(x)]_{+}\). Together with the continuity of \(e(x,\alpha)\) in x and \(\lim_{k\rightarrow\infty}\|x^{k}-y^{k}\|=0\), we have
$$ \begin{aligned}[b] \bigl\| e\bigl(x^{*}, 1\bigr)\bigr\| &=\lim_{k_{i}\rightarrow\infty}\bigl\| e\bigl(x^{k_{i}}, 1\bigr)\bigr\| \leq\lim_{k_{i}\rightarrow\infty} \frac{\|e(x^{k_{i}}, \alpha_{k_{i}})\|}{\min\{1, \alpha_{k_{i}}\}}\\ &\leq \lim_{k_{i}\rightarrow\infty}\frac{\|e(x^{k_{i}}, \alpha _{k_{i}})\|}{\min\{1, \alpha_{\min}\}} =\lim_{k_{i}\rightarrow\infty} \frac{\|x^{k_{i}}-y^{k_{i}}\|}{\min\{ 1, \alpha_{\min}\}}=0. \end{aligned} $$
(25)
If \(\lim_{k_{i}\rightarrow\infty}\alpha_{k_{i}}=0\), then for sufficiently large \(k_{i}\) we have \(m_{k_{i}}\geq1\), so the preceding trial step \(\alpha_{k_{i}}/\gamma\) violates (16). By Lemma 3.1(b), we get
$$\begin{aligned} \bigl\| e\bigl(x^{k_{i}}, 1\bigr)\bigr\| \leq\frac{\|e(x^{k_{i}}, \frac{1}{\gamma}\alpha_{k_{i}})\| }{\frac{1}{\gamma}\alpha_{k_{i}}}< \frac{1}{\mu}\bigl\| F \bigl(x^{k_{i}}\bigr)-F\bigl(\overline {y}^{k_{i}}\bigr)\bigr\| , \end{aligned}$$
(26)
where \(\overline{y}^{k_{i}}=[x^{k_{i}}-\frac{1}{\gamma}\alpha_{k_{i}}F(x^{k_{i}})]_{+}\). Taking the limit in the above inequality, we have
$$\begin{aligned} \bigl\| e\bigl(x^{*}, 1\bigr)\bigr\| =\lim_{k_{i}\rightarrow\infty}\bigl\| e\bigl(x^{k_{i}}, 1 \bigr)\bigr\| \leq \lim_{k_{i}\rightarrow\infty}\frac{1}{\mu}\bigl\| F \bigl(x^{k_{i}}\bigr)-F\bigl(\overline {y}^{k_{i}}\bigr)\bigr\| =0. \end{aligned}$$
(27)
This means \(x^{*}=[x^{*} -F(x^{*})]_{+}\), and hence \(x^{*}\in\operatorname{SOL}(F)\). The proof is thus complete. □
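The fixed-point characterization \(x=[x-F(x)]_{+}\) used at the end of the proof also gives a practical test for approximate solutions: componentwise, \(e(x,1)=x-[x-F(x)]_{+}=\min\{x, F(x)\}\), so \(\|e(x,1)\|=0\) exactly when \(x\geq0\), \(F(x)\geq0\) and \(x^{\top}F(x)=0\). The sketch below evaluates this residual; `F_example` is a hypothetical test map, not one from the original.

```python
def natural_residual(F, x):
    """Return ||e(x, 1)|| = ||x - [x - F(x)]_+|| for K = R^n_+."""
    e = [xi - max(xi - fi, 0.0) for xi, fi in zip(x, F(x))]
    return sum(t * t for t in e) ** 0.5

def F_example(x):
    """Hypothetical test map F(x) = Mx + q, M = [[2, 1], [1, 2]], q = (-1, 1)."""
    return [2.0 * x[0] + x[1] - 1.0, x[0] + 2.0 * x[1] + 1.0]
```

For this map the residual vanishes at \((0.5, 0)\), where \(F(x)=(0,1.5)\), and is positive at a non-solution such as \((1,1)\).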