Split proximal linearized algorithm and convergence theorems for the split DC program

Chuang, Chih-Sheng; Chen, Chi-Ming

doi:10.1186/s13660-019-2084-9

Research
Open access
Published: 22 May 2019

Split proximal linearized algorithm and convergence theorems for the split DC program

Chih-Sheng Chuang¹ &
Chi-Ming Chen²

Journal of Inequalities and Applications volume 2019, Article number: 148 (2019) Cite this article

775 Accesses
1 Citations
Metrics details

Abstract

In this paper, we study the split DC program by using the split proximal linearized algorithm. Further, linear convergence theorem for the proposed algorithm is established under suitable conditions. As applications, we first study the DC program (DCP). Finally, we give numerical results for the proposed convergence results.

1 Introduction

First, we recall the minimization problem for convex functions:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min\bigl\{ f(x): x\in H\bigr\} , $$

(MP1)

where H is a real Hilbert space and $f: H\rightarrow(-\infty, \infty]$ is a proper, lower semicontinuous, and convex function. This is a classical convex minimization problem with many applications. To study this problem, Martinet [11] introduced the proximal point algorithm

$$ x_{n+1}=\operatorname{arg}\min_{y\in H} \biggl\{ f(y)+ \frac{1}{2\beta_{n}} \Vert y-x_{n} \Vert ^{2} \biggr\} ,\quad n\in\mathbb{N}, $$

(PPA)

and showed that $\{x_{n}\}_{n\in\mathbb{N}}$ converges weakly to a minimizer of f under suitable conditions. This algorithm is useful, however, only for convex problems, because the idea for this algorithm is based on the monotonicity of subdifferential operators of convex functions. So, it is important to consider the relation between nonconvex problems and a proximal point algorithm.

The following is a well-known nonconvex problem, known as DC program:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in\mathbb{R}^{n}}\bigl\{ f(x)=g(x)-h(x) \bigr\} , $$

(DCP)

where $g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}$ are proper lower semicontinuous and convex functions. Here, the function f is called a DC function, and functions g and h are called DC components of f. In the DC program, the convention $(+\infty)-(+\infty)=+\infty$ has been adopted to avoid the ambiguity $(+\infty)-(+\infty)$ that does not present any interest. It is well known that a necessary condition for $x\in\operatorname{dom}(f):=\{x\in\mathbb{R}^{n}: f(x)<\infty\}$ to be a local minimizer of f is $\partial h(x)\subseteq \partial g(x)$, where $\partial g(x)$ and $\partial h(x)$ are the subdifferentials of g and h, respectively (see Definition 2.4). But this condition is hard to be reached. So, many researchers focus their attention on finding points such that $\partial h(x)\cap\partial g(x) \neq\emptyset$, where x is called a critical point of f [8].

It is worth mentioning the richness of the class of DC functions which is a subspace containing the class of lower-$\mathcal{C}^{2}$ functions. In particular, $\mathcal{DC}(\mathbb{R}^{n})$ contains the space $\mathcal{C}^{1,1}$ of functions whose gradient is locally Lipschitz continuous. Further, $\mathcal{DC}(\mathbb{R}^{n})$ is closed under the operations usually considered in optimization. For example, a linear combination, a finite supremum, or the product of two DC functions remain DC. It is also known that the set of DC functions defined on a compact convex set of $\mathbb{R}^{n}$ is dense in the set of continuous functions on this set.

We also observed that the interest in the theory of DC functions has much increased in the last years. Some interesting optimality conditions and duality theorems related to the DC program have been given (for example, see [6, 7, 14]). Some algorithms for the DC program are proposed to analyze and solve a variety of highly structured and practical problems (for example, see [13]).

In 2003, Sun, Sampaio, and Candido [16] gave the following algorithm to study problem (DCP).

Algorithm 1.1

(Proximal point algorithm for (DCP) [16])

Let $\{\beta_{n}\} _{n\in\mathbb{N}}$ be a sequence in $(0,\infty)$, and let $g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}$ be proper lower semicontinuous and convex functions. Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n})\text{ and set }y_{n}=x_{n}+ \beta_{n}w_{n}, \\ x_{n+1}:=(I+\beta_{n}\partial g)^{-1}(y_{n}), \quad n\in\mathbb{N}, \\ \text{stop criteria: } x_{n+1}=x_{n}. \end{cases} $$

In 2016, Souza, Oliveira, and Soubeyran [15] gave the following algorithm to study the DC program.

Algorithm 1.2

(Proximal linearized algorithm for (DCP) [15])

Let $\{\beta_{n}\} _{n\in\mathbb{N}}$ be a sequence in $(0,\infty)$, and let $g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}$ be proper lower semicontinuous and convex functions. Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle w_{n},u-x_{n}\rangle\},\quad n\in\mathbb{N}, \\ \text{stop criteria: } x_{n+1}=x_{n}. \end{cases} $$

In fact, if h is differentiable, then it is reduced to the following:

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\},\quad n\in\mathbb{N}. \\ \text{stop criteria: } x_{n+1}=x_{n}. \end{cases} $$

Further, Souza, Oliveira, and Soubeyran [15] gave the following convergence theorem for problem (DCP).

Theorem 1.1

([15, Theorem 3])

Let $g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}\cup\{+\infty\}$ be proper, lower semicontinuous, and convex functions, and $g-h$ be bounded from below. Suppose that g is ρ-strongly convex, h is differentiable, and $\nabla h(x)$ is L-Lipschitz continuous. Let $\{\beta_{n}\}_{n\in \mathbb{N}}$ be a bounded sequence with $\liminf_{n\rightarrow \infty}\beta _{n}>0$. Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by Algorithm 1.2. If $\rho>2L$, then $\{x_{n}\}_{n\in\mathbb {N}}$ converges linearly to a critical point x̄ of problem (DCP).

In this paper, we want to study the split DC program:

$$ \text{Find } \bar{x}\in H_{1} \text{ such that }\bar{x}\in \operatorname{arg}\min _{x\in H_{1}}f_{1}(x) \text{ and } A\bar{x}\in \operatorname{arg}\min _{y\in H _{2}}f_{2}(y), $$

(SDCP)

where $H_{1}$ and $H_{2}$ are real Hilbert spaces, $A:H_{1}\rightarrow H_{2}$ is a nonzero linear and bounded mapping with adjoint operator $A^{*}$, $g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}$ are proper lower semicontinuous and convex functions, and $g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}$ are proper lower semicontinuous and convex functions, and $f_{1}(x)=g_{1}(x)-h_{1}(x)$ for all $x\in H_{1}$, and $f_{2}(y)=g_{2}(y)-h_{2}(y)$ for all $y\in H_{2}$.

Clearly, (SDCP) is a generalization of problem (DCP). Indeed, if $H_{1}=H_{2}=\mathbb{R}^{n}$, $A:\mathbb{R}^{n}\rightarrow \mathbb{R}^{n}$ is the identity mapping, $g_{1}=g_{2}$, and $h_{1}=h_{2}$, then problem (SDCP) is reduced to problem (DCP).

If $h_{1}(x)=0$ and $h_{2}(y)=0$ for all $x\in H_{1}$ and $y\in H_{2}$, then (SDCP) is reduced to the split minimization problems (SMP) for convex functions:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }g_{1}( \bar{x})= \min_{u\in H_{1}}g_{1}(u)\mbox{ and }g_{2}(A\bar{x})=\min_{v\in H_{2}}g _{2}(v), \end{aligned}$$

(SMP)

where $H_{1}$ and $H_{2}$ are real Hilbert spaces, $A:H_{1}\rightarrow H_{2}$ is a linear and bounded mapping with adjoint $A^{*}$, $g_{1}:H_{1}\rightarrow \mathbb{R}$ and $g_{2}:H_{2}\rightarrow \mathbb{R}$ are proper, lower semicontinuous, and convex functions. For example, one can see [4] and the related references.

If $H_{1}=H_{2}=H$ and $A:H\rightarrow H$ is the identity mapping, then problem (SMP) is reduced to the common minimization problem for convex functions:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})= \min _{u\in H}g_{1}(u)\mbox{ and }g_{2}(\bar{x})= \min_{v\in H}g_{2}(v), \end{aligned}$$

(CMP)

where H is a real Hilbert space, $g_{1},g_{2}:H\rightarrow \mathbb {R}$ are proper, lower semicontinuous, and convex functions. Further, if the solution set of problem (CMP) is nonempty, then problem (CMP) is equivalent to the following problem:

$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})+g _{2}(\bar{x})=\min_{u\in H}g_{1}(u)+g_{2}(u), \end{aligned}$$

(MP2)

where H is a real Hilbert space, $g_{1},g_{2}:H\rightarrow \mathbb {R}$ are proper, lower semicontinuous, and convex functions. This problem is well known and it has many important applications, including multiresolution sparse regularization, Fourier regularization, hard-constrained inconsistent feasibility, and alternating projection signal synthesis problems. For example, one can refer to [5, 9] and the related references.

On the other hand, Moudafi [12] introduced the split variational inclusion problem, which is a generalization of problem (SMP):

$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }0_{H_{1}}\in B_{1}( \bar{x})\mbox{ and }0_{H_{2}}\in B_{2}(A \bar{x}), \end{aligned}$$

(SVIP)

where $H_{1}$ and $H_{2}$ are real Hilbert spaces, $B_{1}:H_{1}\multimap H _{1}$ and $B_{2}:H_{2}\multimap H_{2}$ are set-valued maximal monotone mappings, $A:H_{1}\rightarrow H_{2}$ is a linear and bounded operator, and $A^{*}$ is the adjoint of A. Here, $0_{H_{1}}$ and $0_{H_{2}}$ are zero elements of real Hilbert spaces $H_{1}$ and $H_{2}$, respectively. To study problem (SVIP), Byrne et al. [3] gave the following algorithm and related convergence theorem in infinite dimensional Hilbert space.

Theorem 1.2

([3])

Let $H_{1}$ and $H_{2}$ be real Hilbert spaces, $A:H_{1}\rightarrow H_{2}$ be a nonzero linear and bounded operator, and $A^{*}$ denote the adjoint operator of A. Let $B_{1}:H_{1}\multimap H_{1}$ and $B_{2}:H_{2}\multimap H_{2}$ be set-valued maximal monotone mappings, $\beta>0$, and $\gamma\in(0,\frac {2}{\Vert A\Vert ^{2}})$. Let Ω be the solution set of (SVIP), and suppose that $\varOmega\neq\emptyset$. Let $\{x_{n}\} _{n\in\mathbb{N}}$ be defined by

$$ x_{n+1}:=J_{\beta}^{B_{1}}\bigl[x_{n}-\gamma A^{*}\bigl(I-J_{\beta}^{B_{2}}\bigr)Ax_{n} \bigr],\quad n\in\mathbb{N}. $$

Then $\{x_{n}\}$ converges weakly to an element $\bar{x}\in\varOmega$.

If $B_{1}=\partial g_{1}$ and $B_{2}=\partial g_{2}$ (the subdifferential of $g_{i}$, $i=1,2$), then the algorithm given by Theorem 1.2 is reduced to the following algorithm:

$$ \textstyle\begin{cases} y_{n}=\operatorname{arg}\min_{z\in H_{2}}\{g(z)+\frac{1}{2\beta_{n}} \Vert z-Ax_{n} \Vert ^{2} \}, \\ z_{n}=x_{n}-\gamma A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}=\operatorname{arg}\min_{y\in H_{1}}\{g(y)+\frac{1}{2\beta_{n}} \Vert y-z_{n} \Vert ^{2} \},\quad n\in\mathbb{N}. \end{cases} $$

In this paper, motivated by the above works on DC programs and related problems, we want to study problem (SDCP) by using the split proximal linearized algorithm:

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}, \end{cases} $$

where $H_{1}$ and $H_{2}$ are real Hilbert spaces, $A:H_{1}\rightarrow H_{2}$ is a linear and bounded mapping with adjoint $A^{*}$, $g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}$ are proper lower semicontinuous and convex functions, and $g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}$ are proper lower semicontinuous and convex functions, $g_{1}$ and $g_{2}$ are strongly convex, $h_{1}$ and $h_{2}$ are Fréchet differentiable, $\nabla h_{1}$ and $\nabla h_{2}$ are L-Lipschitz continuous, and $f_{1}(x)=g_{1}(x)-h_{1}(x)$ for all $x\in H_{1}$, and $f_{2}(y)=g _{2}(y)-h_{2}(y)$ for all $y\in H_{2}$. Further, linear convergence theorems for the proposed algorithms are established under suitable conditions. Finally, we give numerical results for the proposed convergence theorems.

2 Preliminaries

Let H be a (real) Hilbert space with the inner product $\langle \cdot,\cdot\rangle$ and the norm $\Vert \cdot\Vert $. We denote the strong and weak convergence of $\{x_{n}\}_{n\in\mathbb{N}}$ to $x\in H$ by $x_{n}\rightarrow x$ and $x_{n}\rightharpoonup x$, respectively. For each $x,y,u,v\in H$ and $\lambda\in\mathbb{R}$, we have

$$\begin{aligned}& \Vert x+y \Vert ^{2}= \Vert x \Vert ^{2}+2\langle x,y\rangle+ \Vert y \Vert ^{2}, \end{aligned}$$

(2.1)

$$\begin{aligned}& \bigl\Vert \lambda x+(1-\lambda)y \bigr\Vert ^{2}= \lambda \Vert x \Vert ^{2}+(1-\lambda) \Vert y \Vert ^{2}- \lambda(1-\lambda) \Vert x-y \Vert ^{2}, \end{aligned}$$

(2.2)

$$\begin{aligned}& 2\langle x-y,u-v\rangle= \Vert x-v \Vert ^{2}+ \Vert y-u \Vert ^{2}- \Vert x-u \Vert ^{2}- \Vert y-v \Vert ^{2}. \end{aligned}$$

(2.3)

Definition 2.1

Let H be a real Hilbert space, $B:H\rightarrow H$ be a mapping, and $\rho>0$. Thus,

(i)
B is monotone if $\langle x-y,Bx-By\rangle\geq0$ for all $x,y\in H$.
(ii)
B is ρ-strongly monotone if $\langle x-y,Bx-By \rangle\geq\rho\Vert x-y\Vert ^{2}$ for all $x,y\in H$.

Definition 2.2

Let H be a real Hilbert space and $B:H\multimap H$ be a set-valued mapping with domain $\mathcal{D}(B):=\{x\in H:B(x)\neq\emptyset\}$. Thus,

(i)
B is called monotone if $\langle u-v,x-y\rangle\geq0$ for any $u\in B(x)$ and $v\in B(y)$.
(ii)
B is maximal monotone if its graph $\{(x,y):x\in \mathcal{D}(B), y\in B(x)\}$ is not properly contained in the graph of any other monotone mapping.
(iii)
B is ρ-strongly monotone if $\langle x-y,u-v \rangle\geq\rho\Vert x-y\Vert ^{2}$ for all $x,y\in H$ and all $u\in B(x)$, and $v\in B(y)$.

Definition 2.3

Let H be a real Hilbert space, and $f:H\rightarrow\mathbb{R}$ be a function. Thus,

(i)
f is proper if $\operatorname{dom}(f):=\{x\in H: f(x)<\infty\}\neq \emptyset$.
(ii)
f is lower semicontinuous if $\{x\in H:f(x)\leq r\}$ is closed for each $r\in\mathbb{R}$.
(iii)
f is convex if $f(tx+(1-t)y)\leq t f(x)+(1-t)f(y)$ for every $x,y\in H$ and $t\in[0,1]$.
(iv)
f is ρ-strongly convex ($\rho>0$) if
$$ f\bigl(tx+(1-t)y\bigr)+\frac{\rho}{2}\cdot t(1-t) \Vert x-y \Vert ^{2}\leq tf(x)+(1-t)f(y) $$
for all $x,y\in H$ and $t\in(0,1)$.
(v)
f is Gâteaux differentiable at $x\in H$ if there is $\nabla f(x)\in H$ such that
$$ \lim_{t\rightarrow0}\frac{f(x+ty)-f(x)}{t}=\bigl\langle y, \nabla f(x) \bigr\rangle $$
for each $y\in H$.
(vi)
f is Fréchet differentiable at x if there is $\nabla f(x)$ such that
$$ \lim_{y\rightarrow0} \frac{f(x+y)-f(x)-\langle\nabla f(x),y\rangle}{ \Vert y \Vert }=0. $$

Remark 2.1

Let H be a real Hilbert space. Then $f(x):=\Vert x\Vert ^{2}$ is a 2-strongly convex function. Besides, we know $g:H\rightarrow \mathbb {R}$ is ρ-strongly convex if and only if $g-\frac{\rho}{2}\Vert \cdot \Vert ^{2}$ is convex [1, Proposition 10.6].

Example 2.1

Let $g(x):=\frac{1}{2}\langle Qx,x\rangle-\langle x,b\rangle$, where $Q\in\mathbb{R}^{n\times n}$ is a real symmetric positive definite matrix and $b\in\mathbb{R}^{n}$. Then g is a strongly convex function.

Definition 2.4

Let $f:H\rightarrow(-\infty,\infty]$ be a proper lower semicontinuous and convex function. Then the subdifferential ∂f of f is defined by

$$ \partial f(x):=\bigl\{ x^{*}\in H: f(x)+\bigl\langle y-x,x^{*} \bigr\rangle \leq f(y)\text{ for each }y\in H\bigr\} $$

for each $x\in H$.

Lemma 2.1

Let $f:H\rightarrow(-\infty,\infty]$ be a proper lower semicontinuous and convex function. Then the following are satisfied:

(i)
∂f is a set-valued maximal monotone mapping.
(ii)
f is Gâteaux differentiable at $x\in \operatorname{int}( \mathcal{D})$ if and only if $\partial f(x)$ consists of a single element. That is, $\partial f(x)=\{\nabla f(x)\}$ [2, Proposition 1.1.10].
(iii)
Suppose that f is Fréchet differentiable. Then f is convex if and only if ∇f is a monotone mapping.

Lemma 2.2

([1, Example 22.3(iv)])

Let $\rho>0$, H be a real Hilbert space and $f:H\rightarrow \mathbb {R}$ be a proper, lower-semicontinuous, and convex function. If f is ρ-strongly convex, then ∂f is ρ-strongly monotone.

Lemma 2.3

([1, Proposition 16.26])

Let H be a real Hilbert space and $f:H\rightarrow (\infty,\infty]$ be a proper, lower semicontinuous, and convex function. If $\{u_{n}\}_{n\in \mathbb{N}}$ and $\{x_{n}\}_{n\in\mathbb{N}}$ are sequences in H with $u_{n}\in\partial f(x_{n})$ for all $n\in\mathbb{N}$, and $x_{n}\rightharpoonup x$ and $u_{n}\rightarrow u$, then $u\in\partial f(x)$.

Lemma 2.4

([17, p. 114])

Let $\{a_{n}\}_{n\in \mathbb{N}}$ and $\{b_{n}\}_{n\in\mathbb{N}}$ be sequences of nonnegative real numbers. If $\sum_{n=1}^{\infty}a_{n}=\infty$ and $\sum_{n=1}^{\infty}a_{n} b_{n}<\infty$, then $\liminf_{n\rightarrow \infty }b_{n}=0$.

Lemma 2.5

([10])

Let H be a real Hilbert space, $B:H\multimap H$ be a set-valued maximal monotone mapping, $\beta >0$, and $J_{\beta}^{B}$ be defined by $J_{\beta}^{B}(x):=(I+\beta B)^{-1}(x)$ for each $x\in H$. Then $J_{\beta}^{B}$ is a single-valued mapping.

3 Split proximal linearized algorithm

Throughout this section, we use the following notations and assumptions. Let $\rho\geq L>0$. Let $H_{1}$ and $H_{2}$ be finite dimensional real Hilbert spaces, $A:H_{1}\rightarrow H_{2}$ be a nonzero linear and bounded mapping, $A^{*}$ be the adjoint of A, $g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}$ be proper lower semicontinuous and convex functions, $g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}$ be proper lower semicontinuous and convex functions, $f_{1}(x)=g_{1}(x)-h_{1}(x)$ for all $x\in H_{1}$, and $f_{2}(y)=g_{2}(y)-h_{2}(y)$ for all $y\in H_{2}$. Further, we assume that $f_{1}$ and $f_{2}$ are bounded from below, $h_{1}$ and $h_{2}$ are Fréchet differentiable, $\nabla h_{1}$ and $\nabla h _{2}$ are L-Lipschitz continuous, $g_{1}$ and $g_{2}$ are ρ-strongly convex. Let $\beta\in(0,\infty)$, and let $\{\beta_{n}\}_{n\in\mathbb{N}}$ be a sequence in $[a,b]\subseteq(0, \infty)$. Let $r\in(0,\frac{1}{\Vert A\Vert ^{2}})$ and $\{r_{n}\}_{n\in \mathbb{N}}$ be a sequence in $(0,\frac{1}{\Vert A\Vert ^{2}})$. Let $\varOmega_{\text{SDCP}}$ be defined by

$$\varOmega_{\text{SDCP}}:=\bigl\{ x\in H_{1}: \nabla h_{1}(x)\in\partial g _{1}(x), \nabla h_{2}(Ax) \in\partial g_{2}(Ax)\bigr\} , $$

and assume that $\varOmega_{\text{SDCP}}\neq\emptyset$.

Proposition 3.1

If $\rho>L$ and $\varOmega_{\mathrm{SDCP}}\neq\emptyset$, then the set $\varOmega_{\mathrm{SDCP}}$ is a singleton.

Proof

If $x,y\in\varOmega_{\mathrm{SDCP}}$, then

$$\begin{aligned}& \nabla h_{1}(x)\in\partial g_{1}(x),\qquad \nabla h_{1}(y)\in\partial g_{1}(y), \\& \nabla h_{2}(Ax)\in\partial g_{2}(Ax),\qquad \nabla h_{2}(Ay) \in\partial g_{2}(Ay). \end{aligned}$$

Since $g_{1}$ is ρ-strongly convex, we know $\partial g_{1}$ is ρ-strongly monotone. Thus,

$$ \rho \Vert x-y \Vert ^{2}\leq\bigl\langle x-y,\nabla h_{1}(x)-\nabla h_{1}(y) \bigr\rangle \leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert . $$

Since $\nabla h_{1}$ is L-Lipschitz continuous, we have

$$ \rho \Vert x-y \Vert ^{2}\leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert \leq L \Vert x-y \Vert ^{2}. $$

Since $\rho>L$, we have $x=y$. The proof is completed. □

In this section, we study the split DC program by the following algorithm.

Algorithm 3.1

(Split proximal linearized algorithm)

$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}. \end{cases} $$

Theorem 3.1

Let $\{r_{n}\}_{n\in\mathbb{N}}$ be a sequence in $(0,\frac {1}{\Vert A\Vert ^{2}})$ with $0<\liminf_{n\rightarrow \infty}r_{n} \leq \limsup_{n\rightarrow \infty}r_{n}<\frac{1}{\Vert A\Vert ^{2}}$. Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by Algorithm 3.1. Then $\{x_{n}\}_{n\in\mathbb{N}}$ converges to an element $\bar{x}\in\varOmega_{\mathrm{SDCP}}$.

Proof

Take any $w\in\varOmega_{\text{SDCP}}$ and $n\in\mathbb{N}$, and let w and n be fixed. First, from the second line of Algorithm 3.1, we get

$$ 0\in\partial g_{2}(y_{n})+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n})-\nabla h _{2}(Ax_{n}). $$

(3.1)

By (3.1), there exists $u_{n}\in\partial g _{2}(y_{n})$ such that

$$ \nabla h_{2}(Ax_{n})=u_{n}+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n}). $$

(3.2)

Since $w\in\varOmega_{\text{SDCP}}$, we know that $\nabla h_{2}(Aw) \in\partial g_{2}(Aw)$. By Lemma 2.2, $\partial g_{2}$ is ρ-strongly monotone, and then

$$ 0\leq\bigl\langle y_{n}-Aw,u_{n}-\nabla h_{2}(Aw)\bigr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$

(3.3)

By (3.2) and (3.3),

$$ 0\leq\biggl\langle y_{n}-Aw, \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax _{n}-y_{n})- \nabla h_{2}(Aw)\biggr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$

(3.4)

Hence, by (3.4), we have

$$\begin{aligned} 0 \leq&2\beta_{n}\bigl\langle y_{n}-Aw,\nabla h_{2}(Ax_{n})-\nabla h_{2}(Aw) \bigr\rangle +2 \langle y_{n}-Aw,Ax_{n}-y_{n}\rangle \\ & {}-2\beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ \leq&2\beta_{n}L \Vert y_{n}-Aw \Vert \cdot \Vert Ax_{n}-Aw \Vert -2\beta_{n}\rho \Vert y _{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2} \\ \leq& \beta_{n} L \Vert y_{n}-Aw \Vert ^{2}+ \beta_{n} L \Vert Ax_{n}-Aw \Vert ^{2}-2 \beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$

(3.5)

By (3.5), we obtain

$$\begin{aligned} \Vert y_{n}-Aw \Vert ^{2} \leq& \frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L} \Vert Ax_{n}-Aw \Vert ^{2}- \frac{ \Vert y _{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n}\rho-\beta_{n}L} \\ \leq& \Vert Ax_{n}-Aw \Vert ^{2}-\frac{ \Vert y_{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n} \rho-\beta_{n}L}. \end{aligned}$$

(3.6)

In the same way, one obtains

$$ \Vert x_{n+1}-w \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2} -\frac{1}{1+2\beta_{n}\rho- \beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2}. $$

(3.7)

Next, we have

$$\begin{aligned} 2 \Vert z_{n}-w \Vert ^{2} =& 2\bigl\langle z_{n}-w,x_{n}-r_{n} A^{*}(Ax_{n}-y_{n})-w \bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \bigl\langle z_{n}-w,A^{*}(Ax _{n}-y_{n})\bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \langle Az_{n}-Aw,Ax_{n}-y _{n}\rangle \\ =& \Vert z_{n}-w \Vert ^{2}+ \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y _{n} \Vert ^{2} \\ &{}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$

(3.8)

By (3.6), (3.7), and (3.8),

$$\begin{aligned}& \Vert x_{n+1}-w \Vert ^{2} \\& \quad \leq \Vert z_{n}-w \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}-r _{n} \Vert Ax_{n}-Aw \Vert ^{2} \\& \qquad {}+r_{n} \Vert A \Vert ^{2}\cdot \Vert z_{n}-x_{n} \Vert ^{2}+r_{n}\cdot \frac{\beta_{n}L+1}{1+2 \beta_{n}\rho-\beta_{n}L} \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(1-\frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L}\biggr) \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$

(3.9)

By (3.9), $\lim_{n\rightarrow \infty}\Vert x_{n}-w\Vert $ exists and $\{x_{n}\}_{n\in\mathbb{N}}$ is a bounded sequence. Further, $\{Ax_{n}\}_{n\in\mathbb{N}}$, $\{y_{n}\}_{n\in\mathbb{N}}$, $\{z_{n}\}_{n\in\mathbb{N}}$ are bounded sequences. By (3.9) again, we know that

$$ \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert = \lim_{n\rightarrow \infty } \Vert z_{n}-w \Vert , $$

(3.10)

and

$$ \lim_{n\rightarrow \infty}\frac{ \Vert x_{n+1}-z_{n} \Vert ^{2}}{1+2\beta _{n}\rho- \beta_{n}L}=\lim _{n\rightarrow \infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} = \lim_{n\rightarrow \infty}\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}=0. $$

(3.11)

It follows from $\{\beta_{n}\}_{n\in\mathbb{N}}\subseteq (a,b)$, $0<\liminf_{n\rightarrow \infty}r_{n}\leq\limsup_{n\rightarrow \infty}r_{n}< \frac{1}{\Vert A\Vert ^{2}}$, and (3.11) that

$$ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =\lim_{n\rightarrow \infty} \Vert Az_{n}-y _{n} \Vert =\lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$

(3.12)

Since $\{x_{n}\}_{n\in\mathbb{N}}$ is bounded, there exists a subsequence $\{x_{n_{k}}\}_{k\in\mathbb{N}}$ of $\{x_{n}\}_{n\in \mathbb{N}}$ such that $x_{n_{k}}\rightarrow \bar{x}\in H_{1}$. Clearly, $Ax_{n_{k}}\rightarrow A\bar{x}$, $z_{n_{k}}\rightarrow \bar{x}$, $Az_{n_{k}}\rightarrow A \bar{x}$, $y_{n_{k}}\rightarrow A\bar{x}$, and $x_{n_{k}+1}\rightarrow \bar{x}$. Further, by (3.2), we obtain

$$ \nabla h_{2}(Ax_{n_{k}})\in\partial g_{2}(y_{n_{k}})+\frac{1}{ \beta_{n_{k}}}(y_{n_{k}}-Ax_{n_{k}}). $$

(3.13)

By (3.12), (3.13), Lemma 2.3, and $\{\beta_{n}\}_{n\in \mathbb{N}}\subseteq (a,b)$, we determine that

$$ \nabla h_{2}(A\bar{x})\in\partial g_{2}(A \bar{x}). $$

(3.14)

Similarly, we have

$$ \nabla h_{1}(\bar{x})\in\partial g_{1}( \bar{x}). $$

(3.15)

By (3.14) and (3.15), we know that $\bar{x}\in\varOmega_{\text{SDCP}}$. Further, $\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty }\Vert x_{n_{k}}-\bar{x}\Vert =0$. Therefore, the proof is completed. □

Remark 3.1

(i)
In Algorithm 3.1, if $y_{n}=Ax _{n}$ and $x_{n+1}=z_{n}$, then $x_{n}=z_{n}$, and this implies that $\nabla h_{1}(x_{n})\in\partial g_{1}(x_{n})$ and $\nabla h_{2}(Ax _{n})\in\partial g_{2}(Ax_{n})$. Thus, $x_{n}\in\varOmega_{\text{SDCP}}$.
(ii)
In Algorithm 3.1, if $x_{n+1} \neq z_{n}$, then $f_{1}(x_{n+1})< f_{1}(z_{n})$. Indeed, it follows from $\partial h_{1}(z_{n})=\{\nabla h_{1}(z_{n})\}$ and the definition of $x_{n+1}$ that
$$ g_{1}(x_{n+1})-h_{1}(x_{n+1})+ \frac{1}{2\beta_{n}} \Vert x_{n+1}-z_{n} \Vert ^{2} \leq g_{1}(z_{n})-h_{1}(z_{n}). $$
So, if $x_{n+1}\neq z_{n}$, then $f_{1}(x_{n+1})< f_{1}(z_{n})$.
(iii)
In Algorithm 3.1, if $y_{n} \neq Ax_{n}$, then $f_{2}(y_{n})< f_{2}(Ax_{n})$. Indeed, it follows from $\partial h_{2}(Ax_{n})=\{\nabla h_{2}(Ax_{n})\}$ and the definition of $y_{n}$ that
$$ g_{2}(y_{n})-h_{2}(y_{n})+ \frac{1}{2\beta_{n}} \Vert y_{n}-Ax_{n} \Vert ^{2} \leq g_{2}(Ax_{n})-h_{2}(Ax_{n}). $$
So, if $y_{n}\neq Ax_{n}$, then $f_{2}(y_{n})< f_{2}(Ax_{n})$.
(iv)
If $\rho> L$, then it follows from (3.7) that (3.9) can be rewritten as
$$ \Vert x_{n+1}-w \Vert ^{2}\leq k_{n} \Vert z_{n}-w \Vert ^{2}\leq k_{n} \Vert x_{n}-w \Vert ^{2}, $$
where
$$ k_{n}:=\frac{1+\beta_{n}L}{1+2\beta_{n}\rho-\beta_{n}L}\in(0,1). $$
Hence, $\{x_{n}\}_{n\in\mathbb{N}}$ converges linearly to x̄, where $\varOmega_{\text{SDCP}}=\{\bar{x}\}$.

Remark 3.2

From the proof of Theorem 3.1, we know that

$$ \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax_{n}-y_{n}) \in\partial g _{2}(y_{n}), $$

(3.16)

and this implies that

$$ Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n}) \in y_{n}+\beta_{n}\partial g_{2}(y _{n})=(I_{H_{2}}+\beta_{n}\partial g_{2}) (y_{n}), $$

(3.17)

where $I_{H_{2}}$ is the identity mapping on $H_{2}$. Since $g_{2}$ is proper, lower semicontinuous, and convex, we know that $\partial g_{2}$ is maximal monotone. So, by Lemma 2.5, we determine that

$$ y_{n}=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}\bigl(Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n})\bigr). $$

(3.18)

Similarly, we have

$$ x_{n+1}=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}\bigl(z_{n}+\beta_{n} \nabla h_{1}(z_{n})\bigr), $$

(3.19)

where $I_{H_{1}}$ is the identity mapping on $H_{1}$. Therefore, Algorithm 3.1 can be rewritten as the following algorithm:

$$ \textstyle\begin{cases} y_{n}:=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}(Ax_{n}+\beta_{n} \nabla h_{2}(Ax_{n})), \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}(z_{n}+\beta_{n} \nabla h_{1}(z_{n})), \quad n\in\mathbb{N}. \end{cases} $$

(Algorithm 3.2)

In fact, we observe that the idea of Algorithm 3.2 is the same as the proposed algorithm by Sun, Sampaio, and Candido [16, Algorithm 2.3]. Hence, a modified algorithm and related convergence theorems could be presented by using the idea of [16, Algorithm 5.3].

Remark 3.3

Under the assumptions in this section, consider the following:

$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}, \end{cases} $$

(3.20)

that is,

$$ \textstyle\begin{cases} y:=(I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax)), \\ z:=x-r A^{*}(Ax-y), \\ w:=(I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z)), \end{cases} $$

(3.21)

we know that

$$ Ax=y\quad \text{and}\quad z=w \quad \Leftrightarrow\quad x=z\in \varOmega_{\text{SDCP}}. $$

(3.22)

Proof

For this equivalence relation, we only need to show that $x=z\in \varOmega_{\text{SDCP}}$ implies that $Ax=y$ and $z=w$. Indeed, since $x=z\in\varOmega_{\text{SDCP}}$, we know that $\nabla h_{1}(z)\in \partial g_{1}(z)$ and $\nabla h_{2}(Ax)\in\partial g_{2}(Ax)$. By Lemma 2.5,

$$ \textstyle\begin{cases} (I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax))=Ax, \\ (I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z))=z. \end{cases} $$

(3.23)

By (3.21) and (3.23), we know that $Ax=y$ and $z=w$. □

Remark 3.4

In Algorithm 3.1, if $\beta_{n}=\beta$ and $r_{n}=r$ for each $n\in\mathbb{N}$, and $x_{N+1}=x_{N}$ for some $N\in\mathbb{N}$, then $x_{n}=x_{N}$, $y_{n}=y_{N}$, and $z_{n}=z _{N}$ for each $n\in\mathbb{N}$ with $n\geq N$. By Theorem 3.1, we know that $\lim_{n\rightarrow \infty}x_{n}=x _{N}\in\varOmega_{\text{SDCP}}$. So, $x_{n+1}=x_{n}$ could be set as a stop criterion in Algorithm 3.1. Further, from (3.21), we have

$$\begin{aligned} x=w\quad \Rightarrow& \quad x\in\varOmega_{\text{SDCP}} \\ \Rightarrow&\quad x\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z=w\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=w. \end{aligned}$$

This equivalence relation is important for the split DC program.

By Remark 3.4, we give the following result.

Proposition 3.2

Under the assumptions in this section, and

$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}. \end{cases} $$

(3.24)

Then $x\in\varOmega_{\mathrm{SDCP}}$ if and only if $x=w$.

Next, we give another convergence theorem for the split proximal linearized algorithm under different assumptions on $\{r_{n}\}_{n \in\mathbb{N}}$.

Theorem 3.2

Let $\{r_{n}\}_{n\in\mathbb{N}}$ be a sequence in $(0,\frac {1}{\Vert A\Vert ^{2}})$ with $\lim_{n\rightarrow \infty}r_{n}=0$ and $\sum_{n=1}^{\infty }r_{n}=\infty$. Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by Algorithm 3.1. Then $\{x_{n}\}_{n\in\mathbb{N}}$ converges to an element $\bar{x}\in\varOmega_{\mathrm{SDCP}}$.

Proof

Take any $w\in\varOmega_{\mathrm{SDCP}}$ and $n\in\mathbb{N}$, and let w and n be fixed. From the proof of Theorem 3.1, we have

$$\begin{aligned}& \nabla h_{2}(Ax_{n})-\frac{1}{\beta_{n}}(y_{n}-Ax_{n}) \in\partial g _{2}(Ax_{n}), \end{aligned}$$

(3.25)

$$\begin{aligned}& \nabla h_{1}(z_{n})-\frac{1}{\beta_{n}}(x_{n+1}-z_{n}) \in\partial g _{1}(x_{n+1}), \end{aligned}$$

(3.26)

and

$$\begin{aligned} \Vert x_{n+1}-w \Vert ^{2} \leq& \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z _{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\ &{}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2}-\frac{1}{1+2\beta_{n}\rho-\beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2} \\ \leq& \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$

(3.27)

Further, the following are satisfied:

$$ \textstyle\begin{cases} \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert \text{ exists,} \\ \{x_{n}\}_{n\in\mathbb{N}}, \{Ax_{n}\}_{n\in\mathbb{N}}, \{y_{n}\} _{n\in\mathbb{N}}, \{z_{n}\}_{n\in\mathbb{N}} \text{ are bounded sequences,} \\ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =0, \\ \lim_{n\rightarrow \infty}(1-r_{n} \Vert A \Vert ^{2}) \Vert x_{n}-z_{n} \Vert ^{2}=0. \end{cases} $$

Since $\lim_{n\rightarrow \infty}r_{n}=0$, we know that

$$ \lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$

(3.28)

By (3.27), we have

$$ \sum_{n=1}^{\infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}\leq\sum _{n=1}^{\infty }\bigl( \Vert x_{n}-w \Vert ^{2}- \Vert x_{n+1}-w \Vert ^{2}\bigr)< \infty. $$

(3.29)

By (3.29) and Lemma 2.4, we determine that

$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=0. $$

(3.30)

Then there exist a subsequence $\{y_{n_{k}}\}_{k\in\mathbb{N}}$ of $\{y_{n}\}_{n\in\mathbb{N}}$, a subsequence $\{z_{n_{k}}\}_{k\in \mathbb{N}}$ of $\{z_{n}\}_{n\in\mathbb{N}}$, and $\bar{x}\in H_{1}$ such that $z_{n_{k}}\rightarrow \bar{x}$, $y_{n_{k}}\rightarrow A\bar{x}$, and

$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=\lim_{k\rightarrow \infty} \Vert Az _{n_{k}}-y_{n_{k}} \Vert ^{2}=0. $$

(3.31)

Clearly, $x_{n_{k}}\rightarrow \bar{x}$, $x_{n_{k}+1}\rightarrow \bar{x}$, and $Ax_{n_{k}}\rightarrow A\bar{x}$. By (3.25) and (3.26), we know that $\bar{x}\in \varOmega_{\text{SDCP}}$. Thus, $\bar{x}=\hat{x}$. Since $\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert $ exists, we know $\lim_{n\rightarrow \infty}\Vert x _{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty}\Vert x_{n_{k}}-\bar{x}\Vert =0$. Therefore, the proof is completed. □

4 Application to the DC program and numerical results

Let $\rho>L\geq0$. Let H be a finite dimensional Hilbert space, $g,h:H\rightarrow \mathbb{R}$ be proper, lower semicontinuous, and convex functions. Besides, we also assume that h is Fréchet differentiable, ∇h is L-Lipschitz continuous, g is ρ-strongly convex. Let $\{\beta_{n}\}_{n\in\mathbb{N}}$ be a sequence in $(a,b)\subseteq(0,\infty)$. Let $\{r_{n}\}_{n\in \mathbb{N}}$ be a sequence in $(0,1)$ with $0<\liminf_{n\rightarrow \infty}r _{n}\leq \limsup_{n\rightarrow \infty}r_{n}<1$.

Now, we recall the DC program:

$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in H}\bigl\{ f(x)=g(x)-h(x)\bigr\} . $$

(DCP)

Let $\varOmega_{\text{DCP}}$ be defined by

$$\varOmega_{\text{DCP}}:=\bigl\{ x\in H: \nabla h(x)\in\partial g(x)\bigr\} , $$

and assume that $\varOmega_{\text{DCP}}\neq\emptyset$. If $H_{1}=H_{2}=H$, $g_{1}=g_{2}=g$, and $h_{1}=h_{2}=h$, then we get the following algorithm and convergence theorem from Algorithm 3.1 and Theorem 3.1, respectively.

Algorithm 4.1

$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),v-x_{n}\rangle\}, \\ z_{n}:=(1-r_{n})x_{n}+r_{n}y_{n}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),u-z_{n}\rangle\},\quad n\in\mathbb{N}. \end{cases} $$

Theorem 4.1

Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by Algorithm 4.1. Then $\{x_{n}\}_{n\in\mathbb{N}}$ converges to an element $\bar {x}\in\varOmega_{\mathrm{DCP}}$.

In fact, we can get the following algorithm and convergence theorem by Algorithm 4.1 and Theorem 4.1, respectively.

Algorithm 4.2

$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is given}, \\ z_{n}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),v-z_{n}\rangle\}, \\ x_{n+1}:=(1-r_{n})z_{n}+r_{n}y_{n}, \quad n\in\mathbb{N}. \end{cases} $$

Theorem 4.2

Let $\{x_{n}\}_{n\in\mathbb{N}}$ be generated by Algorithm 4.2. Then $\{x_{n}\}_{n\in\mathbb{N}}$ converges to an element $\bar {x}\in\varOmega_{\mathrm{DCP}}$.

Example 4.1

Let $g,h:\mathbb{R}^{3}\rightarrow \mathbb{R}$ be defined by $g(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}$ and $h(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}$ for all $(x_{1},x_{2},x_{3})\in\mathbb{R}^{3}$. Then $\varOmega_{\text{DCP}}:=\{x\in H: \nabla h(x)\in\partial g(x)\}=\{ (1,2,3)\}$.

Example 4.2

Let $g_{1},h_{1}:\mathbb{R}^{3}\rightarrow \mathbb{R}$ be defined by $g_{1}(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}$ and $h_{1}(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}$ for all $(x_{1},x_{2},x_{3})\in\mathbb{R}^{3}$. Let $g_{2},h_{2}:\mathbb{R}^{2}\rightarrow \mathbb{R}$ be defined by $g_{2}(y_{1},y_{2}):=y_{1}^{2}+y_{2}^{2}$ and $h_{2}(y_{1},y_{2}):=28y_{1}+64y_{2}$ for all $(y_{1},y_{2})\in\mathbb {R}^{2}$. Let $A:\mathbb{R}^{3}\rightarrow \mathbb{R}^{2}$ be defined by $A(x_{1},x_{2},x_{3})=(x_{1}+2x_{2}+3x_{3},4x_{1}+5x_{2}+6x_{3})$ for all $(x_{1},x_{2},x_{3})\in\mathbb{R}^{3}$. Here, $\Vert A\Vert \approx0.10517$ and $\varOmega_{\text{SDCP}}=\{(1,2,3)\}$.

From Table 1, we see that Algorithm 4.2 reaches the required errors only need six iterations if $\beta_{n}=500$ for all $n\in \mathbb{N}$, but Algorithm 4.2 reaches the required errors need 73 iterations if $\beta_{n}=0.1$ for all $n\in\mathbb{N}$.

Table 1 Numerical results for Example 4.1.

Full size table

From Table 2, we see that Algorithm 4.2 reaches the required errors only need seven iterations if $\beta_{n}=100$ for all $n\in\mathbb{N}$, but Algorithm 4.2 reaches the required errors need 72 iterations if $\beta_{n}=0.1$ for all $n\in\mathbb{N}$.

Table 2 Numerical results for Example 4.1.

Full size table

From Table 3, we see that Algorithm 3.1 reaches the required errors only need seven iterations if $\beta_{n}=700$ for all $n\in\mathbb{N}$, but Algorithm 3.1 reaches the required errors need 99 iterations if $\beta_{n}=0.1$ for all $n\in\mathbb{N}$.

Table 3 Numerical results for Example 4.2.

Full size table

From Table 3 and Table 4, we see that Algorithm 3.1 reaches the required errors need 283 iterations if $\beta_{n}=1$ and $r_{n}=0.05$ for all $n\in\mathbb{N}$, but Algorithm 3.1 reaches the required errors need 39 iterations if $\beta_{n}=1$ and $r_{n}=0.09$ for all $n\in\mathbb{N}$. On the other hand, for other settings of $\beta_{n}$, we know the numerical results in Table 3 and Table 4 show that there are no significant differences in the setting of $\{r_{n}\}_{n\in\mathbb{N}}$.

Table 4 Numerical results for Example 4.2.

Full size table

However, in Table 5, if $\beta_{n}=1$ for all $n\in\mathbb{N}$, then we know the numerical results have big differences in the setting of $\{r_{n}\}_{n\in\mathbb{N}}$.

Table 5 Numerical results for Example 4.2.

Full size table

References

Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
Book Google Scholar
Butnariu, D., Iusem, A.N.: Totally Convex Functions for Fixed Points Computation and Infinite Dimensional Optimization. Kluwer Academic, Norwell (2000)
Book Google Scholar
Byrne, C., Censor, Y., Gibali, A., Reich, S.: Weak and strong convergence of algorithms for the split common null point problem. J. Nonlinear Convex Anal. 13, 759–775 (2011)
MATH Google Scholar
Chuang, C.S.: Strong convergence theorems for the split variational inclusion problem in Hilbert spaces. Fixed Point Theory Appl. 2013, 350 (2013)
Article MathSciNet Google Scholar
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
Article MathSciNet Google Scholar
Fujikara, Y., Kuroiwa, D.: Lagrange duality in canonical DC programming. J. Math. Anal. Appl. 408, 476–483 (2013)
Article MathSciNet Google Scholar
Harada, R., Kuroiwa, D.: Lagrange-type in DC programming. J. Math. Anal. Appl. 418, 415–424 (2014)
Article MathSciNet Google Scholar
Hiriart-Urruty, J.B., Tuy, H.: Essays on Nonconvex Optimization. Mathematical Programming, vol. 41. North-Holland, Amsterdam (1988)
Google Scholar
Levy, A.J.: A fast quadratic programming algorithm for positive signal restoration. IEEE Trans. Acoust. Speech Signal Process. 31, 1337–1341 (1983)
Article Google Scholar
Marino, G., Xu, H.K.: Convergence of generalized proximal point algorithm. Commun. Pure Appl. Anal. 3, 791–808 (2004)
Article MathSciNet Google Scholar
Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Rev. Fr. Inform. Rech. Opér. 4(Ser R–3), 154–158 (1970)
MATH Google Scholar
Moudafi, A.: Split monotone variational inclusions. J. Optim. Theory Appl. 150, 275–283 (2011)
Article MathSciNet Google Scholar
Pham, D.T., An, L.T.H., Akoa, F.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005)
Article MathSciNet Google Scholar
Saeki, Y., Kuroiwa, D.: Optimality conditions for DC programming problems with reverse convex constraints. Nonlinear Anal. 80, 18–27 (2013)
Article MathSciNet Google Scholar
Souza, J.C.O., Oliveira, R.P., Soubeyran, A.: Global convergence of a proximal linearized algorithm for difference of convex functions. Optim. Lett. 10, 1529–1539 (2016)
Article MathSciNet Google Scholar
Sun, W., Sampaio, R.J.B., Candido, M.A.B.: Proximal point algorithm for minimization of DC functions. J. Comput. Math. 21, 451–462 (2003)
MathSciNet MATH Google Scholar
Takahashi, W.: Introduction to Nonlinear and Convex Analysis. Yokohoma Publishers, Yokohoma (2009)
MATH Google Scholar

Download references

Acknowledgements

The authors would like to thank referee(s) for many useful comments and suggestions for the improvement of the article. Prof. Chi-Ming Chen was supported by Grant No. MOST 107-2115-M-007-008 of the Ministry of Science and Technology of the Republic of China.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Applied Mathematics, National Chiayi University, Chiayi, Taiwan
Chih-Sheng Chuang
Institute of Computational and Modeling Science, National Tsing Hua University, Hsinchu, Taiwan
Chi-Ming Chen

Authors

Chih-Sheng Chuang
View author publications
You can also search for this author in PubMed Google Scholar
Chi-Ming Chen
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

The authors contributed equally and significantly in writing this paper. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Chi-Ming Chen.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

Chuang, CS., Chen, CM. Split proximal linearized algorithm and convergence theorems for the split DC program. J Inequal Appl 2019, 148 (2019). https://doi.org/10.1186/s13660-019-2084-9

Download citation

Received: 04 October 2018
Accepted: 29 April 2019
Published: 22 May 2019
DOI: https://doi.org/10.1186/s13660-019-2084-9

Split proximal linearized algorithm and convergence theorems for the split DC program

Abstract

1 Introduction

Algorithm 1.1

Algorithm 1.2

Theorem 1.1

Theorem 1.2

2 Preliminaries

Definition 2.1

Definition 2.2

Definition 2.3

Remark 2.1

Example 2.1

Definition 2.4

Lemma 2.1

Lemma 2.2

Lemma 2.3

Lemma 2.4

Lemma 2.5

3 Split proximal linearized algorithm

Proposition 3.1

Proof

Algorithm 3.1

Theorem 3.1

Proof

Remark 3.1

Remark 3.2

Remark 3.3

Proof

Remark 3.4

Proposition 3.2

Theorem 3.2

Proof

4 Application to the DC program and numerical results

Algorithm 4.1

Theorem 4.1

Algorithm 4.2

Theorem 4.2

Example 4.1

Example 4.2

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

MSC

Keywords