
The forward–backward splitting methods for variational inequalities and minimization problems in Banach spaces

Abstract

This paper studies a class of forward–backward splitting methods based on the Lyapunov distance for variational inequalities and convex minimization problems in Banach spaces.

Introduction

Let X be a reflexive, strictly convex and smooth Banach space with the dual space \(X^{*}\), \(A:X\rightrightarrows X^{*}\) a general maximal monotone operator, and C a closed convex set in X. We denote by \(N_{C}\) the normal cone to C. In this work, we study the following variational inequality problem: find x in a Banach space X such that

$$ 0\in A(x)+N_{C}(x). $$
(1)

Problem (1) is an important model for many concrete problems in machine learning and linear inverse problems, and for many nonlinear problems such as convex programming and split feasibility problems; see [3, 4, 15, 17]. The set of solutions of (1) is assumed to be nonempty and is denoted by \(\mathcal{S}\). In [3], the authors provided a generic framework, the so-called backward–backward splitting method, for solving (1) in a Hilbert space:

$$ x_{n+1}=(I+\lambda _{n}\beta _{n} \partial \varPsi )^{-1}(I+\lambda _{n}A)^{-1}(x _{n}), $$
(2)

where Ψ is a penalization function and \(\lambda _{n}\), \(\beta _{n}\) are sequences of positive parameters. In [3], convergence results have been obtained for the backward–backward splitting method (2) under the key Fenchel conjugate assumption that

$$ \sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\biggl[\varPsi ^{*}\biggl(\frac{p^{*}}{\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{p^{*}}{\beta _{n}}\biggr)\biggr]< +\infty,\quad \forall p ^{*}\in R( N_{C}), $$

where \(\varPsi ^{*}\) is the Fenchel conjugate of Ψ, \(\sigma _{C}\) is the support function of C and \(R(N_{C})\) denotes the range of \(N_{C}\). This condition somehow relates the growth of the sequence \(\{\beta _{n}\}\) to the shape of Ψ around its minimizing set C. The reader is referred to [14] for further discussion.

When the penalization function Ψ is Gâteaux differentiable, it is rather natural to consider the following forward–backward splitting method (FBS):

$$ x_{n+1} =(I+\lambda _{n}A)^{-1} \bigl(x_{n}-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\bigr). $$
(3)

The forward–backward method has the advantage of being easier to compute than the backward–backward method, which ensures enhanced applicability to real-life problems. Iterations have lower computational cost and can be computed exactly. A special case of this method is the projected subgradient algorithm aimed at solving constrained minimization problems. There have been many works concerning the problem of finding zero points of the sum of two monotone operators. For further information on forward–backward splitting methods and their applications, see [4, 6, 9, 11, 12].
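As a minimal sketch of iteration (3), assume the finite-dimensional Hilbert space \(H=\mathbb{R}^{2}\), the operator \(A(x)=x-a\) (the gradient of \(\frac{1}{2}\|x-a\|^{2}\)), the set \(C=\{x: x_{2}=0\}\) and the penalization \(\varPsi (x)=\frac{1}{2}\operatorname{dist}(x,C)^{2}=\frac{1}{2}x_{2}^{2}\). These choices, the function names, and the parameters \(\lambda _{n}=1/(n+1)\), \(\beta _{n}=n+1\) are illustrative assumptions, not taken from the cited works. In this setting the resolvent has the closed form \((I+\lambda A)^{-1}(y)=(y+\lambda a)/(1+\lambda )\).

```python
def grad_psi(x):
    # Psi(x) = 0.5 * x[1]**2 penalizes the distance to C = {x : x[1] = 0}
    return [0.0, x[1]]

def resolvent(y, lam, a):
    # (I + lam*A)^{-1}(y) for A(x) = x - a, available in closed form here
    return [(yi + lam * ai) / (1.0 + lam) for yi, ai in zip(y, a)]

def fbs_step(x, lam, beta, a):
    # forward (gradient) step on the penalty, then backward (resolvent) step on A
    g = grad_psi(x)
    y = [xi - lam * beta * gi for xi, gi in zip(x, g)]
    return resolvent(y, lam, a)

def run_fbs(x0, a, n_iter):
    x = list(x0)
    for n in range(n_iter):
        lam = 1.0 / (n + 1)    # sum of lam diverges, sum of lam^2 converges
        beta = float(n + 1)    # lam * beta = 1, below 2/L with L = 1
        x = fbs_step(x, lam, beta, a)
    return x

x_final = run_fbs([5.0, 5.0], a=[1.0, 2.0], n_iter=2000)
print(x_final)
```

In this toy instance the iterates approach \((1,0)\), the minimizer of \(\frac{1}{2}\|x-a\|^{2}\) over C, which is the solution of (1) for these data.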

Let \(X=H\) be a Hilbert space. If \(A=\partial \varPhi \) is the subdifferential of a proper, lower-semicontinuous and convex function \(\varPhi: H\rightarrow (-\infty,+\infty ] \), the variational inequality problem (1) becomes the following minimization problem:

$$ x\in \operatorname{Argmin}\bigl\{ \varPhi (z):z\in \operatorname{Argmin}\varPsi \bigr\} . $$

It is convenient to reformulate method (3) as

$$ \frac{x_{n}-x_{n+1}}{\lambda _{n}}-\beta _{n}\nabla \varPsi (x_{n})\in \partial \varPhi (x_{n+1}). $$

This is equivalent to

$$ x_{n+1} = \mathop{\operatorname{argmin}}_{y\in X}\biggl\{ \frac{1}{2} \Vert x_{n}-y \Vert ^{2}+\lambda _{n}\beta _{n}\bigl\langle \nabla \varPsi (x_{n}) ,y\bigr\rangle + \lambda _{n}\varPhi (y) \biggr\} . $$
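To see the equivalence, note that the function being minimized is convex in y, so \(y=x_{n+1}\) is a minimizer exactly when zero belongs to its subdifferential at \(x_{n+1}\). A short sketch of this optimality condition, using the sum rule for the smooth quadratic and linear terms:

$$ 0\in (x_{n+1}-x_{n})+\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})+\lambda _{n}\partial \varPhi (x_{n+1}). $$

Dividing by \(\lambda _{n}\) and rearranging recovers the inclusion displayed before the minimization formula.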

In [4], the authors prove that every sequence generated by the forward–backward splitting method converges weakly to a solution of the minimization problem if either the penalization function or the objective function is inf-compact. However, this inf-compactness assumption is not necessary. In [13], the authors prove that every sequence generated by the forward–backward splitting method converges weakly to a solution without the inf-compactness assumption.

A generalization of this method from Hilbert spaces to Banach spaces is not immediate. The main difficulty is that the inner product structure of a Hilbert space is missing in a Banach space. In [18], the authors prove that every sequence generated by a projection iterative method converges strongly to a common minimum norm solution of a variational inequality problem for an inverse strongly monotone mapping in Banach spaces.

In this paper, we extend the forward–backward splitting method (3) to a Banach space, that is,

$$ x_{n+1}=(cJ+\lambda _{n}A)^{-1} \bigl(cJ(x_{n})-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\bigr), $$
(4)

where J is the duality mapping and \(c>0\) is a constant. If \(A=\partial \varPhi \) is the subdifferential of a proper, lower-semicontinuous and convex function \(\varPhi: X\rightarrow (-\infty,+\infty ]\), the forward–backward splitting method (4) becomes

$$ \frac{cJ(x_{n})-cJ(x_{n+1})}{\lambda _{n}}-\beta _{n}\nabla \varPsi (x_{n}) \in \partial \varPhi (x_{n+1}). $$
(5)

Iterative formula (5) can be rewritten as

$$ x_{n+1} = \mathop{\operatorname{argmin}}_{y\in X}\biggl\{ \frac{c}{2}W(x_{n},y)+\lambda _{n} \beta _{n}\bigl\langle \nabla \varPsi (x_{n}) ,y\bigr\rangle + \lambda _{n}\varPhi (y)\biggr\} . $$

Throughout this paper, let \(A:X\rightrightarrows X^{*}\) be a maximal monotone operator and let the monotone operator \(T_{A,C} = A + N_{C}\) be also maximal monotone. Let \(\varPsi:X\rightarrow (-\infty,+\infty ]\) be a proper, lower-semicontinuous and convex function with \(C= \operatorname{argmin}(\varPsi )\neq \emptyset \) and \(\min (\varPsi )=0\). We assume that Ψ is Gâteaux differentiable and that \(\nabla \varPsi \) is L-Lipschitz continuous on the domain of Ψ. We also assume that there exists \(c>0\) such that

$$ cW(x,y)\geq \Vert x-y \Vert ^{2}, \quad\forall x,y \in X. $$
(6)

The objective of the present paper is to extend the forward–backward splitting method for problem (1), so far limited to Hilbert spaces, to the general framework of Banach spaces. The paper is organized as follows. In Sect. 2, we provide some preliminary results. In Sect. 3, we present the forward–backward splitting method and prove its convergence. Section 4 is devoted to an application of our result to the convex minimization problem. Finally, in Sect. 5, we prove a convergence result without the Fenchel conjugate assumption.

Preliminaries

Let f be a proper, lower-semicontinuous and convex function on a Banach space X. The subdifferential of f at \(x\in X\) is the convex set

$$ \partial f(x)=\bigl\{ x^{*}\in X^{*}:\bigl\langle x^{*},y-x\bigr\rangle \leq f(y)-f(x), \forall y\in X\bigr\} . $$

It is easy to verify that \(0\in \partial f(\hat{x})\) if and only if \(f(\hat{x})=\min_{x\in X}f(x)\). We denote by \(f^{*}\) the Fenchel conjugate of f:

$$ f^{*}\bigl(x^{*}\bigr)=\sup_{x\in X}\bigl\{ \bigl\langle x^{*},x\bigr\rangle -f(x)\bigr\} . $$

Given a nonempty closed convex set \(C\subset X\), its indicator function is defined as \(\delta _{C}(x)=0\) if \(x\in C\), and +∞ otherwise. The support function of C at a point \(x^{*}\) is \(\sigma _{C}(x^{*})= \sup_{y\in C}\langle x^{*},y\rangle \). Then the normal cone of C at \(x\in X\) is \(N_{C}(x)=\partial \delta _{C}(x)=\{x^{*}\in X^{*}| \langle x^{*},y-x\rangle \leq 0, \forall y\in C\}\).

The duality mapping \(J \colon X \rightrightarrows X^{*}\) is defined by

$$ J(x)=\bigl\{ x^{*}\in X^{*}|\bigl\langle x^{*},x \bigr\rangle = \bigl\Vert x^{*} \bigr\Vert \Vert x \Vert , \bigl\Vert x^{*} \bigr\Vert = \Vert x \Vert \bigr\} ,\quad \forall x\in X. $$

The Hahn–Banach theorem guarantees that \(J(x)\neq \emptyset \) for every \(x\in X\). It is clear that \(J(x)=\partial (\frac{1}{2}\|\cdot \|^{2})(x)\) for all \(x\in X\). It is well known that if X is smooth, then J is single-valued and is norm-to-weak star continuous. It is also well known that if a Banach space X is reflexive, strictly convex and smooth, then the duality mapping \(J^{*}\) from \(X^{*} \) into X is the inverse of J, that is, \(J^{-1}=J^{*}\). Properties of the duality mapping have been given in [1, 2, 8, 17].
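For a concrete picture, consider \(X=\mathbb{R}^{n}\) with the p-norm, \(1<p<\infty \), which is reflexive, strictly convex and smooth. The following is a minimal sketch, assuming the standard closed form \(J(x)_{i}=\|x\|_{p}^{2-p}|x_{i}|^{p-1}\operatorname{sign}(x_{i})\) of the duality mapping on this space; the function names and sample vector are hypothetical. It checks the two defining relations of J and that the duality mapping of the dual space (exponent \(q=p/(p-1)\)) inverts J.

```python
import math

def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def duality_map(x, p):
    # Duality mapping of (R^n, ||.||_p) for 1 < p < inf (assumed closed form)
    nx = p_norm(x, p)
    if nx == 0.0:
        return [0.0 for _ in x]
    return [nx ** (2 - p) * abs(t) ** (p - 1) * math.copysign(1.0, t) for t in x]

p = 1.5
q = p / (p - 1.0)            # conjugate exponent of p
x = [1.0, -2.0, 0.5]

jx = duality_map(x, p)
pairing = sum(s * t for s, t in zip(jx, x))   # <J(x), x>
x_back = duality_map(jx, q)                   # J* applied to J(x)
```

Up to rounding, `pairing` equals \(\|x\|_{p}^{2}\), the q-norm of `jx` equals \(\|x\|_{p}\), and `x_back` reproduces `x`.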

Let X be a smooth Banach space. Alber [1, 2] considered the following Lyapunov distance function:

$$ W(x,y)= \Vert x \Vert ^{2}-2\bigl\langle J(x),y\bigr\rangle + \Vert y \Vert ^{2},\quad \forall x,y \in X. $$

It is obvious from the definition of W that

$$ \bigl( \Vert x \Vert - \Vert y \Vert \bigr)^{2}\leq W(x,y)\leq \bigl( \Vert x \Vert + \Vert y \Vert \bigr)^{2}, \quad\forall x,y\in X. $$

We also know that

$$ W(x,y)=W(x,z)+W(z,y)+2\bigl\langle J(x)-J(z),z-y\bigr\rangle , \quad\forall x,y,z \in X. $$
(7)
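The two-sided bound on W and identity (7) can be checked numerically. The sketch below assumes \(X=\mathbb{R}^{3}\) with the p-norm and the standard closed form of its duality mapping; the helper names and test vectors are illustrative assumptions.

```python
import math

def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def duality_map(x, p):
    # Duality mapping of (R^n, ||.||_p) for 1 < p < inf (assumed closed form)
    nx = p_norm(x, p)
    if nx == 0.0:
        return [0.0 for _ in x]
    return [nx ** (2 - p) * abs(t) ** (p - 1) * math.copysign(1.0, t) for t in x]

def pair(xs, y):
    return sum(s * t for s, t in zip(xs, y))

def lyapunov(x, y, p):
    # W(x, y) = ||x||^2 - 2 <J(x), y> + ||y||^2
    return p_norm(x, p) ** 2 - 2.0 * pair(duality_map(x, p), y) + p_norm(y, p) ** 2

p = 3.0
x, y, z = [1.0, -2.0, 0.5], [0.3, 0.7, -1.1], [-0.4, 0.2, 2.0]

w_xy = lyapunov(x, y, p)
lower = (p_norm(x, p) - p_norm(y, p)) ** 2
upper = (p_norm(x, p) + p_norm(y, p)) ** 2

# right-hand side of identity (7)
jx, jz = duality_map(x, p), duality_map(z, p)
diff = [s - t for s, t in zip(jx, jz)]
zy = [s - t for s, t in zip(z, y)]
rhs_7 = lyapunov(x, z, p) + lyapunov(z, y, p) + 2.0 * pair(diff, zy)
```

Here `w_xy` lies between `lower` and `upper`, and `rhs_7` agrees with `w_xy` up to rounding.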

A set-valued mapping \(A:X\rightrightarrows X^{*}\) is said to be a monotone operator if \(\langle x^{*}-y^{*},x-y\rangle \geq 0\), for all \(x^{*}\in A(x)\) and for all \(y^{*}\in A(y)\). It is maximal monotone if its graph is not properly contained in the graph of any other monotone operator. The subdifferential of a proper, lower-semicontinuous and convex function is maximal monotone. The inverse \(A^{-1}:X^{*}\rightrightarrows X\) of A is defined by \(x\in A^{-1}(x^{*})\Leftrightarrow x^{*} \in A(x)\). The operator \(J +\lambda A\) is surjective for any maximal monotone operator \(A:X\rightrightarrows X^{*}\) and for any \(\lambda > 0\) by Minty’s Theorem. The operator \((J +\lambda A)^{-1}\) is nonexpansive and everywhere defined. If X is a reflexive, strictly convex and smooth Banach space and A is a maximal monotone operator, then for each \(\lambda >0\) and \(x\in X\), there is a unique element \(\bar{x}\in X\) satisfying \(J(x)\in J(\bar{x})+\lambda A(\bar{x})\) (see [16]). An operator \(A:X\rightrightarrows X^{*}\) is strongly monotone with parameter \(\alpha >0\) if

$$ \bigl\langle x^{*}-y^{*},x-y\bigr\rangle \geq \alpha \Vert x-y \Vert ^{2}, \quad\forall x^{*}\in A(x), y^{*}\in A(y). $$

Observe that the set of zeroes of a maximal monotone operator which is strongly monotone must contain exactly one element.

Let \(A:X\rightrightarrows X^{*}\) be a maximal monotone operator. Suppose the operator \(T_{A,C} = A + N_{C} \) is maximal monotone and \(\mathcal{S} = (T_{A,C})^{-1}(0)\neq \emptyset \). By maximal monotonicity of \(T_{A,C}\), we know that

$$ z\in \mathcal{S}\quad\Longleftrightarrow\quad \bigl\langle 0-\omega ^{*}, z-u \bigr\rangle \geq 0,\quad\forall \bigl(u,\omega ^{*}\bigr)\in T_{A,C}, $$

that is,

$$ z\in \mathcal{S}\quad\Longleftrightarrow\quad \bigl\langle 0-\omega ^{*}, z-u \bigr\rangle \geq 0, \quad\forall u\in \operatorname{dom}(T_{A,C})=C\cap \operatorname{dom} A, \forall \omega ^{*}\in T_{A,C}(u). $$

If \(A=\partial \varPhi \) is the subdifferential of a proper, lower-semicontinuous and convex function \(\varPhi: X\rightarrow (- \infty,+\infty ]\) and if \(u\in \mathcal{S}\), then there exists \(u^{*}\in N_{C}(u)\) such that \(-u^{*}\in \partial \varPhi (u)\). Hence, by \(u^{*}\in N_{C}(u)\Rightarrow \sigma _{C}(u^{*})=\langle u^{*},u\rangle \), one has

$$ \varPhi (x)\geq \varPhi (u)+\bigl\langle -u^{*},x-u\bigr\rangle = \varPhi (u)+\sigma _{C}\bigl(u ^{*}\bigr)-\bigl\langle u^{*},x\bigr\rangle \geq \varPhi (u), \quad\forall x\in C. $$

Thus, when \(A=\partial \varPhi \), the maximal monotonicity of \(T_{A,C}\) implies

$$ \mathcal{S} =\operatorname{Argmin} \bigl\{ \varPhi (x): x \in C\bigr\} . $$

The following result will be used subsequently.

Lemma 2.1 ([4])

Let \(\{a_{n}\}, \{b_{n}\}\) and \(\{\epsilon _{n}\}\) be real sequences. Assume that \(\{a_{n}\}\) is bounded from below, \(\{b_{n}\}\) is nonnegative, \(\sum_{n=1}^{\infty }|\epsilon _{n}|< +\infty \) and \(a_{n+1}-a_{n}+b_{n}\leq \epsilon _{n}\). Then \(\{a_{n}\}\) converges and \(\sum_{n=1}^{\infty }b_{n}< +\infty\).

The FBS method for variational inequalities

In this section, we first extend the Baillon–Haddad theorem (see [5]) to Banach spaces.

Lemma 3.1

Let \(\varPsi:X\rightarrow (-\infty, +\infty ]\) be a proper, lower-semicontinuous and convex function and let Ψ be Gâteaux differentiable on the domain of Ψ. The following are equivalent:

  (i) \(\nabla \varPsi \) is Lipschitz continuous with constant L.

  (ii) \(\varPsi (y)-\varPsi (x)-\langle \nabla \varPsi (x), y-x \rangle \leq \frac{L}{2}\|y-x\|^{2}, \forall x,y\in \operatorname{dom} \varPsi\).

  (iii) \(\varPsi (x)+\langle \nabla \varPsi (x), y-x\rangle +\frac{1}{2L}\|\nabla \varPsi (x)-\nabla \varPsi (y)\|^{2}\leq \varPsi (y), \forall x,y\in \operatorname{dom}\varPsi\).

  (iv) \(\nabla \varPsi \) is \(\frac{1}{L}\)-cocoercive, that is,

    $$ \bigl\langle \nabla \varPsi (x)-\nabla \varPsi (y), x-y\bigr\rangle \geq \frac{1}{L} \bigl\Vert \nabla \varPsi (x)-\nabla \varPsi (y) \bigr\Vert ^{2},\quad \forall x,y \in \operatorname{dom} \varPsi. $$

Proof

\(\mathrm{(i)}\Rightarrow {\mathrm{(ii)}}\). By the mean value theorem in integral form, we have

$$\begin{aligned} \varPsi (y)-\varPsi (x)-\bigl\langle \nabla \varPsi (x),y-x\bigr\rangle &= \int _{0}^{1} \bigl\langle \nabla \varPsi (x)-\nabla \varPsi \bigl(x+t(y-x)\bigr),x-y\bigr\rangle \,\mathrm{d}t \\ &\leq \int _{0}^{1} \bigl\Vert \nabla \varPsi \bigl(x+t(y-x)\bigr)-\nabla \varPsi (x) \bigr\Vert \Vert x-y \Vert \,\mathrm{d}t \\ &\leq \int _{0}^{1}L \Vert x-y \Vert ^{2}t\,\mathrm{d}t \\ &= \frac{L}{2} \Vert x-y \Vert ^{2}. \end{aligned}$$

\(\mathrm{(ii)}\Rightarrow {\mathrm{(iii)}}\). Let us fix \(x_{0}\in \operatorname{dom}\varPsi \). Consider the function

$$ F(y)=\varPsi (y)-\bigl\langle \nabla \varPsi (x_{0}),y\bigr\rangle . $$

Note that F is a proper, lower-semicontinuous, convex and Gâteaux differentiable function and \(\nabla F=\nabla \varPsi -\nabla \varPsi (x_{0})\) is Lipschitz continuous on \(\operatorname{dom}F=\operatorname{dom}\varPsi \) with constant L. Therefore, by \(\mathrm{(i)}\Rightarrow {\mathrm{(ii)}}\), we have

$$ F(u)-F(v)-\bigl\langle \nabla F(v), u-v\bigr\rangle \leq \frac{L}{2} \Vert u-v \Vert ^{2},\quad \forall u,v\in \operatorname{dom}F. $$
(8)

Since F is convex and \(\nabla F(x_{0})=0\), we have \(x_{0}\in \operatorname{Argmin}_{x \in X}F(x)\). Then, by (8), we have

$$ F(x_{0})\leq F\biggl(y-\frac{1}{L}J^{-1}\nabla F(y) \biggr)\leq F(y)-\frac{L}{2} \biggl\Vert \biggl(y- \frac{1}{L}J^{-1} \nabla F(y)\biggr)-y \biggr\Vert ^{2}=F(y)-\frac{1}{2L} \bigl\Vert \nabla F(y) \bigr\Vert ^{2}. $$

Hence, by \(\nabla F(y)=\nabla \varPsi (y)-\nabla \varPsi (x_{0})\), we get (iii).

\(\mathrm{(iii)}\Rightarrow {\mathrm{(iv)}}\). For any \(x,y\in \operatorname{dom} \varPsi \), by \(\mathrm{(iii)}\), we have

$$ \varPsi (x)+\bigl\langle \nabla \varPsi (x), y-x\bigr\rangle +\frac{1}{2L} \bigl\Vert \nabla \varPsi (x)-\nabla \varPsi (y) \bigr\Vert ^{2}\leq \varPsi (y) $$

and

$$ \varPsi (y)+\bigl\langle \nabla \varPsi (y), x-y\bigr\rangle +\frac{1}{2L} \bigl\Vert \nabla \varPsi (x)-\nabla \varPsi (y) \bigr\Vert ^{2}\leq \varPsi (x). $$

Adding the two inequalities, we get

$$ \bigl\langle \nabla \varPsi (x)-\nabla \varPsi (y), x-y\bigr\rangle \geq \frac{1}{L} \bigl\Vert \nabla \varPsi (x)-\nabla \varPsi (y) \bigr\Vert ^{2}. $$

\(\mathrm{(iv)}\Rightarrow {\mathrm{(i)}}\). By (iv) and the Cauchy–Schwarz inequality, \(\frac{1}{L}\|\nabla \varPsi (x)-\nabla \varPsi (y)\|^{2}\leq \|\nabla \varPsi (x)-\nabla \varPsi (y)\|\|x-y\|\), so we get \(\|\nabla \varPsi (x)-\nabla \varPsi (y)\|\leq L \|x-y\|\). □
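A quick numerical sanity check in the Hilbert case may help fix ideas. The sketch assumes \(\varPsi (x)=\frac{1}{2}\sum_{i}q_{i}x_{i}^{2}\) on \(\mathbb{R}^{3}\) with \(q_{i}\geq 0\), so that \(\nabla \varPsi (x)=(q_{i}x_{i})_{i}\) is Lipschitz continuous with \(L=\max_{i}q_{i}\); the quadratic, the sample points and the helper names are illustrative assumptions. It evaluates the upper bound (ii), the lower bound with constant \(\frac{1}{2L}\) obtained in the proof of \(\mathrm{(ii)}\Rightarrow \mathrm{(iii)}\), and the cocoercivity inequality (iv) on random pairs.

```python
import random

q = [0.5, 2.0, 3.0]      # curvature of Psi along each axis (assumed example)
L = max(q)               # Lipschitz constant of grad Psi

def psi(x):
    return 0.5 * sum(qi * xi * xi for qi, xi in zip(q, x))

def grad(x):
    return [qi * xi for qi, xi in zip(q, x)]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

def sub(u, v):
    return [s - t for s, t in zip(u, v)]

random.seed(0)
checks = []
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in q]
    y = [random.uniform(-5, 5) for _ in q]
    gx, gy = grad(x), grad(y)
    d, g = sub(y, x), sub(gx, gy)
    # Psi(y) - Psi(x) - <grad Psi(x), y - x>
    bregman = psi(y) - psi(x) - dot(gx, d)
    upper_ok = bregman <= 0.5 * L * dot(d, d) + 1e-9        # (ii)
    lower_ok = bregman >= dot(g, g) / (2.0 * L) - 1e-9      # bound with 1/(2L)
    coco_ok = dot(g, sub(x, y)) >= dot(g, g) / L - 1e-9     # (iv)
    checks.append(upper_ok and lower_ok and coco_ok)

all_hold = all(checks)
```

For this quadratic all three inequalities hold on every sampled pair, which is what the lemma predicts.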

Iterative Method 3.1

Given \(x_{0}\in X\), set

$$ x_{n+1}=(cJ+\lambda _{n}A)^{-1} \bigl(cJ(x_{n})-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\bigr), $$
(9)

where \(\{\lambda _{n}\},\{\beta _{n}\}\) are two sequences of positive real numbers with \(\sum_{n=1}^{\infty }\lambda _{n}=+\infty, \sum_{n=1} ^{\infty }\lambda _{n}^{2}<+\infty \) and \(\lambda _{n}\beta _{n}< \frac{2c}{L}\).

For our results, the following Fenchel conjugate assumption will be used subsequently:

$$ \sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\biggl[\varPsi ^{*}\biggl( \frac{p^{*}}{\beta _{n}} \biggr)-\sigma _{C}\biggl(\frac{p^{*}}{\beta _{n}}\biggr) \biggr]< +\infty,\quad \forall p ^{*}\in R(N_{C}). $$
(10)

Remark 3.1

Since \(\varPsi (x)\leq \delta _{C}(x)\) for all \(x\in X\), taking Fenchel conjugates gives \(\varPsi ^{*}(x^{*} )-\sigma _{C}(x^{*})\geq 0\) for all \(x^{*}\in X^{*}\). Hence, the terms in the sum (10) are nonnegative.

Remark 3.2

If \(\varPsi (x)=\frac{1}{2}\operatorname{dist}(x,C)^{2}\), then we have \(\varPsi ^{*}(x^{*})-\sigma _{C}(x^{*})=\frac{1}{2}\|x^{*}\|^{2}\) for all \(x^{*}\in X^{*}\) and so

$$ \sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\biggl[\varPsi ^{*}\biggl(\frac{p^{*}}{\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{p^{*}}{\beta _{n}}\biggr)\biggr]< +\infty,\quad \forall p^{*} \in R( N_{C})\quad\Longleftrightarrow\quad \sum _{n=1}^{+\infty }\frac{\lambda _{n}}{\beta _{n}}< +\infty. $$

It is easy to see that, if the sequence \(\{\beta _{n}\}\) is chosen so that \(\limsup_{n\rightarrow +\infty }\lambda _{n}\beta _{n}<+ \infty \) and \(\liminf_{n\rightarrow +\infty }\lambda _{n}\beta _{n}>0\), then

$$ \sum_{n=1}^{+\infty }\frac{\lambda _{n}}{\beta _{n}}< +\infty \quad\Longleftrightarrow \quad\sum_{n=1}^{+\infty }\lambda _{n}^{2}< +\infty. $$
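For instance, consider the illustrative choice \(\lambda _{n}=1/n\) and \(\beta _{n}=n\) (an assumption made for this example only). Then \(\lambda _{n}\beta _{n}=1\) for every n and \(\sum \lambda _{n}/\beta _{n}=\sum \lambda _{n}^{2}=\sum 1/n^{2}\). The sketch below computes partial sums: \(\sum \lambda _{n}\) keeps growing like \(\log N\), while the other two stay below \(\pi ^{2}/6\).

```python
import math

def partial_sums(n_terms):
    s_lam, s_lam_sq, s_ratio = 0.0, 0.0, 0.0
    for n in range(1, n_terms + 1):
        lam, beta = 1.0 / n, float(n)
        s_lam += lam              # partial sum of lambda_n (diverges)
        s_lam_sq += lam * lam     # partial sum of lambda_n^2 (converges)
        s_ratio += lam / beta     # partial sum of lambda_n / beta_n (converges)
    return s_lam, s_lam_sq, s_ratio

n_terms = 100000
s_lam, s_lam_sq, s_ratio = partial_sums(n_terms)
log_n = math.log(n_terms)         # lower bound for the harmonic partial sum
print(s_lam, log_n, s_lam_sq, s_ratio)
```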

Proposition 3.1

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9). Take \(u\in C\cap\operatorname{dom}A\) and \(v^{*}\in A(u)\). Then for all \(t\geq 0\), we have

$$\begin{aligned} &c W(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1})- \frac{1}{1+t} \Vert x_{n}-x _{n+1} \Vert ^{2}+\frac{2t}{1+t}\lambda _{n}\beta _{n} \varPsi (x_{n}) \\ &\quad\leq \lambda _{n}\beta _{n} \biggl((1+t)\lambda _{n}\beta _{n}- \frac{2}{L(1+t)} \biggr) \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+2\lambda _{n} \bigl\langle v^{*}, u-x_{n+1}\bigr\rangle . \end{aligned}$$
(11)

Proof

Since \(v^{*}\in A(u)\) and \(cJ(x_{n})-cJ(x_{n+1})- \lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\in \lambda _{n}A(x_{n+1})\), the monotonicity of A implies

$$ \bigl\langle cJ(x_{n})-cJ(x_{n+1})-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})- \lambda _{n}v^{*}, x_{n+1}-u\bigr\rangle \geq 0, $$
(12)

and so

$$ \bigl\langle cJ(x_{n})-cJ(x_{n+1}), u-x_{n+1}\bigr\rangle \leq \bigl\langle \lambda _{n}\beta _{n}\nabla \varPsi (x_{n})+\lambda _{n}v^{*}, u-x_{n+1}\bigr\rangle , $$

which in turn gives

$$ cW(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1}) \leq 2\lambda _{n}\bigl\langle \beta _{n}\nabla \varPsi (x_{n})+v^{*}, u-x_{n+1}\bigr\rangle . $$

Hence, we have that

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1}) \\ &\quad \leq 2\lambda _{n}\bigl\langle \beta _{n}\nabla \varPsi (x_{n}), u-x_{n} \bigr\rangle +2\lambda _{n} \bigl\langle \beta _{n}\nabla \varPsi (x_{n}), x_{n}-x _{n+1}\bigr\rangle +2\lambda _{n}\bigl\langle v^{*}, u-x_{n+1}\bigr\rangle . \end{aligned}$$
(13)

On the one hand, since \(\nabla \varPsi \) is Lipschitz continuous, by Lemma 3.1, we have

$$ \bigl\langle \nabla \varPsi (x_{n})-\nabla \varPsi (u), x_{n}-u\bigr\rangle \geq \frac{1}{L} \bigl\Vert \nabla \varPsi (x_{n})-\nabla \varPsi (u) \bigr\Vert ^{2}. $$

Hence, by \(\nabla \varPsi (u)=0\), we obtain

$$ \bigl\langle \nabla \varPsi (x_{n}), u-x_{n}\bigr\rangle \leq -\frac{1}{L} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}. $$
(14)

Since \(\varPsi (u)=0\), by \(\varPsi (u)\geq \varPsi (x_{n})+\langle \nabla \varPsi (x_{n}), u-x_{n}\rangle \), we have

$$ \bigl\langle \nabla \varPsi (x_{n}), u-x_{n}\bigr\rangle \leq -\varPsi (x_{n}). $$
(15)

For any \(t\geq 0\), by taking a convex combination of inequalities (14) and (15), we have

$$ 2\lambda _{n}\beta _{n}\bigl\langle \nabla \varPsi (x_{n}), u-x_{n}\bigr\rangle \leq - \frac{2}{L(1+t)}\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}- \frac{2t}{1+t}\lambda _{n}\beta _{n}\varPsi (x_{n}). $$
(16)

On the other hand, for the remaining term \(2\lambda _{n}\beta _{n} \langle \nabla \varPsi (x_{n}), x_{n}-x_{n+1}\rangle \), we have

$$\begin{aligned} &2\lambda _{n}\beta _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n}-x_{n+1}\bigr\rangle \\ &\quad= 2 \biggl\langle \lambda _{n}\beta _{n}\sqrt{1+t}\nabla \varPsi (x_{n}), \frac{1}{ \sqrt{1+t}}(x_{n}-x_{n+1})\biggr\rangle \\ &\quad\leq 2 \bigl\Vert \lambda _{n}\beta _{n}\sqrt{1+t} \nabla \varPsi (x_{n}) \bigr\Vert \biggl\Vert \frac{1}{ \sqrt{1+t}}(x_{n}-x_{n+1}) \biggr\Vert \\ &\quad\leq (1+t)\lambda _{n}^{2}\beta _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+ \frac{1}{(1+t)} \Vert x_{n}-x_{n+1} \Vert ^{2}. \end{aligned}$$
(17)

Inequalities (13), (16) and (17) together give

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1})- \frac{1}{1+t} \Vert x_{n}-x _{n+1} \Vert ^{2}+\frac{2t}{1+t}\lambda _{n}\beta _{n} \varPsi (x_{n}) \\ &\quad \leq \lambda _{n}\beta _{n} \biggl((1+t)\lambda _{n}\beta _{n}- \frac{2}{L(1+t)} \biggr) \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+2\lambda _{n} \bigl\langle v^{*}, u-x_{n+1}\bigr\rangle . \end{aligned}$$

 □
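Inequality (11) can be sanity-checked numerically in the Hilbert case, where \(J=I\), \(W(x,y)=\|x-y\|^{2}\) and one may take \(c=1\). The sketch below assumes \(H=\mathbb{R}^{2}\), \(A(x)=x-a\), \(C=\{x:x_{2}=0\}\) and \(\varPsi (x)=\frac{1}{2}x_{2}^{2}\) (so \(L=1\)); these data, the step sizes and all function names are illustrative assumptions. It evaluates both sides of (11) along the iteration for several values of t.

```python
def dot(u, v):
    return sum(s * r for s, r in zip(u, v))

def sub(u, v):
    return [s - r for s, r in zip(u, v)]

a = [1.0, 2.0]
L = 1.0                      # Lipschitz constant of grad Psi
u = [3.0, 0.0]               # a point of C (second coordinate is zero)
v_star = sub(u, a)           # v* = A(u) = u - a

def psi(x):
    return 0.5 * x[1] ** 2

def grad_psi(x):
    return [0.0, x[1]]

def step(x, lam, beta):
    # x_{n+1} = (I + lam*A)^{-1}(x_n - lam*beta*grad Psi(x_n)) with A(x) = x - a
    g = grad_psi(x)
    return [(xi - lam * beta * gi + lam * ai) / (1.0 + lam)
            for xi, gi, ai in zip(x, g, a)]

def gap_11(x, x_next, lam, beta, t):
    # right-hand side minus left-hand side of (11); expected to be >= 0
    d_next, d_cur, inc = sub(x_next, u), sub(x, u), sub(x, x_next)
    g = grad_psi(x)
    lhs = (dot(d_next, d_next) - dot(d_cur, d_cur) + dot(inc, inc)
           - dot(inc, inc) / (1.0 + t)
           + 2.0 * t / (1.0 + t) * lam * beta * psi(x))
    rhs = (lam * beta * ((1.0 + t) * lam * beta - 2.0 / (L * (1.0 + t))) * dot(g, g)
           + 2.0 * lam * dot(v_star, sub(u, x_next)))
    return rhs - lhs

x = [5.0, 5.0]
min_gap = float("inf")
for n in range(1, 200):
    lam, beta = 1.0 / n, 0.5 * n
    x_next = step(x, lam, beta)
    for t in (0.0, 0.1, 1.0, 10.0):
        min_gap = min(min_gap, gap_11(x, x_next, lam, beta, t))
    x = x_next
```

The smallest gap `min_gap` stays nonnegative up to rounding, in line with the proposition.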

Proposition 3.2

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9). Then, there exist \(a>0\) and \(b>0\), such that for any \(u\in C\cap \operatorname{dom} A\) and any \(v^{*}\in A(u)\), we have

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+a\bigl( \Vert x_{n}-x_{n+1} \Vert ^{2} +\lambda _{n} \beta _{n}\varPsi (x_{n})+\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}\bigr) \\ &\quad\leq b\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2}+2\lambda _{n}\bigl\langle v^{*}, u-x _{n}\bigr\rangle . \end{aligned}$$
(18)

Proof

Using the fact that \(\langle p^{*},p\rangle \leq \frac{s}{2}\|p^{*}\|^{2}+\frac{1}{2s}\|p\|^{2}\) for any \(p^{*}\in X ^{*}\), \(p\in X\) and any \(s>0\) yields

$$\begin{aligned} &2\lambda _{n}\bigl\langle v^{*}, u-x_{n+1}\bigr\rangle \\ &\quad=2\lambda _{n}\bigl\langle v^{*}, x_{n}-x_{n+1} \bigr\rangle +2\lambda _{n} \bigl\langle v^{*}, u-x_{n}\bigr\rangle \\ &\quad\leq \frac{t}{2(1+t)} \Vert x_{n}-x_{n+1} \Vert ^{2}+\frac{2(1+t)}{t}\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +2\lambda _{n}\bigl\langle v^{*}, u-x_{n}\bigr\rangle . \end{aligned}$$

Then, we get from (11) that

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1}) \\ &\qquad{}-\frac{1}{1+t} \Vert x_{n}-x_{n+1} \Vert ^{2}+\frac{2t}{1+t}\lambda _{n}\beta _{n} \varPsi (x_{n})+\frac{t}{1+t}\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x _{n}) \bigr\Vert ^{2} \\ &\quad\leq \lambda _{n}\beta _{n} \biggl((1+t)\lambda _{n}\beta _{n}- \frac{2}{L(1+t)}+\frac{t}{1+t} \biggr) \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2} \\ &\qquad{}+\frac{t}{2(1+t)} \Vert x_{n}-x_{n+1} \Vert ^{2}+\frac{2(1+t)}{t}\lambda _{n} ^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +2\lambda _{n}\bigl\langle v^{*}, u-x_{n}\bigr\rangle . \end{aligned}$$

Hence, by \(cW(x_{n},x_{n+1})\geq \|x_{n}-x_{n+1}\|^{2}\), we have

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+ \frac{t}{2+2t} \Vert x_{n}-x_{n+1} \Vert ^{2} + \frac{2t}{1+t}\lambda _{n}\beta _{n} \varPsi (x_{n})\\ &\qquad{}+\frac{t}{1+t}\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2} \\ &\quad\leq \lambda _{n}\beta _{n} \biggl((1+t)\lambda _{n}\beta _{n}- \frac{2}{L(1+t)}+\frac{t}{1+t} \biggr) \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2} + \frac{2(1+t)}{t}\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2}\\ &\qquad{} +2\lambda _{n}\bigl\langle v ^{*}, u-x_{n}\bigr\rangle . \end{aligned}$$

Since \(\lambda _{n}\beta _{n}<\frac{2c}{L}\), we have

$$ \lim_{t\rightarrow 0}\lambda _{n}\beta _{n} \biggl((1+t)\lambda _{n}\beta _{n}-\frac{2}{L(1+t)}+ \frac{t}{1+t} \biggr) =\lambda _{n}\beta _{n}\biggl( \lambda _{n}\beta _{n}-\frac{2}{L}\biggr)< 0. $$

Therefore, it suffices to take \(t_{0}>0\) small enough, then set

$$ a=\frac{t_{0}}{2(1+t_{0})},\qquad b=\frac{2(1+t_{0})}{t_{0}} $$

to obtain (18). □

Proposition 3.3

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9) and let \(u\in C\cap \operatorname{dom} A\). Take \(\omega ^{*} \in T_{A,C}(u),v^{*}\in A(u)\) and \(p^{*}\in N_{C}(u)\), such that \(v^{*} =\omega ^{*}-p^{*}\). The following inequality holds:

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u)+a\biggl( \Vert x_{n}-x_{n+1} \Vert ^{2} +\frac{\lambda _{n}\beta _{n}}{2} \varPsi (x_{n})+\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x _{n}) \bigr\Vert ^{2}\biggr) \\ &\quad \leq b\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +\frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*}\biggl(\frac{4p ^{*}}{a\beta _{n}} \biggr)-\sigma _{C}\biggl( \frac{4p^{*}}{a\beta _{n}}\biggr) \biggr). \end{aligned}$$
(19)

Proof

First observe that

$$\begin{aligned} &2\lambda _{n}\bigl\langle v^{*}, u-x_{n}\bigr\rangle -\frac{a\lambda _{n}\beta _{n}}{2}\varPsi (x_{n}) \\ &\quad =2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +2\lambda _{n}\bigl\langle p^{*}, x_{n}\bigr\rangle -\frac{a\lambda _{n}\beta _{n}}{2}\varPsi (x_{n})-2 \lambda _{n}\bigl\langle p^{*}, u\bigr\rangle \\ &\quad=2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +\frac{a\lambda _{n} \beta _{n}}{2} \biggl(\biggl\langle \frac{4}{a\beta _{n}} p^{*}, x_{n}\biggr\rangle -\varPsi (x_{n})-\biggl\langle \frac{4}{a\beta _{n}} p^{*}, u\biggr\rangle \biggr) \\ &\quad\leq 2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +\frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*} \biggl(\frac{4p^{*}}{a\beta _{n}} \biggr)-\biggl\langle \frac{4p^{*}}{a\beta _{n}}, u\biggr\rangle \biggr). \end{aligned}$$

Since \(\frac{4p^{*}}{a\beta _{n}}\in N_{C}(u)\), the support function satisfies

$$ \sigma _{C}\biggl(\frac{4p^{*}}{a\beta _{n}}\biggr)=\biggl\langle \frac{4p^{*}}{a\beta _{n}},u\biggr\rangle , $$

whence

$$\begin{aligned} & 2\lambda _{n}\bigl\langle v^{*}, u-x_{n}\bigr\rangle \\ &\quad\leq \frac{a\lambda _{n}\beta _{n}}{2}\varPsi (x_{n})+2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +\frac{a \lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*} \biggl(\frac{4p^{*}}{a\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{4p^{*}}{a\beta _{n}} \biggr) \biggr). \end{aligned}$$
(20)

Hence by (18) and (20), we obtain (19). □

Theorem 3.1

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9). Then, we have the following:

  (i) For each \(u\in \mathcal{S}\), \(\lim_{n\rightarrow +\infty }W(x_{n},u)\) exists.

  (ii) The series \(\sum_{n=1}^{+\infty }\|x_{n}-x_{n+1} \|^{2}\), \(\sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\varPsi (x_{n})\) and \(\sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\|\nabla \varPsi (x_{n})\|^{2}\) are convergent.

In particular, \(\lim_{n\rightarrow +\infty }\|x_{n}-x_{n+1}\|=0\). If, moreover, \(\liminf_{n\rightarrow +\infty }\lambda _{n}\beta _{n}>0\), then \(\lim_{n\rightarrow +\infty }\varPsi (x_{n})= \lim_{n\rightarrow +\infty }\|\nabla \varPsi (x_{n})\|=0\) and every weak cluster point of \(\{x_{n}\}\) lies in C.

Proof

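Since \(u\in \mathcal{S}\), one can take \(\omega ^{*}=0\) in (19). Because \(N_{C}(u)\) is a cone, \(\frac{4p^{*}}{a}\in R(N_{C})\), so by (10) and \(\sum_{n=1}^{\infty }\lambda _{n}^{2}<+\infty \) the right-hand side of (19) is summable, and all the conclusions follow from Lemma 2.1. □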

Theorem 3.2

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9), and let \(\{z_{k}\}\) be the sequence of weighted averages

$$ z_{k}=\frac{1}{\gamma _{k}}\sum_{n=1}^{k} \lambda _{n}x_{n}, \quad\textit{where } \gamma _{k}=\sum _{n=1}^{k}\lambda _{n}. $$

Then every weak cluster point of \(\{z_{k}\}\) lies in \(\mathcal{S}\).

Proof

Let \(u\in C\cap \operatorname{dom}A\). Take \(\omega ^{*} \in T_{A,C}(u)\), \(v^{*}\in A(u)\) and \(p^{*}\in N_{C}(u)\), so that \(v^{*} =\omega ^{*}-p^{*}\). By Proposition 3.3, we have

$$\begin{aligned} &cW(x_{n+1},u)-cW(x_{n},u) \\ &\quad \leq b\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle + \frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*}\biggl(\frac{4p^{*}}{a\beta _{n}} \biggr)-\sigma _{C}\biggl(\frac{4p^{*}}{a\beta _{n}}\biggr) \biggr). \end{aligned}$$

Hence, we obtain

$$\begin{aligned} &{-}c\frac{W(x_{1},u)}{2\gamma _{k}} \\ &\quad\leq \frac{\sum_{n=1}^{k}b\lambda _{n}^{2} \Vert v^{*} \Vert ^{2} +\sum_{n=1} ^{k}\frac{a\lambda _{n}\beta _{n}}{2} (\varPsi ^{*}(\frac{4p^{*}}{a \beta _{n}} )-\sigma _{C}(\frac{4p^{*}}{a\beta _{n}}) )}{2\gamma _{k}} +\frac{ 2\sum_{n=1}^{k}\langle \omega ^{*}, \lambda _{n}u-\lambda _{n}x_{n}\rangle }{2\gamma _{k}} \\ &\quad =\frac{\sum_{n=1}^{k}b\lambda _{n}^{2} \Vert v^{*} \Vert ^{2} +\sum_{n=1}^{k}\frac{a \lambda _{n}\beta _{n}}{2} (\varPsi ^{*}(\frac{4p^{*}}{a\beta _{n}} )- \sigma _{C}(\frac{4p^{*}}{a\beta _{n}}) )}{2\gamma _{k}} +\biggl\langle \omega ^{*}, u-\frac{\sum_{n=1}^{k}\lambda _{n}x_{n}}{\gamma _{k}} \biggr\rangle . \end{aligned}$$
(21)

Then by (10), (21) and using that \(\gamma _{k}\rightarrow +\infty \) as \(k\rightarrow +\infty \), we obtain

$$ \liminf_{k\rightarrow +\infty }\bigl\langle \omega ^{*}, u-z_{k}\bigr\rangle \geq 0. $$

Finally, if z is any weak sequential cluster point of the sequence \(\{z_{k}\}\), then \(\langle \omega ^{*}, u-z\rangle \geq 0\). Since \(\omega ^{*}\in T_{A,C}(u)\) and \(T_{A,C}\) is maximal monotone, we obtain that \(z\in \mathcal{S}\). □
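In an implementation the averages \(z_{k}\) need not be recomputed from scratch: since \(\gamma _{k}z_{k}=\gamma _{k-1}z_{k-1}+\lambda _{k}x_{k}\), one has \(z_{k}=z_{k-1}+\frac{\lambda _{k}}{\gamma _{k}}(x_{k}-z_{k-1})\). A minimal sketch of this running update follows; the class name and the sample data are hypothetical.

```python
class WeightedAverage:
    """Running average z_k = (1/gamma_k) * sum_{n<=k} lambda_n * x_n."""

    def __init__(self, dim):
        self.gamma = 0.0
        self.z = [0.0] * dim

    def update(self, x, lam):
        # gamma_k = gamma_{k-1} + lambda_k, then move z toward x by lambda_k / gamma_k
        self.gamma += lam
        w = lam / self.gamma
        self.z = [zi + w * (xi - zi) for zi, xi in zip(self.z, x)]
        return self.z

xs = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [-1.0, 4.0]]
lams = [1.0, 0.5, 1.0 / 3.0, 0.25]

avg = WeightedAverage(dim=2)
for x, lam in zip(xs, lams):
    z_running = avg.update(x, lam)

# direct evaluation of the definition, for comparison
gamma = sum(lams)
z_direct = [sum(l * x[i] for l, x in zip(lams, xs)) / gamma for i in range(2)]
```

The running value `z_running` agrees with `z_direct` up to rounding, and the update needs only the previous average and the current iterate.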

Theorem 3.3

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (9) and let A be a maximal monotone and strongly monotone operator. Then the sequence \(\{x_{n}\}\) converges strongly as \(n\rightarrow +\infty \) to a point in \(\mathcal{S}\).

Proof

Take \(u\in \mathcal{S}\subset C\cap \operatorname{dom} A\), \(v^{*}\in A(u)\), \(\omega ^{*}\in T_{A,C}(u)\) and \(p^{*}\in N_{C}(u)\), so that \(v^{*} =\omega ^{*}-p^{*}\). Since \(v^{*}\in A(u)\) and \(cJ(x_{n})-cJ(x _{n+1})-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\in \lambda _{n}A(x_{n+1})\), the strong monotonicity of A implies

$$ \bigl\langle cJ(x_{n})-cJ(x_{n+1})-\lambda _{n} \beta _{n}\nabla \varPsi (x_{n})- \lambda _{n}v^{*}, x_{n+1}-u\bigr\rangle \geq \lambda _{n}\alpha \Vert x_{n+1}-u \Vert ^{2}. $$

We follow the arguments in the proof of Proposition 3.3 to obtain successively

$$\begin{aligned} &2\lambda _{n}\alpha \Vert x_{n+1}-u \Vert ^{2}+cW(x_{n+1},u)-cW(x_{n},u) \\ &\qquad{}+a\biggl( \Vert x _{n}-x_{n+1} \Vert ^{2} + \frac{\lambda _{n}\beta _{n}}{2}\varPsi (x_{n})+\lambda _{n}\beta _{n} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}\biggr) \\ &\quad \leq b\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +2\lambda _{n}\bigl\langle \omega ^{*}, u-x_{n}\bigr\rangle +\frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*}\biggl(\frac{4p ^{*}}{a\beta _{n}} \biggr)-\sigma _{C}\biggl( \frac{4p^{*}}{a\beta _{n}}\biggr) \biggr). \end{aligned}$$
(22)

Since \(u\in \mathcal{S}\), one can take \(\omega ^{*}=0\) in (22). By \(a(\|x_{n}-x_{n+1}\|^{2} +\frac{\lambda _{n}\beta _{n}}{2}\varPsi (x_{n})+ \lambda _{n}\beta _{n}\|\nabla \varPsi (x_{n})\|^{2})\geq 0\), we have

$$\begin{aligned} &2\lambda _{n}\alpha \Vert x_{n+1}-u \Vert ^{2}+cW(x_{n+1},u)-cW(x_{n},u)\\ &\quad \leq b \lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +\frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*}\biggl( \frac{4p^{*}}{a\beta _{n}} \biggr)-\sigma _{C}\biggl(\frac{4p^{*}}{a\beta _{n}}\biggr) \biggr). \end{aligned}$$

Summation gives

$$\begin{aligned} &2\alpha \sum_{n=1}^{+\infty }\lambda _{n} \Vert x_{n+1}-u \Vert ^{2} \\ &\quad \leq cW(x _{1},u)+b\sum_{n=1}^{+\infty }\lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} +\sum_{n=1} ^{+\infty } \frac{a\lambda _{n}\beta _{n}}{2} \biggl(\varPsi ^{*}\biggl(\frac{4p ^{*}}{a\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{4p^{*}}{a\beta _{n}}\biggr) \biggr). \end{aligned}$$

The right-hand side is finite by (10) and \(\sum_{n=1}^{+\infty }\lambda _{n}^{2}<+\infty \). Since \(\sum_{n=1}^{+\infty }\lambda _{n}=+\infty \), it follows that \(\liminf_{n\rightarrow +\infty }\|x_{n+1}-u\|=0\), so there exists a subsequence \(\{x_{n_{k}}\}\subset \{x_{n}\}\) such that \(\lim_{k\rightarrow +\infty }\|x_{n_{k}}-u\|=0\). Then, \(\lim_{k\rightarrow +\infty }W(x_{n_{k}},u)=0\). Since \(\lim_{n\rightarrow +\infty }W(x_{n},u)\) exists by Theorem 3.1(i), we must have \(\lim_{n\rightarrow +\infty }W(x _{n},u)=0\). Hence, by \(cW(x_{n},u)\geq \|x_{n}-u\|^{2}\), we have \(\lim_{n\rightarrow +\infty }\|x_{n}-u\|=0\). □

The FBS method for the minimization

In this section, we consider the forward–backward splitting method in the special case where \(A=\partial \varPhi \) is the subdifferential of a proper, lower-semicontinuous and convex function \(\varPhi:X\rightarrow (-\infty,+\infty ]\). The solution set \(\mathcal{S}\) is equal to

$$ (\partial \varPhi +N_{C})^{-1}(0)= \mathop{\operatorname{Argmin}}_{C} \varPhi. $$

Iterative Method 4.1

Given \(x_{0}\in X\), set

$$ x_{n+1}=(cJ+\lambda _{n}\partial \varPhi )^{-1}\bigl(cJx_{n}-\lambda _{n}\beta _{n}\nabla \varPsi (x_{n})\bigr), $$
(23)

where \(\{\lambda _{n}\},\{\beta _{n}\}\) are two sequences of positive real numbers with \(\sum_{n=1}^{\infty }\lambda _{n}=+\infty, \sum_{n=1} ^{\infty }\lambda _{n}^{2}<+\infty\), \(\beta _{n+1}-\beta _{n}\leq K,~K>0,~0< \bar{c}\leq \lambda _{n}\beta _{n}<\frac{2c}{L}\).

We shall also make the following Fenchel conjugate assumption:

$$ \sum_{n=1}^{+\infty }\lambda _{n}\beta _{n}\biggl[\varPsi ^{*}\biggl(\frac{p^{*}}{\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{p^{*}}{\beta _{n}}\biggr)\biggr]< +\infty, \quad\forall p ^{*}\in R(N_{C}). $$
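As an illustration (an assumed example, not taken from this section), consider the Hilbert space case with \(\varPsi =\frac{1}{2}\operatorname{dist}(\cdot,C)^{2}\). Then Ψ is the infimal convolution of the indicator function of C with \(\frac{1}{2}\|\cdot \|^{2}\), so \(\varPsi ^{*}=\sigma _{C}+\frac{1}{2}\|\cdot \|^{2}\), and each term of the series becomes

$$ \lambda _{n}\beta _{n}\biggl[\varPsi ^{*}\biggl(\frac{p^{*}}{\beta _{n}} \biggr)- \sigma _{C}\biggl(\frac{p^{*}}{\beta _{n}}\biggr)\biggr]=\frac{\lambda _{n}}{2\beta _{n}} \bigl\Vert p^{*} \bigr\Vert ^{2}. $$

In this special case the assumption reduces to \(\sum_{n=1}^{+\infty }\frac{\lambda _{n}}{\beta _{n}}<+\infty \), which holds, for instance, for \(\lambda _{n}=\frac{1}{n}\) and \(\beta _{n}=n\).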

The analysis relies on the study of the sequence \(\{H_{n}(x_{n})\}\), where \(H_{n}\) is the penalized function given by \(H_{n}=\varPhi +\beta _{n}\varPsi \) for \(n\geq 1\).
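The following is a minimal numerical sketch of iteration (23) and of the penalized values \(H_{n}(x_{n})\), under assumptions that are not part of the paper: the Hilbert space \(\mathbb{R}^{2}\) with \(J=I\) and \(c=1\), the set C taken as the closed unit ball, \(\varPhi (x)=\frac{1}{2}\|x-a\|^{2}\), \(\varPsi =\frac{1}{2}\operatorname{dist}(\cdot,C)^{2}\) (so \(L=1\)), and the parameters \(\lambda _{n}=\frac{1}{n}\), \(\beta _{n}=n\). The names `project_unit_ball` and `fbs_penalty` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def project_unit_ball(x):
    """Projection onto C = closed unit ball (illustrative choice of C)."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

def fbs_penalty(x1, a, num_iters=2000):
    """Hilbert-space sketch of iteration (23) with J = I and c = 1.

    Assumed data (not from the paper):
      Phi(x) = 0.5*||x - a||^2, so (I + lam*dPhi)^{-1}(y) = (y + lam*a)/(1 + lam)
      Psi(x) = 0.5*dist(x, C)^2, so grad Psi(x) = x - P_C(x) and L = 1
      lam_n = 1/n, beta_n = n, hence lam_n*beta_n = 1 < 2c/L = 2
    """
    x = np.asarray(x1, dtype=float)
    a = np.asarray(a, dtype=float)
    H_values = []
    for n in range(1, num_iters + 1):
        lam, beta = 1.0 / n, float(n)
        residual = x - project_unit_ball(x)      # grad Psi(x_n)
        phi = 0.5 * np.dot(x - a, x - a)         # Phi(x_n)
        psi = 0.5 * np.dot(residual, residual)   # Psi(x_n)
        H_values.append(phi + beta * psi)        # H_n(x_n)
        y = x - lam * beta * residual            # forward (gradient) step
        x = (y + lam * a) / (1.0 + lam)          # backward (resolvent) step
    return x, np.array(H_values)

x_final, H_values = fbs_penalty(x1=[0.0, 3.0], a=[2.0, 0.0])
print(x_final, H_values[-1])
```

In this toy problem the minimizer of Φ over C is the projection of a onto the unit ball, so the iterates should approach \((1,0)\) and the values \(H_{n}(x_{n})\) should settle near \(\varPhi (1,0)=\frac{1}{2}\).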

Proposition 4.1

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (23). Then the sequence \(\{H_{n}(x_{n})\}\) converges as \(n\rightarrow +\infty\).

Proof

Recall that \(\frac{cJ(x_{n})-cJ(x_{n+1})}{\lambda _{n}}-\beta _{n}\nabla \varPsi (x_{n})\in \partial \varPhi (x_{n+1})\). The subdifferential inequality for Φ gives

$$ \varPhi (x_{n})\geq \varPhi (x_{n+1})+\biggl\langle \frac{cJ(x_{n})-cJ(x_{n+1})}{ \lambda _{n}}-\beta _{n}\nabla \varPsi (x_{n}), x_{n}-x_{n+1}\biggr\rangle , $$

and so

$$ \varPhi (x_{n+1})-\varPhi (x_{n})+ \frac{1}{2\lambda _{n}} \bigl(cW(x_{n+1},x_{n})+cW(x _{n},x_{n+1})\bigr)\leq \beta _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n}-x_{n+1} \bigr\rangle . $$
(24)

Then, by Lemma 3.1(ii), we have

$$ \varPsi (x_{n+1})\leq \varPsi (x_{n})+\bigl\langle \nabla \varPsi (x_{n}), x_{n+1}-x _{n}\bigr\rangle +\frac{L}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}, $$
(25)

whence

$$\begin{aligned} &\beta _{n+1}\varPsi (x_{n+1})-\beta _{n}\varPsi (x_{n}) \\ &\quad \leq \beta _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n+1}-x_{n}\bigr\rangle +\frac{L \beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}+(\beta _{n+1}- \beta _{n})\varPsi (x_{n+1}). \end{aligned}$$
(26)

Adding (24) and (26), we obtain

$$\begin{aligned} & H_{n+1}(x_{n+1})-H_{n}(x_{n})+ \frac{1}{2\lambda _{n}} \bigl(cW(x_{n+1},x _{n})+cW(x_{n},x_{n+1}) \bigr)-\frac{L\beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2} \\ &\quad \leq (\beta _{n+1}-\beta _{n})\varPsi (x_{n+1}). \end{aligned}$$
(27)

Since \(\lambda _{n}\beta _{n}<\frac{2c}{L} \) and \(cW(x,y)\geq \|x-y\| ^{2}\), \(\forall x,y \in X\), we have

$$ \frac{1}{2\lambda _{n}} \bigl(cW(x_{n+1},x_{n})+cW(x_{n},x_{n+1}) \bigr)-\frac{L \beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}\geq 0. $$
(28)

Since \(\beta _{n+1}-\beta _{n}\leq K \), by (27) and (28),

$$ H_{n+1}(x_{n+1})-H_{n}(x_{n})\leq K \varPsi (x_{n+1}). $$

By Theorem 3.1(i), we deduce that \(\{x_{n}\}\) is bounded and \(\{\varPhi (x_{n})\}\) is therefore bounded from below. Hence, the sequence \(\{H_{n}(x_{n})\}\) is also bounded from below. The right-hand side is summable by Theorem 3.1(ii), whence Lemma 2.1 implies that \(\lim_{n\rightarrow +\infty }H_{n}(x_{n})\) exists. □

Proposition 4.2

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (23). For each \(u\in C\), we have \(\sum_{n=1}^{+\infty } \lambda _{n}(H_{n+1}(x_{n+1})-\varPhi (u))<+\infty\).

Proof

First observe that

$$\begin{aligned} &H_{n+1}(x_{n+1})-\varPhi (u) \\ &\quad=\varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})- \varPhi (u)+(\beta _{n+1}-\beta _{n})\varPsi (x_{n+1})+\beta _{n}\bigl(\varPsi (x_{n+1})-\varPsi (x_{n})\bigr) \\ &\quad \leq \varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})-\varPhi (u)+K\varPsi (x_{n+1})+ \beta _{n} \bigl(\varPsi (x_{n+1})-\varPsi (x_{n})\bigr). \end{aligned}$$
(29)

Using (25), we obtain

$$\begin{aligned} \beta _{n}\bigl(\varPsi (x_{n+1})- \varPsi (x_{n})\bigr) &\leq \beta _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n+1}-x_{n}\bigr\rangle + \frac{L\beta _{n}}{2} \Vert x_{n+1}-x _{n} \Vert ^{2} \\ &\leq \beta _{n} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert \Vert x_{n+1}-x_{n} \Vert +\frac{L\beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2} \\ &\leq \frac{\beta _{n}}{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+\frac{(L+1)\beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}. \end{aligned}$$
(30)

Inequalities (29) and (30) give

$$\begin{aligned} &\lambda _{n}\bigl(H_{n+1}(x_{n+1})- \varPhi (u)\bigr) \\ &\quad\leq \lambda _{n}\bigl(\varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})-\varPhi (u)\bigr) \\ &\qquad{}+\lambda _{n}K\varPsi (x_{n+1})+\frac{\lambda _{n}\beta _{n}}{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+ \frac{(L+1)\lambda _{n}\beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}. \end{aligned}$$

Since the sequence \(\{\lambda _{n}\}\) is bounded and \(0<\bar{c}\leq \lambda _{n}\beta _{n}<\frac{2}{L} \), Theorem 3.1 implies

$$ \sum_{n=1}^{+\infty }\biggl(\lambda _{n}K\varPsi (x_{n+1})+\frac{\lambda _{n}\beta _{n}}{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2} + \frac{(L+1)\lambda _{n}\beta _{n}}{2} \Vert x_{n+1}-x_{n} \Vert ^{2}\biggr)< +\infty. $$

On the other hand, the subdifferential inequality for Φ at points u and \(x_{n+1}\) gives

$$ \varPhi (u)\geq \varPhi (x_{n+1})+\biggl\langle \frac{cJx_{n}-cJx_{n+1}}{\lambda _{n}}-\beta _{n}\nabla \varPsi (x_{n}), u-x_{n+1}\biggr\rangle . $$
(31)

Since \(\varPsi (u)=0\), the subdifferential inequality for Ψ at points u and \(x_{n}\) gives

$$ 0\geq \varPsi (x_{n})+\bigl\langle \nabla \varPsi (x_{n}), u-x_{n}\bigr\rangle = \varPsi (x_{n})+\bigl\langle \nabla \varPsi (x_{n}), u-x_{n+1}\bigr\rangle +\bigl\langle \nabla \varPsi (x_{n}), x_{n+1}-x_{n}\bigr\rangle . $$
(32)

Combining (31) and (32), we obtain

$$ 2\lambda _{n}\bigl(\varPhi (x_{n+1})+\beta _{n} \varPsi (x_{n})-\varPhi (u)\bigr)\leq 2 \langle cJx_{n}-cJx_{n+1}, x_{n+1}-u\rangle +2\lambda _{n}\beta _{n} \bigl\langle \nabla \varPsi (x_{n}), x_{n}-x_{n+1}\bigr\rangle . $$

However,

$$ 2\langle cJx_{n}-cJx_{n+1}, x_{n+1}-u\rangle =cW(x_{n},u)-cW(x_{n+1},u)-cW(x _{n},x_{n+1}) $$

and

$$ 2\lambda _{n}\beta _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n}-x_{n+1}\bigr\rangle \leq \frac{4}{L^{2}} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+ \Vert x_{n}-x_{n+1} \Vert ^{2}. $$

Hence,

$$\begin{aligned} &2\lambda _{n}\bigl(\varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})-\varPhi (u)\bigr) \\ &\quad\leq cW(x_{n},u)-cW(x_{n+1},u)-cW(x_{n},x_{n+1})+ \frac{4}{L^{2}} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+ \Vert x_{n}-x_{n+1} \Vert ^{2} \\ &\quad\leq cW(x_{n},u)-cW(x_{n+1},u)+\frac{4}{L^{2}} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}. \end{aligned}$$

We conclude that

$$ \sum_{n=1}^{m}2\lambda _{n} \bigl(\varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})- \varPhi (u)\bigr) \leq cW(x_{1},u)-cW(x_{m+1},u)+ \frac{4}{L^{2}}\sum_{n=1} ^{m} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2} $$

for \(m\geq 1\). In view of Theorem 3.1, this shows

$$ \sum_{n=1}^{+\infty }\lambda _{n}\bigl( \varPhi (x_{n+1})+\beta _{n}\varPsi (x_{n})- \varPhi (u)\bigr)< +\infty, $$

and completes the proof. □

The duality mapping J is said to be weakly continuous on a smooth Banach space if \(x_{n}\rightharpoonup x\) implies \(J(x_{n})\rightharpoonup J(x)\). This happens, for example, if X is a Hilbert space, or finite-dimensional and smooth, or \(l^{p},1< p<+\infty \). This property of Banach spaces was introduced by Browder [7]. More information can be found in [10].
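For a concrete picture of J on a finite-dimensional \(l^{p}\) space, the sketch below implements the normalized duality mapping on \((\mathbb{R}^{d},\|\cdot \|_{p})\) and numerically checks \(\langle J(x),x\rangle =\|x\|^{2}\), \(\|J(x)\|_{q}=\|x\|_{p}\) with \(\frac{1}{p}+\frac{1}{q}=1\), and the three-point identity used in the proof of Proposition 4.2. It assumes the usual Lyapunov functional \(W(x,y)=\|x\|^{2}-2\langle J(x),y\rangle +\|y\|^{2}\). The function names `duality_map` and `lyapunov` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def duality_map(x, p):
    """Normalized duality map on (R^d, ||.||_p), 1 < p < inf:
    J(x)_i = ||x||_p^(2-p) * |x_i|^(p-1) * sign(x_i), with J(0) = 0."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x, ord=p)
    if norm == 0.0:
        return np.zeros_like(x)
    return norm ** (2.0 - p) * np.abs(x) ** (p - 1.0) * np.sign(x)

def lyapunov(x, y, p):
    """W(x, y) = ||x||^2 - 2<J(x), y> + ||y||^2 (assumed definition)."""
    nx = np.linalg.norm(x, ord=p)
    ny = np.linalg.norm(y, ord=p)
    return nx ** 2 - 2.0 * np.dot(duality_map(x, p), y) + ny ** 2

rng = np.random.default_rng(0)
p = 3.0
q = p / (p - 1.0)                       # conjugate exponent
x, y, u = rng.standard_normal((3, 5))   # three random points in R^5

Jx = duality_map(x, p)
pairing_gap = abs(np.dot(Jx, x) - np.linalg.norm(x, ord=p) ** 2)
dual_norm_gap = abs(np.linalg.norm(Jx, ord=q) - np.linalg.norm(x, ord=p))

# Three-point identity with x playing the role of x_n and y of x_{n+1}:
# 2<J(x) - J(y), y - u> = W(x, u) - W(y, u) - W(x, y)
lhs = 2.0 * np.dot(duality_map(x, p) - duality_map(y, p), y - u)
rhs = lyapunov(x, u, p) - lyapunov(y, u, p) - lyapunov(x, y, p)
three_point_gap = abs(lhs - rhs)
print(pairing_gap, dual_norm_gap, three_point_gap)
```

All three gaps should vanish up to floating-point rounding. The constant c of the paper multiplies both sides of the identity and is therefore omitted here.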

Theorem 4.1

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (23). Then every weak cluster point of \(\{x_{n}\}\) lies in \(\mathcal{S}\). If the duality mapping J is weakly continuous, then \(\{x_{n}\}\) converges weakly as \(n\rightarrow +\infty \) to a point in \(\mathcal{S}\).

Proof

Since \(\sum_{n=1}^{+\infty }\lambda _{n}=+\infty \), Propositions 4.1 and 4.2 imply \(\lim_{n\rightarrow +\infty }H_{n}(x_{n})\leq \varPhi (u)\) whenever \(u\in C\). Suppose that a subsequence \(\{x_{n,k}\}\) of \(\{x_{n}\}\) converges weakly to some \(\hat{x}\) as \(k\rightarrow +\infty \). Then \(\hat{x}\in C\) by Theorem 3.1. The weak lower-semicontinuity of Φ and \(\varPhi =H_{n}-\beta _{n}\varPsi \leq H_{n}\) then give

$$ \varPhi (\hat{x})\leq \liminf_{k\rightarrow +\infty }\varPhi (x_{n,k}) \leq \liminf_{k\rightarrow +\infty }H_{n,k}(x_{n,k})= \lim _{n\rightarrow +\infty }H_{n}(x_{n})\leq \varPhi (u). $$

Therefore, \(\hat{x}\) minimizes Φ on C, and so \(\hat{x}\in \mathcal{S}\).

Clearly, the sequence \(\{x_{n}\}\) is bounded (see Theorem 3.1(i)). The space being reflexive, it suffices to prove that \(\{x_{n}\}\) has only one weak cluster point as \(n\rightarrow +\infty \). Suppose otherwise that \(x_{n,l}\rightharpoonup \bar{x}\) and \(x_{n,k}\rightharpoonup \hat{x}\). Since

$$ 2\bigl\langle J(x_{n}),\bar{x}-\hat{x}\bigr\rangle =W(x_{n},\hat{x})-W(x_{n}, \bar{x})- \Vert \hat{x} \Vert ^{2}+ \Vert \bar{x} \Vert ^{2}, $$

we deduce the existence of \(\lim_{n\rightarrow +\infty }2 \langle J(x_{n}),\bar{x}-\hat{x}\rangle \). Hence,

$$ \lim_{l\rightarrow +\infty }\bigl\langle J(x_{n,l}),\bar{x}-\hat{x} \bigr\rangle -\lim_{k\rightarrow +\infty }\bigl\langle J(x_{n,k}), \bar{x}-\hat{x}\bigr\rangle =0. $$

Since the duality mapping J is weakly continuous, we have

$$ \bigl\langle J(\bar{x})-J(\hat{x}),\bar{x}-\hat{x}\bigr\rangle =0. $$

Since X is strictly convex, we have that \(\bar{x}=\hat{x}\). □

If \(\varPhi:X\rightarrow (-\infty,+\infty ]\) is also strongly convex, that is, there exists \(\lambda >0\) such that \(t\varPhi (x)+(1-t)\varPhi (y) \geq \varPhi (tx+(1-t)y)+\lambda t(1-t)\|x-y\|^{2}\) for all \(0< t<1\) and all \(x,y\in \operatorname{dom} \varPhi \), then \(\partial \varPhi \) is strongly monotone. Hence, the following theorem follows immediately from Theorem 3.3.

Theorem 4.2

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (23) and let Φ be a proper, lower semicontinuous and strongly convex function. Then the sequence \(\{x_{n}\}\) converges strongly as \(n\rightarrow +\infty \) to a point in \(\mathcal{S}\).

Additional result

The purpose of this section is to prove a convergence result without the Fenchel conjugate assumption.

Iterative Method 5.1

Given \(x_{0}\in X\), set

$$ x_{n+1}=(cJ+\lambda _{n}A)^{-1} \bigl(cJx_{n}-\lambda _{n}\nabla \varPsi (x_{n}) \bigr), $$
(33)

where \(\{\lambda _{n}\}\) is a sequence of positive real numbers with \(\sum_{n=1}^{\infty }\lambda _{n}=+\infty, \sum_{n=1}^{\infty } \lambda _{n}^{\frac{4}{3}}<+\infty\).
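As a worked example of admissible step sizes (an illustrative choice, not prescribed by the paper), take \(\lambda _{n}=n^{-a}\) with \(a>0\). Then

$$ \sum_{n=1}^{\infty }n^{-a}=+\infty \iff a\leq 1, \qquad \sum_{n=1}^{\infty }n^{-\frac{4a}{3}}<+\infty \iff a>\frac{3}{4}, $$

so both requirements hold exactly when \(\frac{3}{4}< a\leq 1\); for example, \(\lambda _{n}=\frac{1}{n}\) is admissible.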

Keeping the notations of the preceding section, set \(z_{k}=\frac{1}{ \gamma _{k}}\sum_{n=1}^{k}\lambda _{n}x_{n}\), where \(\gamma _{k}=\sum_{n=1}^{k}\lambda _{n}\). The following gives the weak ergodic convergence of the sequence \(\{x_{n}\}\) given by (33).
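The weighted averages \(z_{k}\) can be computed from running sums. The sketch below is only an illustration of the definition; the helper name `ergodic_averages`, the sample iterates `xs`, and the step sizes \(\lambda _{n}=n^{-0.8}\) are assumptions made for the example.

```python
import numpy as np

def ergodic_averages(lams, xs):
    """Return z_k = (1/gamma_k) * sum_{n<=k} lam_n * x_n for k = 1..K.

    lams: shape (K,) positive step sizes; xs: shape (K, d) iterates.
    """
    lams = np.asarray(lams, dtype=float)
    xs = np.asarray(xs, dtype=float)
    gammas = np.cumsum(lams)                          # gamma_k
    weighted = np.cumsum(lams[:, None] * xs, axis=0)  # partial sums of lam_n * x_n
    return weighted / gammas[:, None]

# Assumed step sizes lam_n = n^(-0.8): sum lam_n diverges while
# sum lam_n^(4/3) converges, as Iterative Method 5.1 requires.
n = np.arange(1, 6, dtype=float)
lams = n ** -0.8
# Hypothetical iterates: an oscillating first coordinate, a decaying second one.
xs = np.column_stack([np.where(n % 2 == 0, 1.0, -1.0), 1.0 / n])
z = ergodic_averages(lams, xs)
print(z)
```

Each row of `z` is one average \(z_{k}\); the first row coincides with \(x_{1}\) because \(\gamma _{1}=\lambda _{1}\).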

Proposition 5.1

Let \(\{x_{n}\}\) be a sequence generated by iterative formula (33). Assume that the sequence \(\{\lambda _{n}^{\frac{1}{3}} \nabla \varPsi (x_{n})\}\) is bounded. Then every weak cluster point of \(\{z_{k}\}\) lies in \(\mathcal{S}\).

Proof

Take \(u\in C\cap \operatorname{dom} A\), \(v^{*}\in A(u)\) such that \(v^{*}= v^{*}+0\in A(u)+ N_{C}(u)=T_{A,C}(u)\). Since \(v^{*}\in A(u)\) and \(cJ(x_{n})-cJ(x_{n+1})-\lambda _{n}\nabla \varPsi (x _{n})\in \lambda _{n}A(x_{n+1})\), the monotonicity of A implies

$$ \bigl\langle cJ(x_{n})-cJ(x_{n+1})-\lambda _{n} \nabla \varPsi (x_{n})-\lambda _{n}v^{*}, x_{n+1}-u\bigr\rangle \geq 0, $$

and so

$$ \bigl\langle cJ(x_{n})-cJ(x_{n+1}), u-x_{n+1}\bigr\rangle \leq \bigl\langle \lambda _{n} \nabla \varPsi (x_{n})+\lambda _{n}v^{*}, u-x_{n+1} \bigr\rangle . $$

Then, we get from (7) that

$$ cW(x_{n+1},u)-cW(x_{n},u)+cW(x_{n},x_{n+1}) \leq 2\lambda _{n}\bigl\langle \nabla \varPsi (x_{n})+v^{*}, u-x_{n+1}\bigr\rangle . $$

By developing the right-hand side, we deduce the following inequality:

$$\begin{aligned} &2\lambda _{n}\bigl\langle \nabla \varPsi (x_{n})+v^{*}, x_{n}-u\bigr\rangle +2 \lambda _{n}\bigl\langle \nabla \varPsi (x_{n})+v^{*}, x_{n+1}-x_{n}\bigr\rangle \\ &\quad \leq cW(x_{n},u)-cW(x_{n+1},u)-cW(x_{n},x_{n+1}). \end{aligned}$$
(34)

Now, combining the facts that

$$ 2\lambda _{n}\bigl\langle \nabla \varPsi (x_{n})+v^{*}, x_{n+1}-x_{n}\bigr\rangle \geq - \Vert x_{n+1}-x_{n} \Vert ^{2}-\lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n})+v ^{*} \bigr\Vert ^{2} $$

and

$$ \bigl\langle \nabla \varPsi (x_{n})+v^{*}, x_{n}-u\bigr\rangle =\bigl\langle \nabla \varPsi (x_{n}), x_{n}-u\bigr\rangle +\bigl\langle v^{*}, x_{n}-u \bigr\rangle , $$

we derive from (34) that

$$\begin{aligned} &2\lambda _{n}\bigl\langle \nabla \varPsi (x_{n}), x_{n}-u\bigr\rangle +2\lambda _{n}\bigl\langle v^{*}, x_{n}-u\bigr\rangle -\lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x _{n})+v^{*} \bigr\Vert ^{2} \\ &\quad\leq cW(x_{n},u)-cW(x_{n+1},u)-cW(x_{n},x_{n+1})+ \Vert x_{n+1}-x_{n} \Vert ^{2}. \end{aligned}$$
(35)

Since \(u\in C\cap \operatorname{dom} A\) and \(C=\operatorname{argmin}(\varPsi )\), we have \(\nabla \varPsi (u)=0\). Hence,

$$ \bigl\langle \nabla \varPsi (x_{n}), x_{n}-u\bigr\rangle =\bigl\langle \nabla \varPsi (x _{n})- \nabla \varPsi (u), x_{n}-u\bigr\rangle \geq 0. $$
(36)

Since \(cW(x_{n},x_{n+1})\geq \|x_{n+1}-x_{n}\|^{2}\), then by (35) and (36) we have

$$ 2\lambda _{n}\bigl\langle v^{*}, x_{n}-u\bigr\rangle -\lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n})+v^{*} \bigr\Vert ^{2} \leq cW(x_{n},u)-cW(x_{n+1},u). $$

Summing up these inequalities over n from 1 to k, and dividing by \(\gamma _{k}\) gives

$$ 2\bigl\langle v^{*}, z_{k}-u\bigr\rangle \leq \frac{cW(x_{1},u)}{\gamma _{k}}+\frac{ \sum_{n=1}^{k}\lambda _{n}^{2} \Vert \nabla \varPsi (x_{n})+v^{*} \Vert ^{2}}{\gamma _{k}}. $$
(37)

Since \(\{\lambda _{n}^{\frac{1}{3}}\nabla \varPsi (x_{n})\}\) is bounded, due to \(\sum_{n=1}^{\infty }\lambda _{n}^{\frac{4}{3}}<+\infty \) and \(\{\lambda _{n}^{2}\|\nabla \varPsi (x_{n})\|^{2}\}= \{\lambda _{n}^{ \frac{4}{3}}\lambda _{n}^{\frac{2}{3}}\|\nabla \varPsi (x_{n})\|^{2}\}\), we have

$$ \sum_{n=1}^{+\infty }\lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n})+v^{*} \bigr\Vert ^{2} \leq 2 \Biggl(\sum_{n=1}^{+\infty } \lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}+\sum_{n=1}^{+\infty } \lambda _{n}^{2} \bigl\Vert v^{*} \bigr\Vert ^{2} \Biggr)< + \infty. $$
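The finiteness of both series on the right can be checked term by term; the following is a short justification under the stated hypotheses, with M a bound introduced here for illustration. Let \(M\geq \sup_{n}\lambda _{n}^{\frac{1}{3}}\|\nabla \varPsi (x_{n})\|\). Then

$$ \lambda _{n}^{2} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert ^{2}=\lambda _{n}^{\frac{4}{3}} \bigl(\lambda _{n}^{\frac{1}{3}} \bigl\Vert \nabla \varPsi (x_{n}) \bigr\Vert \bigr)^{2}\leq M^{2}\lambda _{n}^{\frac{4}{3}}, $$

which is summable. Moreover, \(\sum_{n=1}^{\infty }\lambda _{n}^{\frac{4}{3}}<+\infty \) forces \(\lambda _{n}\rightarrow 0\), so \(\lambda _{n}^{2}\leq \lambda _{n}^{\frac{4}{3}}\) for all large n and hence \(\sum_{n=1}^{+\infty }\lambda _{n}^{2}\|v^{*}\|^{2}<+\infty \). The factor 2 comes from \(\|a+b\|^{2}\leq 2\|a\|^{2}+2\|b\|^{2}\).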

Finally, since \(\gamma _{k}\rightarrow +\infty \) as \(k\rightarrow + \infty \), we conclude that

$$ \lim_{k\rightarrow +\infty }\frac{\sum_{n=1}^{k}\lambda _{n}^{2} \Vert \nabla \varPsi (x_{n})+v^{*} \Vert ^{2}}{\gamma _{k}}=0. $$

Consequently, if z is any weak sequential cluster point of the sequence \(\{z_{k}\}\), letting \(k\rightarrow +\infty \) on both sides of (37) yields

$$ \bigl\langle v^{*}, z-u\bigr\rangle \leq 0. $$

Then by maximal monotonicity of \(A+N_{C}\), we conclude that \(0\in (A+N_{C})(z)\), that is, \(z\in \mathcal{S}\). □

Concluding remarks

In this paper, we considered a class of forward–backward splitting methods based on Lyapunov distance for variational inequalities and convex minimization problems in a reflexive, strictly convex and smooth Banach space. Weak and strong convergence results have been obtained for the forward–backward splitting method under the key Fenchel conjugate assumption. Finally, we have also obtained a weak convergence result without the Fenchel conjugate assumption.

References

  1. Albert, Y.: Iterative regularization in Banach spaces. Sov. Math. 30, 1–8 (1986)

  2. Albert, Y.: Decomposition theorem in Banach spaces. Fields Inst. Commun. 25, 77–93 (2000)

  3. Attouch, H., Czarnecki, M.O., Peypouquet, J.: Prox-penalization and splitting methods for constrained variational problems. SIAM J. Optim. 21(1), 149–173 (2011)

  4. Attouch, H., Czarnecki, M.O., Peypouquet, J.: Coupling forward–backward with penalty schemes and parallel splitting for constrained variational inequalities. SIAM J. Optim. 21(4), 1251–1274 (2011)

  5. Baillon, J.B., Haddad, G.: Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Israel J. Math. 26, 137–150 (1977)

  6. Bredies, K.: A forward–backward splitting algorithm for the minimization of nonsmooth convex functionals in Banach space. Inverse Probl. 25, 015005 (2008)

  7. Browder, F.E.: Fixed point theorems for nonlinear semicontractive mappings in Banach spaces. Arch. Ration. Mech. Anal. 21, 259–269 (1965)

  8. Cioranescu, I.: Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems. Kluwer Academic Publishers, Dordrecht (1990)

  9. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward–backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)

  10. Gossez, J.P., Dozo, E.L.: Some geometric properties related to the fixed point theory for nonexpansive mappings. Pacific J. Math. 40, 565–573 (1972)

  11. Guan, W.B., Song, W.: The generalized forward–backward splitting method for the minimization of the sum of two functions in Banach spaces. Numer. Funct. Anal. Optim. 36, 867–886 (2015)

  12. Moudafi, A.: On the convergence of the forward–backward algorithm for null-point problems. J. Nonlinear Var. Anal. 2, 263–268 (2018)

  13. Noun, N., Peypouquet, J.: Forward–backward penalty scheme for constrained convex minimization without inf-compactness. J. Optim. Theory Appl. 158, 787–795 (2013)

  14. Peypouquet, J.: Coupling the gradient method with a general exterior penalization scheme for convex minimization. J. Optim. Theory Appl. 153(1), 123–138 (2012)

  15. Yao, Y., Postolache, M., Yao, J.C.: An iterative algorithm for solving the generalized variational inequalities and fixed points problems. Mathematics 7, Article ID 61 (2019)

  16. Yao, Y., Shahzad, N.: Strong convergence of a proximal point algorithm with general errors. Optim. Lett. 6, 621–628 (2012)

  17. Yuan, H.: A splitting algorithm in a uniformly convex and 2-uniformly smooth Banach space. J. Nonlinear Funct. Anal. 2018, Article ID 26 (2018)

  18. Zegeye, H., Shahzad, N., Yao, Y.H.: Minimum-norm solution of variational inequality and fixed point problem in Banach spaces. Optimization 64, 453–471 (2015)

Funding

The work was supported by PhD research startup foundation of Harbin Normal University (No. XKB201804) and the National Natural Sciences Grant (No. 11871182).

Author information

Contributions

All authors equally contributed to this work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Wei-Bo Guan.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Guan, WB., Song, W. The forward–backward splitting methods for variational inequalities and minimization problems in Banach spaces. J Inequal Appl 2019, 89 (2019). https://doi.org/10.1186/s13660-019-2035-5

  • DOI: https://doi.org/10.1186/s13660-019-2035-5

MSC

  • 49M30
  • 46N10
  • 90C25
  • 90C30

Keywords

  • Banach space
  • Forward–backward splitting methods
  • Variational inequalities
  • Maximal monotone operators
  • Duality mapping