
Convergence analysis of a variable metric forward–backward splitting algorithm with applications

Abstract

The forward–backward splitting algorithm is a popular operator-splitting method for solving monotone inclusion problems involving the sum of a maximal monotone operator and an inverse strongly monotone operator. In this paper, we present a new convergence analysis of a variable metric forward–backward splitting algorithm with extended relaxation parameters in real Hilbert spaces. We prove that this algorithm is weakly convergent when certain weak conditions are imposed upon the relaxation parameters. Consequently, we recover the forward–backward splitting algorithm with variable step sizes. As an application, we obtain a variable metric forward–backward splitting algorithm for solving the minimization problem of the sum of two convex functions, where one of them is differentiable with a Lipschitz continuous gradient. Furthermore, we discuss the applications of this algorithm to the variational inequality problem, the constrained convex minimization problem, and the split feasibility problem. Numerical results on the LASSO problem in statistical learning demonstrate the effectiveness of the proposed iterative algorithm.

Introduction

Let H be a real Hilbert space with inner product \(\langle \cdot , \cdot \rangle \) and induced norm \(\| \cdot \|\). The forward–backward splitting algorithm is a classical operator-splitting algorithm, which solves the monotone inclusion problem

$$ \text{find } x\in H \text{ such that } 0\in Ax+Bx, $$
(1.1)

where \(A:H\rightarrow 2^{H}\) is a maximal monotone operator and \(B:H\rightarrow H\) is a β-inverse strongly monotone operator (see Sect. 2 for the precise definition) for some \(\beta > 0\). The forward–backward splitting algorithm, which dates back to the original work of Lions and Mercier [1], has been studied and reported extensively in the literature; see, for example, [2,3,4,5,6]. The emergence of compressive sensing theory and large-scale optimization problems associated with signal and image processing has resulted in the forward–backward splitting algorithm receiving much attention in recent years. A forward–backward splitting algorithm with relaxation and errors in Hilbert spaces was proposed by Combettes [4]. More precisely, let \(x_{0}\in H\), set

$$ x_{k+1}=x_{k}+\lambda _{k} \bigl(J_{\gamma _{k} A} \bigl(x_{k}-\gamma _{k}(Bx_{k}+b _{k}) \bigr)+a_{k}-x_{k} \bigr),\quad k\geq 0, $$
(1.2)

where \(\{\gamma _{k}\}\subset (0,2\beta )\), \(\{\lambda _{k}\}\subset (0,1]\), \(\{a_{k}\}\) and \(\{b_{k}\}\) are absolutely summable sequences in H. In addition, \(J_{\gamma _{k}A}:=(I+ \gamma _{k}A)^{-1}\) denotes the resolvent of operator A with index \(\gamma _{k}>0\). Combettes [4] proved the convergence of the iterative scheme (1.2) when certain conditions are imposed upon the parameters. Jiao and Wang [7] proved the convergence of (1.2) by requiring the parameters \(\{\lambda _{k}\}\) such that \(\{\lambda _{k}\} \subset (0,\frac{4\beta }{2 \beta +\gamma _{k}} )\) when \(b_{k} =0\). It is easy to see that \(\frac{4\beta }{2\beta +\gamma _{k}}\) is strictly larger than one when \(\{\gamma _{k}\}\subset (0,2\beta )\). Further, Combettes and Yamada [8] improved the range of the relaxation parameters \(\{\lambda _{k}\}\) in (1.2) to \((0,\frac{4 \beta -\gamma _{k}}{2\beta })\). After a simple calculation, we know that \(\frac{4\beta -\gamma _{k}}{2\beta } > \frac{4\beta }{2\beta +\gamma _{k}}\). Therefore, the range of \(\{\lambda _{k}\}\) in the work of Combettes and Yamada [8] is larger than that of Jiao and Wang [7].
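For completeness, the comparison of the two upper bounds can be written out as follows. For any \(\gamma _{k}\in (0,2\beta )\),

$$ \frac{4\beta -\gamma _{k}}{2\beta }-\frac{4\beta }{2\beta +\gamma _{k}} =\frac{(4\beta -\gamma _{k})(2\beta +\gamma _{k})-8\beta ^{2}}{2\beta (2\beta +\gamma _{k})} =\frac{\gamma _{k}(2\beta -\gamma _{k})}{2\beta (2\beta +\gamma _{k})}>0, $$

where the numerator is positive precisely because \(0<\gamma _{k}<2\beta \).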

In the case when \(\gamma _{k} = \gamma \) and \(a_{k}=b_{k}=0\), the iterative scheme (1.2) is reduced to the forward–backward splitting algorithm with a constant step size [9]:

$$ x_{k+1}=x_{k}+ \lambda _{k} \bigl(J_{\gamma A}(x_{k}-\gamma Bx_{k})-x_{k} \bigr),\quad k\geq 0, $$
(1.3)

where \(\gamma \in (0,2\beta )\) and \(\{\lambda _{k}\} \subset (0, \frac{4 \beta -\gamma }{2 \beta })\). Bauschke and Combettes [9] obtained the convergence of the iterative algorithm (1.3) by adopting the Krasnosel'skiĭ–Mann (KM) iteration for computing the fixed points of nonexpansive operators. Some recent progress on the KM iteration for solving fixed point problems and split inclusion problems can be found in [10,11,12]. The forward–backward splitting algorithm with constant step size (1.3) is usually considered to be stationary, whereas the forward–backward splitting algorithm with variable step sizes (1.2) is referred to as non-stationary.

It is worth mentioning that, for \(\lambda _{k}=1\), the scheme (1.3) reduces to the classical forward–backward splitting algorithm. More precisely, the iterative sequence \(\{x_{k}\}\) is defined by

$$ x_{k+1}=J_{\gamma A}(x_{k}-\gamma Bx_{k}), \quad k\geq 0. $$
(1.4)

In the context of convex optimization, the forward–backward splitting algorithm is equivalent to the so-called proximal gradient algorithm (PGA) applied to solve the following convex minimization problem:

$$ \min_{x\in H} f(x)+g(x), $$
(1.5)

where \(f:H\rightarrow R\) is convex, differentiable with an L-Lipschitz continuous gradient for some \(L>0\) and \(g:H\rightarrow (-\infty ,+ \infty ]\) is a proper, lower semicontinuous, convex function. The convex optimization problem (1.5) has found widespread application in signal and image processing, for example, [13,14,15,16,17]. As a consequence of [4], Combettes and Wajs [18] employed the forward–backward splitting algorithm (1.2) to solve the minimization problem (1.5). The obtained iterative algorithm is defined as

$$ x_{k+1}=x_{k}+\lambda _{k} \bigl( \operatorname{prox}_{\gamma _{k} g} \bigl(x_{k}-\gamma _{k} \bigl(\nabla f(x_{k})+b_{k} \bigr) \bigr)+a_{k}-x_{k} \bigr), \quad k\geq 0, $$
(1.6)

where \(\{\gamma _{k}\}\subset (0,2/L)\), \(\{\lambda _{k}\}\subset (0,1]\), and \(\{a_{k}\}\), \(\{b_{k}\}\) are absolutely summable sequences in H. \(\operatorname{prox}_{\gamma g}\) denotes the proximity operator of g with index \(\gamma >0\). In addition, Combettes and Wajs [18] presented applications of this algorithm to many concrete convex optimization problems. This iterative algorithm (1.6) was subsequently improved by Combettes and Yamada [8] who extended the range of the relaxation parameters \(\{\lambda _{k}\}\).
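As an illustration only, the error-free case \(a_{k}=b_{k}=0\) of (1.6) can be sketched in a few lines of Python. The toy problem below, with \(f(x)=\frac{1}{2}\|Mx-b\|^{2}\) and \(g(x)=\mu \|x\|_{1}\), the data `M`, `b`, and all function names are assumptions made for this sketch; they are not taken from the works cited above.

```python
# Minimal sketch of iteration (1.6) with a_k = b_k = 0 and constant gamma, lambda.
# Assumed toy problem (hypothetical data): f(x) = 0.5*||Mx - b||^2, g(x) = mu*||x||_1.

def soft_threshold(v, t):
    """Componentwise proximity operator of t*|.| (soft-thresholding)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def grad_f(M, b, x):
    """Gradient of 0.5*||Mx - b||^2, that is, M^T (Mx - b)."""
    r = [sum(mij * xj for mij, xj in zip(row, x)) - bi for row, bi in zip(M, b)]
    return [sum(M[i][j] * r[i] for i in range(len(M))) for j in range(len(x))]

def relaxed_pga(M, b, mu, gamma, lam, iters=500):
    x = [0.0] * len(M[0])
    for _ in range(iters):
        g = grad_f(M, b, x)
        # backward (proximal) step applied to the forward (gradient) step
        p = soft_threshold([xi - gamma * gi for xi, gi in zip(x, g)], gamma * mu)
        # relaxation with parameter lam in (0, 1]
        x = [xi + lam * (pi - xi) for xi, pi in zip(x, p)]
    return x

M = [[1.0, 0.0], [0.0, 2.0]]
b = [3.0, 0.2]
# Here the Lipschitz constant of grad f is L = 4, so gamma must lie in (0, 2/L) = (0, 0.5).
x_hat = relaxed_pga(M, b, mu=1.0, gamma=0.2, lam=1.0)
```

Because `M` is diagonal in this toy instance, the minimizer can be computed by hand as \((2,0)\), which gives a simple way to sanity-check the iteration.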

Inspired by solving large-scale convex optimization problems arising in image processing, machine learning, and economic management, many efficient primal–dual splitting algorithms have been proposed for structured monotone inclusions involving maximal monotone operators and single-valued Lipschitz or inverse strongly monotone operators; see, for example, [19, 20]. Although these monotone inclusions are more complicated than the monotone inclusion problem (1.1), they can be transformed into the form of this problem in a suitable product space. Therefore, it is natural to consider using the forward–backward splitting algorithm (e.g., (1.2) or (1.3)) to solve the equivalent monotone inclusion problem. Because the backward steps cannot be decomposed, direct use of the forward–backward splitting algorithm often fails to yield a completely split algorithm. Many researchers attempted to overcome this difficulty by investigating variable metric operator splitting algorithms. The use of a suitable variable metric enables the implicit step of backward splitting to be easily decomposed. For example, the primal–dual hybrid gradient algorithm [21] (also known as the Chambolle–Pock primal–dual algorithm [22]) is equivalent to the variable metric proximal point algorithm [23, 24]. We refer the readers to a subsequent paper [25] for more details. Vũ [26] proposed a variable metric extension of the forward–backward–forward splitting algorithm [3] for solving the monotone inclusion of the sum of a maximal monotone operator and a monotone Lipschitzian operator in Hilbert spaces. Liang [27] proposed a variable metric multi-step inertial operator-splitting algorithm for solving the monotone inclusion problem (1.1). Bonettini et al. [28] developed a scaled inertial forward–backward splitting algorithm for solving (1.1) in the context of convex minimization. Neither the algorithm of Liang [27] nor that of Bonettini et al. [28] was compatible with the relaxation strategy. The variable metric forward–backward splitting algorithm was originally studied in finite-dimensional Hilbert spaces [2, 29]; however, these studies either required strong monotonicity to study the convergence rate or did not make use of the inverse strongly monotone property of B in (1.1). For infinite-dimensional Hilbert spaces, Combettes and Vũ [30] proposed a variable metric forward–backward splitting algorithm to solve (1.1) and analyzed its weak and strong convergence. This algorithm is defined as follows. Let \(x_{0}\in H \), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k}(Bx_{k}+b_{k}), \\ x_{k+1} = x_{k}+ \lambda _{k}(J_{\gamma _{k}U_{k}A}(y_{k})+a_{k}-x_{k}), \end{cases} $$
(1.7)

where \(\{U_{k}\}\subset \mathcal{P}_{\alpha }(H)\), \(\{\lambda _{k}\} \subset (0,1]\), \(\{\gamma _{k}\}\subset (0,2\beta )\), \(\{a_{k}\}\) and \(\{b_{k}\} \) are absolutely summable sequences in H. This algorithm (1.7) includes a variable metric, variable step sizes, relaxation parameters, and errors. It includes nearly all of the forward–backward type of splitting algorithms mentioned above. For example, by letting \(U_{k}=I \) in (1.7), it is reduced to (1.2). However, its relaxation parameters \(\{\lambda _{k}\}\) are restricted to \((0,1]\), a strictly smaller range than the one obtained for (1.2) by Combettes and Yamada [8]. While preparing this manuscript, we discovered that in Chap. 5 of the dissertation [31], Simões generalized the variable metric forward–backward splitting algorithm by replacing the relaxation parameters \(\{\lambda _{k}\}\) in (1.7) with self-adjoint, strongly positive linear operators. However, this approach still requires the maximum eigenvalue of the operators to be smaller than one.

The purpose of this paper is to introduce a new convergence analysis for the variable metric forward–backward splitting algorithm (1.7) with an extended range of relaxation parameters. We prove the weak convergence of the variable metric forward–backward splitting algorithm in real Hilbert spaces while allowing the relaxation parameters \(\{\lambda _{k}\}\) to be larger than one. To achieve this goal, we make full use of the averaged and firmly nonexpansive properties of the operators \(J_{\gamma _{k} U_{k}A}(I-\gamma _{k} U_{k}B)\) and \(J_{\gamma _{k} U_{k}A}\), where \(\gamma _{k} >0\) and \(U_{k}\in \mathcal{P}_{\alpha }(H)\). In contrast, existing analyses mainly rely on \(J_{\gamma _{k} U_{k} A}\) being firmly nonexpansive. Consequently, we obtain the convergence of the forward–backward splitting algorithm with variable step sizes. Moreover, we impose a slightly weaker condition on the relaxation parameters to ensure the convergence of this algorithm. The results we obtained complement and extend those of Combettes and Yamada [8]. As an application, we obtain the variable metric forward–backward splitting algorithm for solving the minimization problem (1.5). We also present the application of this algorithm to the variational inequality problem, the constrained convex minimization problem, and the split feasibility problem. To the best of our knowledge, the iterative algorithms we obtained are the most general ones for solving these problems. Finally, we conduct numerical experiments on the LASSO problem to validate the effectiveness of the proposed iterative algorithm.

The remainder of this paper is organized as follows. Section 2 reviews selected notations and lemmas on monotone operator theory and presents some technical lemmas. In Sect. 3, we prove the main convergence results of the variable metric forward–backward splitting algorithm with relaxation in real Hilbert spaces. Consequently, we obtain several corollaries of some special cases. Section 4 presents our use of the proposed iterative algorithm to solve three typical optimization problems including the variational inequalities problem, constrained convex minimization problem, and split feasibility problem. In Sect. 5, we present preliminary numerical results on LASSO problem to illustrate the performance of the proposed iterative algorithm. Finally, we provide our conclusions.

Preliminaries

In this section, we recall selected concepts and lemmas that are commonly used in the context of convex analysis and monotone operator theory. Throughout this paper, let H be a real Hilbert space. The inner product and the associated norm of Hilbert space H are denoted by \(\langle \cdot ,\cdot \rangle \) and \(\|\cdot \|\), respectively. I denotes the identity operator, and the symbols \(\rightharpoonup \) and → denote weak and strong convergence, respectively.

Let \(A:H\rightarrow 2^{H}\) be a set-valued operator. We denote its domain, range, graph, and zeros by \(\operatorname{dom} A= \{ x\in H| Ax \neq \emptyset \}\), \(\operatorname{ran} A = \{ u\in H | (\exists x\in H) u\in Ax\}\), \(\operatorname{gra} A = \{ (x,u) \in H\times H | u\in Ax \}\), and \(\operatorname{zer} A = \{x\in H | 0\in Ax\}\), respectively.

Definition 2.1

([9])

Let \(A:H\rightarrow 2^{H}\) be a set-valued operator. A is said to be monotone if

$$ \langle x-y, u-v \rangle \geq 0, \quad \forall (x,u), (y,v)\in \operatorname{gra} A. $$

Moreover, A is said to be maximal monotone if its graph is not strictly contained in the graph of any other monotone operator on H.

A well-known example of a maximal monotone operator is the subgradient mapping of a proper, lower semicontinuous convex function \(f:H \rightarrow (-\infty , +\infty ]\) defined by

$$ \partial f: H\rightarrow 2^{H} : x \mapsto \bigl\{ u\in H | f(y) \geq f(x) + \langle u, y-x\rangle , \forall y\in H \bigr\} . $$

Definition 2.2

([9])

Let \(A:H\rightarrow 2^{H}\) be a maximal monotone operator. The resolvent operator of A with index \(\lambda >0\) is defined as

$$ J_{\lambda A} = (I+\lambda A)^{-1}. $$

According to the Minty theorem, the resolvent operator \(J_{\lambda A}\) is defined everywhere on Hilbert space H, and \(J_{\lambda A}\) is firmly nonexpansive.

Let us recall the definition of the proximity operator, which was first introduced by Moreau [32]. Let \(f\in \varGamma _{0}(H)\), where \(\varGamma _{0}(H)\) denotes the set of all proper lower semicontinuous convex functions \(f:H\rightarrow (-\infty , +\infty ]\). The proximity operator of f with index \(\lambda >0\) is defined by

$$ \operatorname{prox}_{\lambda f} : H\rightarrow H: x \mapsto \arg \min _{y\in H} \biggl\{ \frac{1}{2} \Vert y-x \Vert ^{2} + \lambda f(y) \biggr\} . $$

In fact, the resolvent operator of the subdifferential operator of any \(f\in \varGamma _{0}(H)\) with index \(\lambda >0\) is the proximal operator of f with index \(\lambda >0\), that is,

$$ \operatorname{prox}_{\lambda f} = (I+\lambda \partial f)^{-1}. $$

Indeed, let \(x\in H\) and set \(p = \operatorname{prox}_{\lambda f}(x)\). By Fermat's rule, we have \(0\in \lambda \partial f(p) + p -x \Leftrightarrow x\in \lambda \partial f(p) + p\). Hence \(p = (I + \lambda \partial f)^{-1}(x)\). In other words, \((I + \lambda \partial f)^{-1} = \operatorname{prox}_{\lambda f}\). Therefore, proximity operators inherit the properties of resolvent operators.
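A small numerical illustration may help here. The sketch below assumes the scalar function \(f=|\cdot |\) on the real line, whose proximity operator is the well-known soft-thresholding map, and checks the inclusion \(x-p\in \lambda \partial f(p)\) at a few sample points. The helper names and sample values are hypothetical choices made for this example.

```python
# Sketch: for f = |.| on the real line, prox_{lam f} is soft-thresholding.
# We verify the resolvent characterization x - p in lam * subdifferential(|.|)(p).

def prox_abs(x, lam):
    """prox_{lam*|.|}(x), computed in closed form."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def in_scaled_subdifferential(u, p, lam, tol=1e-12):
    """True if u lies in lam * d|.|(p): lam*sign(p) if p != 0, else [-lam, lam]."""
    if p > 0:
        return abs(u - lam) <= tol
    if p < 0:
        return abs(u + lam) <= tol
    return abs(u) <= lam + tol

lam = 0.5
checks = []
for x in [-2.0, -0.5, -0.1, 0.0, 0.3, 0.5, 1.7]:
    p = prox_abs(x, lam)
    checks.append(in_scaled_subdifferential(x - p, p, lam))
```

Every entry of `checks` should be true, which is the inclusion used in the argument above, evaluated pointwise.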

Definition 2.3

([9])

Let \(B:H\rightarrow H\) be a single-valued operator. Let \(\beta >0\), then B is said to be β-inverse strongly monotone if

$$ \langle x-y, Bx-By \rangle \geq \beta \Vert Bx-By \Vert ^{2},\quad \forall x,y\in H. $$

The β-inverse strongly monotone operator is also known as a β-cocoercive operator. It is easy to see from the above definition that a β-inverse strongly monotone operator is \(\frac{1}{\beta }\)-Lipschitz continuous, i.e., \(\|Bx-By\| \leq \frac{1}{ \beta }\|x-y\|\).
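The Lipschitz estimate follows from the Cauchy–Schwarz inequality. For all \(x,y\in H\),

$$ \beta \Vert Bx-By \Vert ^{2}\leq \langle x-y, Bx-By \rangle \leq \Vert x-y \Vert \cdot \Vert Bx-By \Vert , $$

and dividing both sides by \(\beta \|Bx-By\|\) (when it is nonzero) yields \(\|Bx-By\| \leq \frac{1}{\beta }\|x-y\|\).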

Next, we recall the definitions of nonexpansive and related mappings. These mappings often appear in the convergence analysis of optimization algorithms.

Definition 2.4

([9])

Let C be a nonempty subset of H. Let \(T:C\rightarrow H\), then

  1. (i)

    T is considered to be nonexpansive if

    $$ \Vert Tx-Ty \Vert \leq \Vert x-y \Vert ,\quad \forall x,y\in C. $$
  2. (ii)

    T is considered to be firmly nonexpansive if

    $$ \Vert Tx-Ty \Vert ^{2} \leq \Vert x-y \Vert ^{2} - \bigl\Vert (I-T)x- (I-T)y \bigr\Vert ^{2},\quad \forall x,y\in C. $$
  3. (iii)

    T is referred to as α-averaged, where \(\alpha \in (0,1)\), if there exists a nonexpansive mapping S such that \(T = (1-\alpha )I + \alpha S\).

It follows immediately that a firmly nonexpansive mapping is a nonexpansive mapping and an α-averaged mapping is also nonexpansive.

We denote by \(\operatorname{Fix}(T)\) the set of fixed points of a mapping T, that is, \(\operatorname{Fix}(T) = \{x\in H | x = Tx\}\).

Lemma 2.1

(Demiclosedness principle [9])

Let C be a nonempty subset of H. Let \(T:C\rightarrow H\) be a nonexpansive mapping with \(\operatorname{Fix}(T)\neq \emptyset \). If \(\{x_{k}\}\) is a sequence in C that converges weakly to x and if \(\{(I-T)x_{k}\}\) converges strongly to y, then \((I-T)x=y\); in particular, if \(y=0\), then \(x\in \operatorname{Fix}(T)\).

The following proposition provides some equivalent definitions of the firmly nonexpansive mappings. This proposition can be found in Proposition 4.4 of [9].

Proposition 2.1

([9])

Let C be a nonempty subset of H. Let \(T:C\rightarrow H\), then the following are equivalent:

  1. (i)

    T is firmly nonexpansive;

  2. (ii)

    \(I-T\) is firmly nonexpansive;

  3. (iii)

    \(2T-I\) is nonexpansive;

  4. (iv)

    \(\langle x-y, Tx-Ty \rangle \geq \|Tx-Ty\|^{2}\), \(\forall x,y\in C\).

From Proposition 2.1(iii) and (iv), we know that if T is firmly nonexpansive, then T is \(\frac{1}{2}\)-averaged, and a 1-inverse strongly monotone operator is firmly nonexpansive.

The following proposition is taken from Proposition 4.35 of [9].

Proposition 2.2

Let C be a nonempty subset of H. Let \(T:C\rightarrow H\), then T is α-averaged if and only if

$$ \Vert Tx-Ty \Vert ^{2} \leq \Vert x-y \Vert ^{2} - \frac{1-\alpha }{\alpha } \bigl\Vert (I-T)x-(I-T)y \bigr\Vert ^{2},\quad \forall x,y\in C. $$

The following lemma provides a relation between an operator T with its complement \(I-T\).

Lemma 2.2

([9])

Let C be a nonempty subset of H. Let \(T:C\rightarrow H\), then

  1. (i)

    T is nonexpansive if and only if the complement \(I-T\) is \(\frac{1}{2}\)-inverse strongly monotone;

  2. (ii)

    T is α-averaged if and only if the complement \(I-T\) is \(\frac{1}{2\alpha }\)-inverse strongly monotone.

We refer interested readers to [9] for further properties of nonexpansive, firmly nonexpansive, and α-averaged nonlinear mappings.

We now recall a result on the composition of two averaged operators. The following lemma first appeared in [33] and was later extended to the composition of a finite family of averaged operators [8].

Lemma 2.3

Let C be a nonempty subset of H. Let \(T_{1} : C\rightarrow H\) be \(\alpha _{1}\)-averaged and \(T_{2} :C\rightarrow H\) be \(\alpha _{2}\)-averaged. Then

$$ T := T_{1} T_{2} \textit{ is } \frac{\alpha _{1} +\alpha _{2} - 2\alpha _{1} \alpha _{2}}{1-\alpha _{1} \alpha _{2}} \textit{-averaged}. $$

Remark 2.1

  1. (i)

    It is worth mentioning that two other results on the composition of averaged operators were reported. From Proposition 4.32 of [34], \(T := T_{1} T_{2}\) is \(\overline{\alpha } = \frac{2}{1+\frac{1}{ \max (\alpha _{1},\alpha _{2})}}\)-averaged. From Byrne [35], \(T := T_{1} T_{2}\) is \(\widehat{\alpha } = \alpha _{1} + \alpha _{2} -\alpha _{1} \alpha _{2}\)-averaged. It is not difficult to verify that \(\frac{\alpha _{1} +\alpha _{2} - 2\alpha _{1} \alpha _{2}}{1- \alpha _{1} \alpha _{2}}\) is no larger than the other two constants α̅ and α̂.

  2. (ii)

    The constant α̂ is used in [7] to show the upper bound of the relaxation parameter \(\lambda _{k}\) such that \(\lambda _{k} < \frac{1}{\widehat{\alpha }}\).
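The comparison of the three averagedness constants in Remark 2.1 can be checked numerically. The following sketch evaluates them on a grid of \((\alpha _{1},\alpha _{2})\in (0,1)^{2}\); the function names and the grid are illustrative choices, and the check is a sanity test rather than a proof.

```python
# Sketch: compare the three averagedness constants for T1 T2 on a grid in (0,1)^2.

def constant_lemma_2_3(a1, a2):
    """Constant from Lemma 2.3."""
    return (a1 + a2 - 2.0 * a1 * a2) / (1.0 - a1 * a2)

def constant_max_based(a1, a2):
    """Constant based on max(a1, a2), as quoted in Remark 2.1(i)."""
    return 2.0 / (1.0 + 1.0 / max(a1, a2))

def constant_byrne(a1, a2):
    """Constant a1 + a2 - a1*a2, as quoted in Remark 2.1(i)."""
    return a1 + a2 - a1 * a2

grid = [i / 10.0 for i in range(1, 10)]
# small tolerance because the first two constants coincide when a1 == a2
smallest_everywhere = all(
    constant_lemma_2_3(a1, a2)
    <= min(constant_max_based(a1, a2), constant_byrne(a1, a2)) + 1e-12
    for a1 in grid
    for a2 in grid
)
```

A smaller averagedness constant \(\alpha \) is preferable because it permits a larger relaxation bound \(1/\alpha \). Note that on the diagonal \(\alpha _{1}=\alpha _{2}\) the constant of Lemma 2.3 coincides with α̅, so the inequality is not strict there.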

We employ the notation of [30]. Let \(\mathcal{B}(H,G)\) be the space of bounded linear operators from Hilbert space H to Hilbert space G. The norm of \(L\in \mathcal{B}(H,G)\) is defined as \(\|L\| = \sup_{x\in H, x\neq 0}\frac{\|Lx\|}{\|x\|}\). We set \(\mathcal{B}(H)= \mathcal{B}(H,H)\) and \(S(H)=\{L\in \mathcal{B}(H)| L=L^{*}\}\), where \(L^{*}\) denotes the adjoint of L. The Loewner partial ordering on \(S(H)\) is defined by, for any \(U,V\in S(H)\),

$$ U \succeq V \quad \Leftrightarrow\quad \langle Ux,x\rangle \geq \langle Vx,x \rangle ,\quad \forall x\in H . $$

Let \(\alpha \in [0, +\infty )\), set

$$ \mathcal{P}_{\alpha }(H)= \bigl\{ U\in S(H)|U\succeq \alpha I \bigr\} . $$

We denote by \(\sqrt{U}\) the square root of \(U\in \mathcal{P}_{ \alpha }(H)\). Moreover, for every \(U\in \mathcal{P}_{\alpha }(H)\), we define a semi-scalar product and a semi-norm (a scalar product and a norm if \(\alpha >0\)) by

$$ (\forall x\in H)\ (\forall y\in H) \quad \langle x,y \rangle _{U}= \langle Ux,y\rangle \quad \mbox{and}\quad \Vert x \Vert _{U}=\sqrt{ \langle Ux,x\rangle }. $$

We borrow the following results on monotone operators in a variable metric setting from Combettes and Vũ [30].

Lemma 2.4

([30])

Let \(A:H\rightarrow 2^{H}\) be maximal monotone, let \(\alpha \in (0,+\infty )\), let \(U\in \mathcal{P}_{ \alpha }(H)\), and let \(H_{U^{-1}}\) be the real Hilbert space with the scalar product \(\langle x,y \rangle _{U^{-1}}=\langle U^{-1}x,y\rangle \), \(\forall x,y\in H\). Then the following hold:

  1. (i)

    \(UA:H\rightarrow 2^{H}\) is maximal monotone;

  2. (ii)

    \(J_{UA}:H\rightarrow H\) is 1-inverse strongly monotone, i.e., firmly nonexpansive, on \(H_{U^{-1}}\). More precisely,

    $$ \Vert J_{UA} x-J_{UA} y \Vert ^{2}_{U^{-1}} \leq \Vert x-y \Vert ^{2}_{U^{-1}}- \bigl\Vert (I-J _{UA})x-(I-J_{UA})y \bigr\Vert ^{2}_{U^{-1}}, \quad \forall x,y\in H. $$
    (2.1)
  3. (iii)

    \(J_{UA}=(U^{-1}+A)^{-1}\circ U^{-1}\).

Let \(U\in \mathcal{P}_{\alpha }(H)\) for some \(\alpha >0\). The proximity operator of \(f\in \varGamma _{0}(H)\) relative to the metric induced by U is defined by

$$ \operatorname{prox}_{f}^{U} :H\rightarrow H : x \mapsto \arg \min_{y\in H} \biggl( \frac{1}{2} \Vert x-y \Vert _{U}^{2} + f(y) \biggr). $$

We have \(\operatorname{prox}_{f}^{U} = J_{U^{-1}\partial f}\) and we can write \(\operatorname{prox}_{f}^{I}=\operatorname{prox}_{f}\).
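To make the variable metric proximity operator concrete, consider the following sketch. It assumes \(H=\mathbb{R}^{3}\), a diagonal metric \(U=\operatorname{diag}(u)\), and \(f=c\|\cdot \|_{1}\); none of these choices come from the text. In this setting the minimization in the definition separates by coordinate and reduces to soft-thresholding \(x_{i}\) at level \(c/u_{i}\). Writing \(p=\operatorname{prox}_{f}^{U}(x)\), the optimality condition of the defining problem is \(U(x-p)\in \partial f(p)\), which the code evaluates.

```python
# Sketch: prox of f = c*||.||_1 relative to the metric U = diag(u), u_i > 0.
# Coordinatewise: minimize 0.5*u_i*(x_i - y_i)^2 + c*|y_i|, i.e. soft-threshold at c/u_i.

def prox_l1_diag_metric(x, u, c):
    out = []
    for xi, ui in zip(x, u):
        t = c / ui                      # metric-dependent threshold
        if xi > t:
            out.append(xi - t)
        elif xi < -t:
            out.append(xi + t)
        else:
            out.append(0.0)
    return out

x = [3.0, -0.2, 1.0]                    # hypothetical sample point
u = [1.0, 2.0, 4.0]                     # hypothetical diagonal of U
c = 1.0
p = prox_l1_diag_metric(x, u, c)
# U(x - p) should lie in the subdifferential of c*||.||_1 at p:
# equal to c*sign(p_i) where p_i != 0, and inside [-c, c] where p_i == 0.
residual = [ui * (xi - pi) for xi, pi, ui in zip(x, p, u)]
```

With \(U=I\) (all \(u_{i}=1\)) the thresholds become the single value c, which recovers the usual soft-thresholding operator.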

We make full use of the following lemmas to obtain the weak convergence of the considered iterative sequence. Both lemmas are taken from [36]. In the following, we denote by \(\ell _{+}^{1}(\mathbb{N})\) the set of summable sequences in \([0,+\infty )\), where \(\mathbb{N}\) is the set of nonnegative integers.

Lemma 2.5

([36])

Let \(\alpha \in (0,+\infty )\), and let \(\{W_{k}\} \) be in \(\mathcal{P}_{\alpha }(H)\), let C be a nonempty subset of H, and let \(\{x_{k}\}\) be a sequence in H such that

$$ \Vert x_{k+1}-z \Vert _{W_{k+1}}\leq (1+\eta _{k}) \Vert x_{k}-z \Vert _{W_{k}}+\epsilon _{k}, \quad \forall z\in C, $$
(2.2)

where \(\{\eta _{k}\}\subset \ell _{+}^{1}(\mathbb{N})\) and \(\{\epsilon _{k}\}\subset \ell _{+}^{1}(\mathbb{N})\). Then \(\{x_{k}\}\) is bounded and, for every \(z \in C\), \(( \Vert x_{k}-z \Vert _{W_{k}})\) converges.

Lemma 2.6

([36])

Let \(\alpha \in (0,+\infty )\), and let \(\{W_{k}\} \) and W be in \(\mathcal{P}_{\alpha }(H)\) such that \(W_{k} \rightarrow W\) pointwise as \(k\rightarrow +\infty \), as is the case when

$$ \sup_{k\in \mathbb{N}} \Vert W_{k} \Vert < +\infty \quad \textit{and}\quad \bigl(\exists \{\eta _{k}\}\subset \ell _{+}^{1}( \mathbb{N}) \bigr) \quad (1+\eta _{k})W_{k} \succeq W_{k+1}. $$

Let C be a nonempty subset of H, and let \(\{x_{k}\}\) be a sequence in H such that (2.2) is satisfied. Then \(\{x_{k}\}\) converges weakly to a point in C if and only if every weak sequential cluster point of \(\{x_{k}\}\) is in C.

The following lemma can be found in Corollary 2.15 of Bauschke and Combettes [9].

Lemma 2.7

([9])

Let \(x\in H\), \(y \in H\), and \(\alpha \in R\). Then

$$ \bigl\Vert \alpha x + (1-\alpha )y \bigr\Vert ^{2}= \alpha \Vert x \Vert ^{2}+(1-\alpha ) \Vert y \Vert ^{2}-\alpha (1-\alpha ) \Vert x-y \Vert ^{2}. $$
(2.3)
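The identity \(\|\alpha x+(1-\alpha )y\|^{2}=\alpha \|x\|^{2}+(1-\alpha )\|y\|^{2}-\alpha (1-\alpha )\|x-y\|^{2}\) holds for every real α, including values outside \([0,1]\), which is what makes it usable for relaxation parameters larger than one. A quick numerical check in \(\mathbb{R}^{3}\) is sketched below; the vectors and the values of α are arbitrary choices made for illustration.

```python
# Sketch: numerical check of the identity of Lemma 2.7 in R^3 for several alpha,
# including alpha < 0 and alpha > 1.

def norm_sq(v):
    return sum(vi * vi for vi in v)

def lhs(alpha, x, y):
    return norm_sq([alpha * xi + (1.0 - alpha) * yi for xi, yi in zip(x, y)])

def rhs(alpha, x, y):
    diff = [xi - yi for xi, yi in zip(x, y)]
    return (alpha * norm_sq(x) + (1.0 - alpha) * norm_sq(y)
            - alpha * (1.0 - alpha) * norm_sq(diff))

x = [1.0, -2.0, 0.5]
y = [0.3, 4.0, -1.0]
gaps = [abs(lhs(a, x, y) - rhs(a, x, y)) for a in (-0.5, 0.25, 1.0, 1.8)]
```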

Variable metric forward–backward splitting algorithm

In this section, we study the convergence of the variable metric forward–backward splitting algorithm. First, we prove the following useful lemmas.

Lemma 3.1

Let \(B:H\rightarrow H \) be a β-inverse strongly monotone operator. Let \(\alpha >0\), and let \(U\in \mathcal{P}_{\alpha }(H)\). Let \(H_{U^{-1}}\) be a real Hilbert space with the scalar product \(\langle x,y\rangle _{U^{-1}}=\langle U^{-1}x,y\rangle \), \(\forall x,y \in H\). Then \(I-\gamma UB\) is a \(\frac{\gamma \|U\|}{2\beta }\)-averaged operator on \(H_{U^{-1}}\) for any \(\gamma \in (0,\frac{2\beta }{\|U\|})\).

Proof

Let \(x,y\in H\). Because B is β-inverse strongly monotone, we have

$$\begin{aligned} \langle UBx-UBy,x-y\rangle _{U^{-1}} & = \langle Bx-By,x-y \rangle \\ & \geq \beta \Vert Bx-By \Vert ^{2}. \end{aligned}$$
(3.1)

On the other hand, we obtain

$$ \Vert UBx-UBy \Vert ^{2}_{U^{-1}}\leq \Vert U \Vert \cdot \Vert Bx-By \Vert ^{2}. $$
(3.2)

From (3.1) and (3.2), we obtain

$$ \langle UBx-UBy,x-y\rangle _{U^{-1}}\geq \frac{\beta }{ \Vert U \Vert }\cdot \Vert UBx-UBy \Vert ^{2}_{U^{-1}}, $$
(3.3)

which means that UB is \(\frac{\beta }{\|U\|}\)-inverse strongly monotone on \(H_{U^{-1}}\). Then \(\gamma UB\) is \(\frac{\beta }{\gamma \|U\|}\)-inverse strongly monotone on \(H_{U^{-1}}\). By Lemma 2.2(ii), \(I-\gamma UB\) is a \(\frac{\gamma \|U\|}{2\beta }\)-averaged operator on \(H_{U^{-1}}\). □

Lemma 3.2

Let \(A:H\rightarrow 2^{H}\) be maximal monotone. Let \(\alpha \in (0,+ \infty )\), and let \(U\in \mathcal{P}_{\alpha }(H)\). Let \(H_{U^{-1}}\) be a real Hilbert space with the scalar product \(\langle x,y\rangle _{U ^{-1}}= \langle U^{-1}x,y\rangle \), \(\forall x,y\in H\). Let \(B:H\rightarrow H\) be a β-inverse strongly monotone operator. Then, for any \(\gamma \in (0,\frac{2\beta }{\|U\|})\), the operator \(J_{\gamma UA}(I-\gamma UB)\) is \(\frac{2\beta }{4\beta -\gamma \|U\|}\)-averaged on \(H_{U^{-1}}\).

Proof

Because A is maximal monotone, then for any \(\gamma >0\), \(\gamma UA\) is maximal monotone. According to Lemma 2.4(ii), \(J_{\gamma UA}\) is 1-inverse strongly monotone on \(H_{U^{-1}}\). Then \(J_{\gamma UA}\) is \(\frac{1}{2}\)-averaged. Lemma 3.1 determines that \(I-\gamma UB\) is \(\frac{\gamma \|U\|}{2\beta }\)-averaged. Therefore, we apply Lemma 2.3, from which we know that \(J_{\gamma UA}(I-\gamma UB)\) is

$$ \frac{\alpha _{1}+\alpha _{2}-2\alpha _{1}\alpha _{2}}{1-\alpha _{1}\alpha _{2}}=\frac{\frac{1}{2}+\frac{\gamma \Vert U \Vert }{2\beta } -\frac{\gamma \Vert U \Vert }{2\beta }}{1-\frac{1}{2}\cdot \frac{\gamma \Vert U \Vert }{2\beta }}=\frac{2 \beta }{4\beta -\gamma \Vert U \Vert }, $$
(3.4)

that is, it is averaged with this constant on \(H_{U^{-1}}\). □
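The averagedness constant of Lemma 3.2 can be tested numerically through the characterization in Proposition 2.2, applied in the norm \(\|\cdot \|_{U^{-1}}\). The sketch below is a sanity check under assumptions of our own: \(H=\mathbb{R}^{3}\), \(U=\operatorname{diag}(u)\), \(A=\partial (c\|\cdot \|_{1})\), and \(Bx=Dx-q\) with \(D=\operatorname{diag}(d)\), \(d_{i}>0\), so that B is β-inverse strongly monotone with \(\beta =1/\max_{i} d_{i}\). In this diagonal setting \(J_{\gamma UA}\) acts coordinatewise as soft-thresholding at level \(\gamma c u_{i}\).

```python
import random

# Sketch: test ||Tx - Ty||^2 <= ||x - y||^2 - (1 - alpha)/alpha * ||(I-T)x - (I-T)y||^2
# in the U^{-1}-norm for T = J_{gamma U A}(I - gamma U B). All data are hypothetical.

def soft(v, t):
    """Scalar soft-thresholding."""
    return max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0)

d = [1.0, 2.0, 0.5]                  # B x = D x - q, D = diag(d); beta = 1 / max(d)
q = [0.3, -1.0, 2.0]
u = [0.5, 1.0, 0.8]                  # U = diag(u); ||U|| = max(u)
c = 0.4                              # A = subdifferential of c * ||.||_1
beta = 1.0 / max(d)
norm_U = max(u)
gamma = 0.9 * 2.0 * beta / norm_U    # inside (0, 2*beta/||U||)
alpha = 2.0 * beta / (4.0 * beta - gamma * norm_U)

def T(x):
    y = [xi - gamma * ui * (di * xi - qi) for xi, ui, di, qi in zip(x, u, d, q)]
    return [soft(yi, gamma * c * ui) for yi, ui in zip(y, u)]

def norm_sq_Uinv(v):
    return sum(vi * vi / ui for vi, ui in zip(v, u))

random.seed(0)
worst = float("-inf")                # largest observed value of lhs - rhs
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in u]
    y = [random.uniform(-5, 5) for _ in u]
    Tx, Ty = T(x), T(y)
    lhs = norm_sq_Uinv([a - b for a, b in zip(Tx, Ty)])
    res = [(a - ta) - (b - tb) for a, ta, b, tb in zip(x, Tx, y, Ty)]
    rhs = (norm_sq_Uinv([a - b for a, b in zip(x, y)])
           - (1.0 - alpha) / alpha * norm_sq_Uinv(res))
    worst = max(worst, lhs - rhs)
```

If the lemma applies, `worst` should be nonpositive up to rounding. With these data `alpha` is about 0.909, so \(1/\alpha =1.1\) exceeds one.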

Lemma 3.3

Let H be a real Hilbert space. Let \(A:H \rightarrow 2^{H}\) be a maximal monotone operator. Let \(B:H \rightarrow H\) be a β-inverse strongly monotone operator for some \(\beta > 0\). Suppose that \(\varOmega :=\operatorname{zer}(A+B)\neq \emptyset \). Let \(\gamma _{k} >0\), \(\alpha > 0\), and \(\{U_{k}\}\subset \mathcal{P}_{\alpha }(H)\). Then the following are equivalent:

  1. (i)

    \(x^{*} \in \operatorname{zer}(A+B)\).

  2. (ii)

    \(x^{*} = J_{\gamma _{k}U_{k}A}(I - \gamma _{k}U_{k}B)(x^{*})\) for any \(\gamma _{k}> 0\).

  3. (iii)

    \(x^{*} = (\frac{U_{k}^{-1}+\gamma _{k} A}{\alpha })^{-1} \circ (\frac{U_{k}^{-1}-\gamma _{k} B}{\alpha })x^{*}\).

Proof

(i) \(\Leftrightarrow \) (ii) Let \(x^{*} \in \operatorname{zer}(A+B)\), then we have

$$\begin{aligned}& 0\in \gamma _{k}Ax^{*}+ \gamma _{k}Bx^{*} \\& \quad \Leftrightarrow\quad 0\in \gamma _{k}U_{k}Ax^{*}+ \gamma _{k}U_{k}Bx^{*} \\& \quad \Leftrightarrow\quad x^{*}- \gamma _{k}U_{k}Bx^{*} \in x^{*}+ \gamma _{k}U _{k}Ax^{*} \\& \quad \Leftrightarrow\quad x^{*}=(I+\gamma _{k}U_{k}A)^{-1} \bigl(x^{*}-\gamma _{k}U _{k}Bx^{*} \bigr) \\& \quad \Leftrightarrow\quad x^{*} = J_{\gamma _{k}U_{k}A}(I - \gamma _{k}U_{k}B) \bigl(x ^{*} \bigr). \end{aligned}$$

(ii) \(\Leftrightarrow \) (iii) Let \(x^{*}= J_{\gamma _{k}U_{k}A}(I-\gamma _{k}U_{k}B)x^{*}\), then

$$\begin{aligned}& x^{*}-\gamma _{k}U_{k}Bx^{*}\in x^{*}+\gamma _{k}U_{k}Ax^{*} \\& \quad \Leftrightarrow \quad U_{k}^{-1}x^{*}-\gamma _{k}Bx^{*}\in U_{k}^{-1}x^{*}+ \gamma _{k}Ax^{*} \\& \quad \Leftrightarrow \quad \biggl(\frac{U_{k}^{-1}-\gamma _{k}B}{\alpha } \biggr)x^{*} \in \biggl( \frac{U_{k}^{-1}+\gamma _{k}A}{\alpha } \biggr)x^{*} \\& \quad \Leftrightarrow \quad x^{*} = \biggl(\frac{U_{k}^{-1}+\gamma _{k} A}{\alpha } \biggr)^{-1} \circ \biggl(\frac{U_{k}^{-1}-\gamma _{k} B}{\alpha } \biggr)x^{*}. \end{aligned}$$

 □

Lemma 3.4

Let H be a real Hilbert space. Let \(A:H \rightarrow 2^{H}\) be a maximal monotone operator. Let \(B:H \rightarrow H\) be a β-inverse strongly monotone operator for some \(\beta > 0\). Let \(r>0\) and \(s>0\), and let \(U, V\in \mathcal{P}_{\alpha }(H)\). Define a variable metric forward–backward operator \(T_{r U} := J_{rUA}(I-rUB)\). Then, for any \(x\in H\), we have

$$ \Vert T_{rU}x - T_{sV}x \Vert \leq \frac{1}{\lambda _{\mathrm{min}}(U^{-1})} \biggl\Vert \biggl(U^{-1} - \frac{r}{s}V^{-1} \biggr) (x- T_{sV}x) \biggr\Vert , $$

where \(\lambda _{\mathrm{min}}(U^{-1})\) represents the minimum eigenvalue of \(U^{-1}\).

Proof

Let \(x\in H\), in which case we have

$$\begin{aligned}& \frac{U^{-1}x-U^{-1}T_{rU}x}{r} - Bx \in A T_{rU}x, \\& \frac{V^{-1}x-V^{-1}T_{sV}x}{s} - Bx \in A T_{sV}x. \end{aligned}$$

It follows from the monotonicity of operator A that

$$ \biggl\langle T_{rU}x - T_{sV}x, \frac{U^{-1}x-U^{-1}T_{rU}x}{r} - \frac{V ^{-1}x-V^{-1}T_{sV}x}{s} \biggr\rangle \geq 0. $$

Then

$$ \Vert T_{rU}x - T_{sV}x \Vert _{U^{-1}}^{2} \leq r \biggl\langle T_{rU}x - T _{sV}x, \biggl( \frac{U^{-1}}{r} - \frac{V^{-1}}{s} \biggr) (x-T_{sV}x) \biggr\rangle . $$

Because of the Cauchy–Schwarz inequality and the fact that \(\lambda _{\mathrm{min}}(U^{-1})\|x\|^{2} \leq \|x\|_{U^{-1}}^{2}\), for any \(x\in H\), we obtain

$$ \Vert T_{rU}x - T_{sV}x \Vert \leq \frac{1}{\lambda _{\mathrm{min}}(U^{-1})} \biggl\Vert \biggl(U^{-1} - \frac{r}{s}V^{-1} \biggr) (x- T_{sV}x) \biggr\Vert . $$

 □

We are ready to state our main theorems and present their convergence analysis.

Theorem 3.1

Let H be a real Hilbert space. Let \(A:H \rightarrow 2^{H}\) be maximal monotone. Let \(B:H\rightarrow H\) be β-inverse strongly monotone for some \(\beta > 0\). Suppose that \(\varOmega :=\operatorname{zer}(A+B)\neq \emptyset \). Let \(\alpha > 0\), \(\{\eta _{k}\}\in \ell _{+}^{1}(\mathbb{N})\), and \(\{U_{k}\}\subset \mathcal{P}_{\alpha }(H)\) such that

$$ \mu = \sup_{k\in \mathbb{N}} \Vert U_{k} \Vert < +\infty \quad \textit{and}\quad ( 1+ \eta _{k}) U_{k+1} \succeq U_{k}, \quad \forall k \in \mathbb{N}. $$
(3.5)

Let \(\{\gamma _{k}\} \subset (0, \frac{2\beta }{\|U_{k}\|})\) and \(\{\lambda _{k}\} \subset (0,\frac{1}{\alpha _{k}})\), where \(\alpha _{k}=\frac{2 \beta }{4 \beta -\gamma _{k} \|U_{k}\|}\). Let \(\{a_{k}\}\) and \(\{b_{k}\}\) be two sequences in H such that \(\sum_{k=0}^{+\infty }\lambda _{k} \|a_{k}\|<+\infty \) and \(\sum_{k=0} ^{+\infty }\lambda _{k} \|b_{k}\|<+\infty \). Let \(x_{0}\in H\), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k}(Bx_{k}+b_{k}), \\ x_{k+1} = x_{k}+ \lambda _{k} (J_{\gamma _{k}U_{k}A}(y_{k})+a_{k}-x_{k} ). \end{cases} $$
(3.6)

Then we have:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty } \|x_{k}-x^{*}\|_{U_{k}^{-1}}\) exists.

Suppose that \(0< \underline{\lambda }\leq \lambda _{k} \leq \frac{1}{ \alpha _{k}}-\tau \), where \(\tau \in (0,\frac{1}{\alpha _{k}}-\underline{ \lambda })\), then

  1. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - J_{\gamma _{k} U _{k} A}(x_{k} - \gamma _{k} U_{k} Bx_{k}) \| =0 \).

Suppose, in addition, that \(0< \underline{\gamma}\leq \gamma _{k}\). Then

  1. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω.

Further, suppose that \(\gamma _{k} \leq \frac{2\beta -\epsilon }{ \mu }\), where \(\epsilon \in (0,2\beta -\mu \underline{\gamma})\). Then

  1. (iv)

    \(Bx_{k} \rightarrow Bx^{*}\) as \(k\rightarrow +\infty \), where \(x^{*}\in \varOmega \).

Proof

According to condition (3.5), we have

$$ \bigl\Vert U_{k}^{-1} \bigr\Vert \leq \frac{1}{\alpha },\quad U_{k}^{-1}\in \mathcal{P} _{\frac{1}{\mu }}(H),\quad \text{and}\quad (1+\eta _{k})U_{k}^{-1} \succeq U_{k+1}^{-1}. $$
(3.7)

Hence,

$$ ( 1+ \eta _{k}) \Vert x \Vert _{U_{k}^{-1}}^{2} \geq \Vert x \Vert _{U_{k+1}^{-1}}^{2},\quad \forall x\in H. $$
(3.8)
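The passage from (3.5) to (3.7) and (3.8) relies on the fact that inversion reverses the Loewner order on positive definite operators. As a quick numerical sanity check (a sketch only; the 2×2 matrices, the vector, and the value of η below are arbitrary illustrative choices, not taken from the paper), this can be verified in plain Python:

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def diff(s, m, n):
    """Return s*m - n entrywise."""
    return [[s * m[i][j] - n[i][j] for j in range(2)] for i in range(2)]


def is_psd2(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff its diagonal
    entries and its determinant are nonnegative."""
    (a, b), (_, d) = m
    return a >= -tol and d >= -tol and a * d - b * b >= -tol


def sq_norm(m, x):
    """Squared weighted norm <x, m x>."""
    return sum(x[i] * m[i][j] * x[j] for i in range(2) for j in range(2))


eta = 0.1                              # illustrative value of eta_k
U_k = [[2.0, 0.5], [0.5, 1.0]]         # illustrative U_k
U_next = [[1.9, 0.45], [0.45, 1.0]]    # illustrative U_{k+1}

# Hypothesis of (3.5): (1 + eta) * U_{k+1} - U_k is positive semidefinite.
hypothesis_holds = is_psd2(diff(1 + eta, U_next, U_k))
# Conclusion in (3.7): (1 + eta) * U_k^{-1} - U_{k+1}^{-1} is positive semidefinite.
conclusion_holds = is_psd2(diff(1 + eta, inv2(U_k), inv2(U_next)))

# Inequality (3.8) for one test vector.
x = [1.0, -2.0]
lhs = (1 + eta) * sq_norm(inv2(U_k), x)
rhs = sq_norm(inv2(U_next), x)
print(hypothesis_holds, conclusion_holds, lhs >= rhs)
```

The check only illustrates the implication on one instance; the general statement is the order-reversing property of the operator inverse.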

For the sake of convenience, let

$$ \overline{x}_{k+1} = x_{k}+\lambda _{k} \bigl(J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k})-x_{k} \bigr). $$
(3.9)

Then iterative scheme (3.6) can be rewritten as

$$ x_{k+1} = \overline{x}_{k+1} + \lambda _{k} e_{k}, $$
(3.10)

where \(e_{k} = J_{\gamma _{k}U_{k}A}(y_{k}) - J_{\gamma _{k}U_{k}A}(x _{k}-\gamma _{k}U_{k}Bx_{k}) + a_{k}\) such that \(\sum_{k=0}^{+\infty } \lambda _{k} \|e_{k}\|<+\infty \). In fact, because \(J_{\gamma _{k} U _{k} A}\) is nonexpansive on \(H_{U_{k}^{-1}}\), we have

$$\begin{aligned} \lambda _{k} \Vert e_{k} \Vert & \leq \sqrt{\mu } \lambda _{k} \Vert e_{k} \Vert _{U _{k}^{-1}} \\ & \leq \sqrt{\mu } \lambda _{k} \bigl\Vert y_{k} - (x_{k}-\gamma _{k}U_{k}Bx _{k}) \bigr\Vert _{U_{k}^{-1}} + \sqrt{\mu } \lambda _{k} \Vert a_{k} \Vert _{U_{k} ^{-1}} \\ & \leq \mu \gamma _{k} \lambda _{k} \Vert b_{k} \Vert + \sqrt{\frac{1}{ \alpha }} \lambda _{k} \Vert a_{k} \Vert \\ & \leq \mu \frac{2\beta }{\alpha } \lambda _{k} \Vert b_{k} \Vert + \sqrt{\frac{1}{ \alpha }} \lambda _{k} \Vert a_{k} \Vert . \end{aligned}$$
(3.11)

Because \(\sum_{k=0}^{+\infty }\lambda _{k} \|a_{k}\|<+\infty \) and \(\sum_{k=0}^{+\infty }\lambda _{k} \|b_{k}\|<+\infty \), inequality (3.11) implies that \(\sum_{k=0}^{+\infty }\lambda _{k} \|e_{k}\|<+\infty \).

From Lemma 3.2, we know that \(J_{\gamma _{k}U_{k}A}(I- \gamma _{k}U_{k}B)\) is \(\frac{2\beta }{4\beta - \gamma _{k} \|U_{k}\|}\)-averaged. Let \(\alpha _{k} = \frac{2\beta }{4\beta - \gamma _{k} \|U_{k}\|}\), then there exist nonexpansive mappings \(R_{k}\) such that \(J_{\gamma _{k}U_{k}A}(I-\gamma _{k}U_{k}B) = (1-\alpha _{k})I + \alpha _{k} R_{k}\). Consequently, the iterative sequence \(\{ \overline{x}_{k+1}\}\) in (3.9) is equivalent to

$$\begin{aligned} \overline{x}_{k+1} & = (1-\lambda _{k})x_{k} + \lambda _{k} \bigl( (1-\alpha _{k})x_{k} + \alpha _{k} R_{k} x_{k} \bigr) \\ & = (1-\lambda _{k} \alpha _{k})x_{k} + \lambda _{k} \alpha _{k} R_{k} x _{k}. \end{aligned}$$
(3.12)

(i) Let \(x^{*}\in \operatorname{zer}(A+B)\). According to Lemma 3.3, \(x^{*} = J_{\gamma _{k}U_{k}A}(I - \gamma _{k}U_{k}B)(x^{*})\), and hence \(x^{*} = R_{k} x^{*}\). From (3.8), (3.10), and (3.12), we obtain

$$\begin{aligned} \bigl\Vert x_{k+1} - x^{*} \bigr\Vert _{U_{k+1}^{-1}} & \leq \sqrt{(1+\eta _{k})} \bigl\Vert x _{k+1} -x^{*} \bigr\Vert _{U_{k}^{-1}} \\ & \leq \sqrt{(1+\eta _{k})} \bigl( \bigl\Vert \overline{x}_{k+1} - x^{*} \bigr\Vert _{U _{k}^{-1}} + \lambda _{k} \Vert e_{k} \Vert _{U_{k}^{-1}} \bigr) \\ & \leq \sqrt{(1+\eta _{k})} \bigl\Vert (1-\lambda _{k} \alpha _{k} ) \bigl(x_{k} - x ^{*} \bigr) + \lambda _{k} \alpha _{k} \bigl(R_{k} x_{k} - x^{*} \bigr) \bigr\Vert _{U_{k}^{-1}} \\ &\quad {} + \sqrt{(1+\eta _{k})} \sqrt{\frac{1}{\alpha }}\lambda _{k} \Vert e_{k} \Vert \\ & \leq (1+\eta _{k}) \bigl\Vert x_{k} - x^{*} \bigr\Vert _{U_{k}^{-1}} + \epsilon _{k}, \end{aligned}$$
(3.13)

where \(\epsilon _{k} = \sqrt{(1+\eta _{k})}\sqrt{\frac{1}{\alpha }} \lambda _{k} \|e_{k}\|\). Because \(\sum_{k=0}^{+\infty }\lambda _{k} \|e_{k}\| < +\infty \) and \(\sum_{k=0}^{+\infty }\eta _{k} < +\infty \), we have \(\sum_{k=0}^{+\infty }\epsilon _{k}<+\infty \). On the basis of Lemma 2.5, we conclude that \(\lim_{k\rightarrow +\infty }\|x_{k} - x^{*}\|_{U_{k}^{-1}}\) exists. Moreover, \(\{\|x_{k} - x^{*}\|\}\) is bounded. Let \(M>0\) be such that \(\sup_{k\geq 0}\|x_{k} - x^{*}\| \leq M\).

(ii) Using the inequality \(\|x+y\|^{2} \leq \|x\|^{2} + 2 \langle y, x+y\rangle \), which holds for all \(x,y\in H\), we obtain

$$\begin{aligned} \bigl\Vert x_{k+1} - x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} & \leq (1+\eta _{k}) \bigl\Vert x_{k+1} -x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\ & = (1+\eta _{k}) \bigl\Vert \overline{x}_{k+1} -x^{*} + \lambda _{k} e_{k} \bigr\Vert _{U_{k}^{-1}}^{2} \\ & \leq (1+\eta _{k}) \bigl( \bigl\Vert \overline{x}_{k+1} -x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} + 2\lambda _{k} \bigl\langle e_{k}, x_{k+1}-x^{*} \bigr\rangle _{U_{k}^{-1}} \bigr) \\ & \leq (1+\eta _{k}) \bigl\Vert \overline{x}_{k+1} -x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} + 2(1+\eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \lambda _{k} \Vert e_{k} \Vert . \end{aligned}$$
(3.14)

From Lemma 2.7 and (3.9) we derive that

$$\begin{aligned} \bigl\Vert \overline{x}_{k+1}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} &= \bigl\Vert (1- \lambda _{k}) \bigl(x_{k}-x^{*} \bigr)+\lambda _{k} \bigl(J_{\gamma _{k}U_{k}A}(x_{k} - \gamma _{k}U_{k}Bx_{k})-x^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &=(1-\lambda _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} + \lambda _{k} \bigl\Vert \bigl(J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k})-x ^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad {} -\lambda _{k}(1-\lambda _{k}) \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x _{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2}. \end{aligned}$$
(3.15)

Because \(J_{\gamma _{k}U_{k}A}(I-\gamma _{k}U_{k}B)\) is \(\alpha _{k}\)-averaged, it follows from Proposition 2.2 that

$$\begin{aligned}& \bigl\Vert J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k})-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\& \quad \leq \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2}- \frac{1- \alpha _{k}}{\alpha _{k}} \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2}. \end{aligned}$$
(3.16)

Substituting (3.16) into (3.15) yields

$$ \bigl\Vert \bar{x}_{k+1}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \leq \bigl\Vert x _{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2}-\lambda _{k} \biggl( \frac{1}{\alpha _{k}}- \lambda _{k} \biggr) \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx _{k}) \bigr\Vert _{U_{k}^{-1}}^{2}. $$
(3.17)
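For the reader's convenience, the coefficient in (3.17) can be traced as follows (a routine expansion written out as a reading aid; the shorthand \(d_{k}\) is introduced only for this display). Let \(d_{k} = \Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \Vert _{U_{k}^{-1}}^{2}\). Because \(\lambda _{k} > 0\), inserting (3.16) into (3.15) gives

$$\begin{aligned} \bigl\Vert \overline{x}_{k+1}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} & \leq (1-\lambda _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} + \lambda _{k} \biggl( \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - \frac{1-\alpha _{k}}{\alpha _{k}} d_{k} \biggr) - \lambda _{k}(1-\lambda _{k}) d_{k} \\ & = \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - \lambda _{k} \biggl( \frac{1}{\alpha _{k}} - 1 + 1 - \lambda _{k} \biggr) d_{k} \\ & = \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - \lambda _{k} \biggl( \frac{1}{\alpha _{k}} - \lambda _{k} \biggr) d_{k}. \end{aligned}$$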

Combining (3.17) with (3.14), we obtain

$$\begin{aligned} \bigl\Vert x_{k+1} - x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} \leq& (1+\eta _{k}) \bigl\Vert x _{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} + 2(1+ \eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \lambda _{k} \Vert e_{k} \Vert \\ &{} - (1+\eta _{k})\lambda _{k} \biggl( \frac{1}{\alpha _{k}}- \lambda _{k} \biggr) \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2}, \end{aligned}$$
(3.18)

which implies that

$$\begin{aligned} & \lambda _{k} \biggl(\frac{1}{\alpha _{k}}-\lambda _{k} \biggr) \bigl\Vert x_{k}- J _{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}} ^{2} \\ &\quad \leq (1+\eta _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2}- \bigl\Vert x_{k+1}-x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} + 2(1+\eta _{k})M \bigl\Vert U _{k}^{-1} \bigr\Vert \lambda _{k} \Vert e_{k} \Vert . \end{aligned}$$
(3.19)

Observe that \(\lim_{k\rightarrow +\infty } \Vert x_{k}-x^{*} \Vert _{U_{k}^{-1}}\) exists and \(\sum_{k=0}^{+\infty }\lambda _{k} \|e_{k}\| < +\infty \). Then by letting \(k\rightarrow +\infty \) in the above inequality and considering the condition on \(\{\lambda _{k}\}\), we obtain

$$ \lim_{k\to +\infty } \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}=0. $$
(3.20)

Because the norms \(\|\cdot \|_{U_{k}^{-1}}\) and \(\|\cdot \|\) on the Hilbert space H are equivalent, uniformly in k by (3.7), it follows from (3.20) that

$$ \lim_{k\to +\infty } \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert =0. $$
(3.21)

(iii) In this part, we prove that the sequence \(\{x_{k}\}\) converges weakly to a point in Ω. In fact, let \(\bar{x}\) be a weak sequential cluster point of \(\{x_{k}\}\). Then there exists a subsequence \(\{x_{k_{n}}\}\subset \{x_{k}\}\) such that \(x_{k_{n}} \rightharpoonup \bar{x} \). Because \(\{\gamma _{k}\} \subset (\underline{ \gamma },\frac{2\beta }{\|U_{k}\|}) \subset (\underline{\gamma},\frac{2 \beta }{\alpha })\) is bounded, there exists a subsequence of \(\{\gamma _{k}\}\) converging to \(\gamma \in (\underline{\gamma},\frac{2 \beta }{\alpha })\). Without loss of generality, we may assume that \(\gamma _{k_{n}}\rightarrow \gamma \). According to condition (3.5), it follows from Lemma 2.6 that there exists \(U^{-1}\in \mathcal{P}_{\frac{1}{\mu }}(H)\) such that \(U_{k}^{-1}\rightarrow U^{-1}\) pointwise.

With the help of Lemma 3.4, we make the following estimation:

$$\begin{aligned} & \bigl\Vert x_{k_{n}} - J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \bigr\Vert \\ &\quad \leq \bigl\Vert x_{k_{n}} - J_{\gamma _{k_{n}} U_{k_{n}}A}(x_{k_{n}} - \gamma _{k_{n}} U_{k_{n}} Bx_{k_{n}} ) \bigr\Vert \\ &\qquad {} + \bigl\Vert J_{\gamma _{k_{n}} U_{k_{n}}A}(x_{k_{n}} - \gamma _{k_{n}} U_{k_{n}} Bx_{k_{n}} ) - J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \bigr\Vert \\ &\quad \leq \bigl\Vert x_{k_{n}} - J_{\gamma _{k_{n}} U_{k_{n}}A}(x_{k_{n}} - \gamma _{k_{n}} U_{k_{n}} Bx_{k_{n}} ) \bigr\Vert \\ &\qquad {} + \frac{1}{\lambda _{\mathrm{min}}(U_{k_{n}}^{-1})} \biggl\Vert \biggl( U_{k_{n}} ^{-1} - \frac{\gamma _{k_{n}}}{\gamma }U^{-1} \biggr) \bigl(x_{k_{n}}-J_{ \gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \bigr) \biggr\Vert \\ &\quad \leq \bigl\Vert x_{k_{n}} - J_{\gamma _{k_{n}} U_{k_{n}}A}(x_{k_{n}} - \gamma _{k_{n}} U_{k_{n}} Bx_{k_{n}} ) \bigr\Vert \\ &\qquad {} + \frac{\mu }{\gamma } \bigl\Vert \bigl(U_{k_{n}}^{-1} \gamma - U_{k_{n}} ^{-1}\gamma _{k_{n}} \bigr) \bigl(x_{k_{n}}-J_{\gamma UA}(x_{k_{n}} - \gamma U Bx _{k_{n}} ) \bigr) \bigr\Vert \\ &\qquad {} + \frac{\mu }{\gamma } \bigl\Vert \bigl( U_{k_{n}}^{-1} \gamma _{k_{n}} - U ^{-1}\gamma _{k_{n}} \bigr) \bigl(x_{k_{n}}-J_{\gamma UA}(x_{k_{n}} - \gamma U Bx _{k_{n}} ) \bigr) \bigr\Vert \\ &\quad \leq \bigl\Vert x_{k_{n}} - J_{\gamma _{k_{n}} U_{k_{n}}A}(x_{k_{n}} - \gamma _{k_{n}} U_{k_{n}} Bx_{k_{n}} ) \bigr\Vert \\ &\qquad {} + \frac{\mu }{\gamma \alpha } \vert \gamma - \gamma _{k_{n}} \vert \bigl\Vert x _{k_{n}}-J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \bigr\Vert \\ &\qquad {} + \frac{\mu }{\gamma } \frac{2\beta }{\alpha } \bigl\Vert \bigl( U_{k_{n}} ^{-1} - U^{-1} \bigr) \bigl(x_{k_{n}}-J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k _{n}} ) \bigr) \bigr\Vert . \end{aligned}$$
(3.22)

Because \(\{\|x_{k_{n}}-J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \|\}\) is bounded, \(\gamma _{k_{n}}\rightarrow \gamma \), and \(U_{k_{n}}^{-1}\rightarrow U^{-1}\) pointwise, we conclude from (3.21) and (3.22) that

$$ \bigl\Vert x_{k_{n}} - J_{\gamma UA}(x_{k_{n}} - \gamma U Bx_{k_{n}} ) \bigr\Vert \rightarrow 0\quad \text{as } k_{n} \rightarrow +\infty . $$
(3.23)

As \(J_{\gamma UA}(I - \gamma U B )\) is nonexpansive, the demiclosedness property of nonexpansive mappings yields \(\bar{x} = J_{\gamma UA}(\bar{x} - \gamma U B\bar{x} )\), which means that \(\bar{x} \in \operatorname{zer} (A+B)\). Because \(\bar{x}\) is an arbitrary weak sequential cluster point of \(\{x_{k}\}\), together with conclusion (i), we can conclude from Lemma 2.6 that \(\{x_{k}\}\) converges weakly to a point in \(\operatorname{zer}(A+B)\).

(iv) Because \(J_{\gamma _{k}U_{k}A}\) is firmly nonexpansive on \(H_{U_{k}^{-1}}\) and \(x^{*} = J_{\gamma _{k}U_{k}A}(x^{*}-\gamma _{k}U_{k}Bx^{*})\), we have

$$\begin{aligned} & \bigl\Vert J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k})-x ^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad \leq \bigl\Vert x_{k}-\gamma _{k}U_{k}Bx_{k}- \bigl(x^{*}-\gamma _{k}U_{k}Bx ^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\qquad {} - \bigl\Vert (I-J_{\gamma _{k}U_{k}A}) (x_{k}-\gamma _{k}U_{k}Bx _{k})-(I-J_{\gamma _{k}U_{k}A}) \bigl(x^{*}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad = \bigl\Vert x_{k}-x^{*}- \bigl(\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\qquad {} - \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx _{k})- \bigl(\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k} ^{-1}}^{2} \\ &\quad = \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2}-2 \bigl\langle x_{k}-x ^{*}, \gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr\rangle _{U_{k}^{-1}} \\ &\qquad {} + \bigl\Vert \gamma _{k}U_{k}Bx_{k}- \gamma _{k}U_{k}Bx^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\qquad {} - \bigl\Vert x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx _{k})- \bigl(\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k} ^{-1}}^{2}. \end{aligned}$$
(3.24)

Because B is β-inverse strongly monotone, we have that

$$ \bigl\langle x_{k}-x^{*},\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr\rangle _{U_{k}^{-1}}\geq \gamma _{k}\beta \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2}. $$
(3.25)

In addition, we have

$$\begin{aligned} \bigl\Vert \gamma _{k}U_{k}Bx_{k}- \gamma _{k}U_{k}Bx^{*} \bigr\Vert _{U_{k} ^{-1}}^{2} &\leq \gamma _{k}^{2} \Vert U_{k} \Vert \bigl\Vert Bx_{k}-Bx ^{*} \bigr\Vert ^{2} \\ &\leq \mu \gamma _{k}^{2} \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2}. \end{aligned}$$
(3.26)

Substituting (3.25) and (3.26) into (3.24), we obtain

$$\begin{aligned} & \bigl\Vert J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k})-x ^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad \leq \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} -\gamma _{k}(2 \beta -\gamma _{k}\mu ) \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2} \\ &\qquad {} - \bigl\Vert \bigl( x_{k}-J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k}) \bigr)- \bigl( \gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k}^{-1}} ^{2}. \end{aligned}$$
(3.27)

The combination of (3.27) with (3.15) yields

$$\begin{aligned} & \bigl\Vert \overline{x}_{k+1}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad \leq \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} -\lambda _{k} \gamma _{k}(2 \beta -\gamma _{k}\mu ) \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2} \\ &\qquad {} -\lambda _{k} \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k})- \bigl(\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\qquad {} -\lambda _{k}(1-\lambda _{k}) \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x _{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2}. \end{aligned}$$
(3.28)

Further, on the basis of (3.28) and (3.14), we obtain

$$\begin{aligned} & \bigl\Vert x_{k+1}-x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} \\ &\quad \leq (1+\eta _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - (1+ \eta _{k})\lambda _{k}\gamma _{k}(2\beta -\gamma _{k}\mu ) \bigl\Vert Bx_{k}-Bx ^{*} \bigr\Vert ^{2} \\ &\qquad {} - (1+\eta _{k}) \lambda _{k} \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x _{k}-\gamma _{k}U_{k}Bx_{k})- \bigl(\gamma _{k}U_{k}Bx_{k}-\gamma _{k}U_{k}Bx ^{*} \bigr) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\qquad {} - (1+\eta _{k}) \lambda _{k}(1-\lambda _{k}) \bigl\Vert x_{k}- J _{\gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}} ^{2} \\ &\qquad {} + 2\lambda _{k}(1+\eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \Vert e_{k} \Vert , \end{aligned}$$
(3.29)

which implies that

$$\begin{aligned} & \lambda _{k}\gamma _{k}(2\beta -\gamma _{k} \mu ) \bigl\Vert Bx_{k}-Bx ^{*} \bigr\Vert ^{2} \\ &\quad \leq (1+\eta _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - \bigl\Vert x_{k+1}-x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} \\ &\qquad {} -(1+\eta _{k}) \lambda _{k}(1-\lambda _{k}) \bigl\Vert x_{k}- J_{ \gamma _{k}U_{k}A}(x_{k}- \gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}} ^{2} \\ &\qquad {} + 2\lambda _{k}(1+\eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \Vert e_{k} \Vert . \end{aligned}$$
(3.30)

By the conditions on \(\{\gamma _{k}\}\) and \(\{\lambda _{k}\}\), and together with conclusions (i), (ii) and the fact that \(\sum_{k=0}^{+ \infty }\lambda _{k} \|e_{k}\|<+\infty \), letting \(k\rightarrow + \infty \) in the above inequality, we obtain

$$ Bx_{k}\rightarrow Bx^{*}\quad \text{as } k \rightarrow +\infty . $$
(3.31)

This completes the proof. □

Remark 3.1

Because the upper bound of the relaxation parameter \(\{\lambda _{k}\}\) in Theorem 3.1 is governed by the averaged constant of the variable metric forward–backward operator, Theorem 3.1 provides a larger selection of the relaxation parameter than Theorem 4.1 of Combettes and Vũ [30].
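To make the range in Remark 3.1 concrete, the following short computation (an illustration using only the definition of \(\alpha _{k}\) in Theorem 3.1) rewrites the upper bound \(1/\alpha _{k}\):

$$ \frac{1}{\alpha _{k}} = \frac{4\beta -\gamma _{k} \Vert U_{k} \Vert }{2\beta } = 2 - \frac{\gamma _{k} \Vert U_{k} \Vert }{2\beta } \in (1,2), \quad \text{because } 0 < \gamma _{k} \Vert U_{k} \Vert < 2\beta . $$

For instance, with \(U_{k}=I\) and \(\gamma _{k}=\beta \) one obtains \(\alpha _{k}=2/3\), so any \(\lambda _{k}\in (0,3/2)\) is admissible. In particular, the interval \((0,\frac{1}{\alpha _{k}})\) always contains \((0,1]\).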

Remark 3.2

If we assume that \(\lambda _{k} \in (\underline{\lambda },1]\), then we reaffirm the conclusion that \(\sum_{k=0}^{+\infty }\|Bx_{k}-Bx^{*}\| ^{2}<+\infty \) as in Theorem 4.1 of the paper by Combettes and Vũ [30]. In fact, from inequality (3.30), we have

$$\begin{aligned} \underline{\lambda }\underline{\gamma }\epsilon \bigl\Vert Bx_{k}-Bx ^{*} \bigr\Vert ^{2} & \leq \lambda _{k}\gamma _{k}(2\beta -\gamma _{k} \mu ) \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2} \\ & \leq (1+\eta _{k}) \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k}^{-1}}^{2} - \bigl\Vert x_{k+1}-x^{*} \bigr\Vert _{U_{k+1}^{-1}}^{2} \\ &\quad {} + 2\lambda _{k}(1+\eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \Vert e_{k} \Vert . \end{aligned}$$

By summing the above inequality from zero to infinity, we have

$$\begin{aligned} & \underline{\lambda }\underline{\gamma }\epsilon \sum _{k=0} ^{+\infty } \bigl\Vert Bx_{k}-Bx^{*} \bigr\Vert ^{2} \\ &\quad \leq \bigl\Vert x_{0}-x^{*} \bigr\Vert _{U_{0}^{-1}}^{2} +\sum_{k=0}^{+ \infty } \eta _{k} \sup_{k\geq 0}{ \bigl\Vert x_{k}-x^{*} \bigr\Vert _{U_{k} ^{-1}}^{2}} \\ &\qquad {} + \sum_{k=0}^{+\infty }2\lambda _{k}(1+\eta _{k})M \bigl\Vert U_{k}^{-1} \bigr\Vert \Vert e_{k} \Vert , \end{aligned}$$

which implies that \(\sum_{k=0}^{+\infty } \Vert Bx_{k}-Bx^{*} \Vert ^{2} < +\infty \).

Remark 3.3

In view of Theorem 3.1(iii), the iterative sequence generated by (3.6) converges weakly to a point in Ω. To obtain the strong convergence of \(\{x_{k}\}\) to a point \(x^{*}\in \varOmega \), as in Theorem 4.1 of Combettes and Vũ [30], we need to assume that one of the following conditions holds:

  1. (i)

    \(\liminf_{k\rightarrow +\infty }d_{\varOmega }(x_{k}) = 0\);

  2. (ii)

    A or B is demiregular at every point in Ω;

  3. (iii)

    \(\operatorname{int}\varOmega \neq \emptyset \) and there exists \(\{v_{k}\}\in \ell _{+}^{1}(\mathbb{N})\) such that \((1+v_{k})U_{k} \succeq U_{k+1}\).

Because the proof is the same as that of Combettes and Vũ [30], we omit it here.

Next, we impose a slightly weaker condition on the iterative parameter \(\{\lambda _{k}\}\) than in Theorem 3.1 to ensure the weak convergence of the iterative sequence \(\{x_{k}\}\).

Theorem 3.2

Let H be a real Hilbert space. Let \(A:H \rightarrow 2^{H}\) be maximal monotone. Let \(B:H\rightarrow H\) be β-inverse strongly monotone for some \(\beta > 0\). Suppose that \(\varOmega :=\operatorname{zer}(A+B)\neq \emptyset \). Let \(\alpha > 0\), \(\{\eta _{k}\}\in \ell _{+}^{1}(\mathbb{N})\), and \(\{U_{k}\}\subset \mathcal{P}_{\alpha }(H)\) such that

$$ \mu = \sup_{k\in \mathbb{N}} \Vert U_{k} \Vert < +\infty \quad \textit{and}\quad ( 1+ \eta _{k}) U_{k+1} \succeq U_{k},\quad \forall k \in \mathbb{N}. $$
(3.32)

Let the iterative sequence \(\{x_{k}\}\) be defined by (3.6). Then we have:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty } \|x_{k}-x^{*}\|_{U_{k}^{-1}}\) exists.

Suppose that

  1. (a)

    \(\sum_{k=0}^{+\infty }\lambda _{k} (\frac{1}{\alpha _{k}}- \lambda _{k}) = +\infty \), where \(\alpha _{k}=\frac{2 \beta }{4 \beta - \gamma _{k} \|U_{k}\|}\);

  2. (b)

    \(0< \underline{\gamma}\leq \gamma _{k} \leq \frac{2\beta - \epsilon }{\mu }\), where \(\epsilon \in (0,2\beta -\mu \underline{\gamma})\);

  3. (c)

    \(\sum_{k=0}^{+\infty }| \gamma _{k+1} - \gamma _{k} | < + \infty \), \(\sum_{k=0}^{+\infty }| \gamma _{k+1}\|U_{k+1}\| - \gamma _{k}\|U_{k}\| | < +\infty \), and \(\sum_{k=0}^{+\infty }\|U_{k}^{-1}x-U _{k+1}^{-1}x\|<+\infty \) for any \(x\in H\).

Then

  1. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - J_{\gamma _{k} U _{k} A}(x_{k} - \gamma _{k} U_{k} Bx_{k}) \| =0 \);

  2. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω.

Further, suppose that \(\lambda _{k} \geq \underline{\lambda }>0\). Then

  1. (iv)

    \(Bx_{k} \rightarrow Bx^{*}\) as \(k\rightarrow +\infty \), where \(x^{*}\in \varOmega \).

Proof

(i) Let \(x^{*}\in \varOmega \). By the same argument as in the proof of Theorem 3.1(i), \(\lim_{k\rightarrow +\infty } \|x_{k} - x^{*}\|_{U_{k}^{-1}}\) exists. Hence \(\{\|x_{k} - x^{*}\|\}\) is bounded. Let \(M:= \sup_{k\geq 0}\|x_{k} - x^{*}\|\).

(ii) From (3.19), we obtain

$$\begin{aligned} & \sum_{k=0}^{+\infty }\lambda _{k} \biggl(\frac{1}{\alpha _{k}}-\lambda _{k} \biggr) \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2} \\ &\quad \leq \bigl\Vert x_{0}-x^{*} \bigr\Vert _{U_{0}^{-1}}^{2} + \frac{1}{ \alpha } M^{2} \sum _{k=0}^{+\infty }\eta _{k} + 2 \frac{1}{\alpha } \sum_{k=0}^{+\infty }(1+\eta _{k})M \lambda _{k} \Vert e_{k} \Vert . \end{aligned}$$
(3.33)

Because \(\sum_{k=0}^{+\infty }\eta _{k} < +\infty \) and \(\sum_{k=0} ^{+\infty } \lambda _{k} \|e_{k}\| < +\infty \), we have

$$ \sum_{k=0}^{+\infty }\lambda _{k} \biggl(\frac{1}{\alpha _{k}}-\lambda _{k} \biggr) \bigl\Vert x_{k}- J_{\gamma _{k}U_{k}A}(x_{k}-\gamma _{k}U_{k}Bx_{k}) \bigr\Vert _{U_{k}^{-1}}^{2} < +\infty . $$
(3.34)

Let \(T_{k} = J_{\gamma _{k} U_{k} A}(I - \gamma _{k} U_{k} B)\). By condition (a), (3.34) implies that

$$ \liminf_{k\rightarrow +\infty } \Vert x_{k}- T_{k} x_{k} \Vert _{U_{k}^{-1}} =0. $$

Consequently, \(\liminf_{k\rightarrow +\infty } \Vert x_{k}- T_{k} x _{k} \Vert =0\). Because \(T_{k}\) is \(\alpha _{k}\)-averaged, where \(\alpha _{k} = \frac{2\beta }{4\beta - \gamma _{k} \|U_{k}\|}\), there exist nonexpansive mappings \(R_{k}\) on \(H_{U_{k}^{-1}}\) such that \(T_{k} = (1-\alpha _{k})I + \alpha _{k} R_{k}\). Then, \(\liminf_{k\rightarrow +\infty }\|x_{k} - R_{k} x_{k}\|_{U_{k}^{-1}}=0\). Next, we prove that \(\lim_{k\rightarrow +\infty }\|x_{k} - R_{k} x _{k}\|=0\).

Using formulation (3.10) and the fact that \(R_{k+1}\) is nonexpansive on \(H_{{U_{k+1}^{-1}}}\), we have

$$\begin{aligned} & \Vert x_{k+1} - R_{k+1}x_{k+1} \Vert _{U_{k+1}^{-1}} \\ &\quad \overset{\text{(3.10)}}{=} \Vert \overline{x}_{k+1} - R_{k+1}x _{k+1} + \lambda _{k} e_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad \leq \Vert \overline{x}_{k+1} - R_{k+1}x_{k+1} \Vert _{U_{k+1}^{-1}} + \lambda _{k} \Vert e_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad = \bigl\Vert (1-\lambda _{k} \alpha _{k})x_{k} + \lambda _{k} \alpha _{k} R _{k} x_{k} - R_{k+1}x_{k+1} \bigr\Vert _{U_{k+1}^{-1}} + \lambda _{k} \Vert e_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad = \bigl\Vert (1-\lambda _{k} \alpha _{k}) (x_{k} - R_{k}x_{k}) + R_{k} x_{k} - R_{k+1}x_{k+1} \bigr\Vert _{U_{k+1}^{-1}} + \lambda _{k} \Vert e_{k} \Vert _{U_{k+1} ^{-1}} \\ &\quad \leq (1-\lambda _{k} \alpha _{k}) \Vert x_{k} - R_{k}x_{k} \Vert _{U_{k+1} ^{-1}} + \Vert R_{k} x_{k} - R_{k+1}x_{k} \Vert _{U_{k+1}^{-1}} \\ &\qquad {} + \Vert R_{k+1} x_{k} - R_{k+1}x_{k+1} \Vert _{U_{k+1}^{-1}} + \lambda _{k} \Vert e_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad \leq (1-\lambda _{k} \alpha _{k}) \Vert x_{k} - R_{k}x_{k} \Vert _{U_{k+1} ^{-1}} + \Vert R_{k} x_{k} - R_{k+1}x_{k} \Vert _{U_{k+1}^{-1}} + \Vert x_{k} - x _{k+1} \Vert _{U_{k+1}^{-1}} \\ &\qquad {} + \lambda _{k} \Vert e_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad \leq \Vert x_{k} - R_{k}x_{k} \Vert _{U_{k+1}^{-1}} + \Vert R_{k} x_{k} - R _{k+1}x_{k} \Vert _{U_{k+1}^{-1}} + 2 \sqrt{ \frac{1}{\alpha }}\lambda _{k} \Vert e_{k} \Vert . \end{aligned}$$
(3.35)

On the other hand, using the relation \(R_{k} = (1- \frac{1}{\alpha _{k}})I +\frac{1}{\alpha _{k}} T_{k}\) and Lemma 3.4, we have

$$\begin{aligned} & \Vert R_{k} x_{k} - R_{k+1}x_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad = \biggl\Vert \biggl(1-\frac{1}{\alpha _{k}} \biggr)x_{k} + \frac{1}{\alpha _{k}} T _{k} x_{k} - \biggl(1- \frac{1}{\alpha _{k+1}} \biggr)x_{k} - \frac{1}{\alpha _{k+1}}T _{k+1}x_{k} \biggr\Vert _{U_{k+1}^{-1}} \\ &\quad \leq \biggl\vert \frac{1}{\alpha _{k+1}} - \frac{1}{\alpha _{k}} \biggr\vert \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \biggl\Vert \frac{1}{\alpha _{k}} T_{k} x_{k} - \frac{1}{\alpha _{k+1}}T_{k+1}x_{k} \biggr\Vert _{U_{k+1}^{-1}} \\ &\quad \leq \biggl\vert \frac{1}{\alpha _{k+1}} - \frac{1}{\alpha _{k}} \biggr\vert \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \biggl\Vert \frac{1}{\alpha _{k}} T_{k} x_{k} - \frac{1}{\alpha _{k+1}}T_{k}x_{k} \biggr\Vert _{U_{k+1}^{-1}} \\ &\qquad {} + \biggl\Vert \frac{1}{\alpha _{k+1}} T_{k} x_{k} - \frac{1}{ \alpha _{k+1}}T_{k+1}x_{k} \biggr\Vert _{U_{k+1}^{-1}} \\ &\quad \leq \frac{1}{2\beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) + \frac{1}{\alpha _{k+1}} \Vert T_{k} x_{k} - T _{k+1}x_{k} \Vert _{U_{k+1}^{-1}} \\ &\quad \leq \frac{1}{2\beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) + 2 \sqrt{\frac{1}{\alpha }} \Vert T_{k} x _{k} - T_{k+1}x_{k} \Vert \\ &\quad \leq \frac{1}{2\beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\gamma _{k+1}} \bigl\Vert \bigl( \gamma _{k+1} U_{k}^{-1} - \gamma _{k} U_{k+1}^{-1} \bigr) (x_{k} - T _{k+1}x_{k}) \bigr\Vert \\ &\quad \leq \frac{1}{2\beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert 
x_{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \vert \gamma _{k+1}-\gamma _{k} \vert \bigl\Vert U_{k}^{-1}(x_{k} - T_{k+1}x_{k}) \bigr\Vert \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \frac{2\beta }{\alpha } \bigl\Vert \bigl(U_{k}^{-1}-U_{k+1}^{-1} \bigr) (x_{k} - T_{k+1}x_{k}) \bigr\Vert . \end{aligned}$$
(3.36)

The combination of (3.36) with (3.35) yields

$$\begin{aligned} & \Vert x_{k+1} - R_{k+1}x_{k+1} \Vert _{U_{k+1}^{-1}} \\ &\quad \leq \Vert x_{k} - R_{k}x_{k} \Vert _{U_{k+1}^{-1}} + \frac{1}{2\beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert x _{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \vert \gamma _{k+1}-\gamma _{k} \vert \bigl\Vert U_{k}^{-1}(x_{k} - T_{k+1}x_{k}) \bigr\Vert \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \frac{2\beta }{\alpha } \bigl\Vert \bigl(U_{k}^{-1}-U_{k+1}^{-1} \bigr) (x_{k} - T_{k+1}x_{k}) \bigr\Vert + 2 \sqrt{ \frac{1}{\alpha }}\lambda _{k} \Vert e _{k} \Vert \\ &\quad \leq (1+\eta _{k}) \Vert x_{k} - R_{k}x_{k} \Vert _{U_{k}^{-1}} + \frac{1}{2 \beta } \bigl\vert \gamma _{k} \Vert U_{k} \Vert - \gamma _{k+1} \Vert U_{k+1} \Vert \bigr\vert \bigl( \Vert x_{k} \Vert _{U_{k+1}^{-1}} + \Vert T_{k} x_{k} \Vert _{U_{k+1}^{-1}} \bigr) \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \vert \gamma _{k+1}-\gamma _{k} \vert \bigl\Vert U_{k}^{-1}(x_{k} - T_{k+1}x_{k}) \bigr\Vert \\ &\qquad {} + 2 \sqrt{\frac{1}{\alpha }} \frac{\mu }{\underline{\gamma }} \frac{2\beta }{\alpha } \bigl\Vert \bigl(U_{k}^{-1}-U_{k+1}^{-1} \bigr) (x_{k} - T_{k+1}x_{k}) \bigr\Vert + 2 \sqrt{ \frac{1}{\alpha }}\lambda _{k} \Vert e _{k} \Vert . \end{aligned}$$
(3.37)

With the help of Lemma 2.5, we can conclude from (3.37) that \(\lim_{k\rightarrow +\infty }\|x_{k} - R_{k} x _{k}\|_{U_{k}^{-1}}=0\). Hence, \(\lim_{k\rightarrow +\infty }\|x_{k} - R_{k} x_{k}\|=0\). As a consequence, \(\lim_{k\rightarrow +\infty }\|x _{k} - T_{k} x_{k}\|=0\).

Conclusions (iii) and (iv) follow by the same arguments as in the proof of Theorem 3.1. □

Remark 3.4

In Theorem 3.2, we prove the weak convergence of the iterative sequence generated by (3.6) with a weaker condition on \(\{\lambda _{k}\}\) than that in Theorem 3.1.

In Theorems 3.1 and 3.2, let \(U_{k}=I \), in which case we obtain the following corollary, which shows the convergence of the forward–backward splitting algorithm with variable step sizes.

Corollary 3.3

Let H be a real Hilbert space. Let \(A:H\rightarrow 2^{H} \) be maximal monotone. Let \(B:H\rightarrow H \) be β-inverse strongly monotone for some \(\beta > 0\). Suppose that \(\varOmega = \operatorname{zer} (A+B) \neq \emptyset \). Let \(\{\gamma _{k}\} \subset (0, 2\beta )\) and \(\{\lambda _{k}\} \subset (0,\frac{1}{\alpha _{k}})\), where \(\alpha _{k}=\frac{2 \beta }{4 \beta -\gamma _{k}}\). Let \(\{a_{k}\}\) and \(\{b_{k}\}\) be two sequences in H such that \(\sum_{k=0}^{+\infty } \lambda _{k} \|a_{k}\|<+\infty \) and \(\sum_{k=0}^{+\infty }\lambda _{k} \|b_{k}\|< +\infty \). Let \(x_{0}\in H\), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}(Bx_{k}+b_{k}), \\ x_{k+1} = x_{k}+\lambda _{k} (J_{\gamma _{k}A}(y_{k})+a_{k}-x_{k} ). \end{cases} $$
(3.38)

Then we have:

  1. (i)

    for any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty } \|x_{k}-x^{*}\|\) exists.

Suppose that

  1. (a1)

    \(0< \underline{\lambda}\leq \lambda _{k}\);

  2. (a2)

    \(\lambda _{k} \leq \frac{1}{\alpha _{k}}-\tau \), where \(\tau \in (0,\frac{1}{\alpha _{k}}-\underline{\lambda})\);

  3. (a3)

    \(0< \underline{\gamma} \leq \gamma _{k} \);

  4. (a4)

    \(\gamma _{k} \leq 2\beta -\epsilon \), where \(\epsilon \in (0,2 \beta - \underline{\gamma})\);

  5. (a5)

    \(\sum_{k=0}^{+\infty }\lambda _{k} (\frac{1}{\alpha _{k}}-\lambda _{k}) = +\infty \) and \(\sum_{k=0}^{+\infty }| \gamma _{k+1} - \gamma _{k} | < +\infty \).

If conditions (a1)–(a2) or (a3)–(a5) hold, then we have

  1. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - J_{\gamma _{k} A}(x _{k} - \gamma _{k} Bx_{k}) \| =0 \).

If conditions (a1)–(a3) or (a3)–(a5) hold, then we have

  1. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω.

If conditions (a1)–(a3) or (a1), (a3)–(a5) hold, then we have

  1. (iv)

    \(Bx_{k} \rightarrow Bx^{*}\) as \(k\rightarrow +\infty \), where \(x^{*}\in \varOmega \).

Remark 3.5

Under conditions (a1)–(a3), Corollary 3.3 reaffirms Proposition 4.4 of Combettes and Yamada [8]. In addition, we obtain the convergence of the iterative scheme (3.38) under conditions (a3)–(a5), which provide a weaker assumption on the relaxation parameters \(\lambda _{k}\) than conditions (a1) and (a2). Consequently, the obtained results improve and generalize Proposition 4.4 of Combettes and Yamada [8].
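As an illustration of scheme (3.38), the following minimal sketch runs the iteration on a toy problem that is not taken from the paper. The assumptions are: \(H=\mathbb{R}^{n}\), \(Bx = x-c\) (which is 1-inverse strongly monotone, so \(\beta =1\)), A is the normal cone of the box \([0,1]^{n}\) (so that \(J_{\gamma _{k}A}\) is the componentwise projection onto the box), and \(a_{k}=b_{k}=0\). The step sizes \(\gamma _{k}\), the constant \(\tau \), and the function name `forward_backward_box` are hypothetical choices made for the example. The unique zero of \(A+B\) is then the projection of c onto the box.

```python
def forward_backward_box(c, num_iters=200, tau=0.05):
    """Relaxed forward-backward iteration (3.38) for 0 in N_C(x) + (x - c),
    where C = [0, 1]^n. Illustrative sketch with a_k = b_k = 0."""
    beta = 1.0                      # B x = x - c is 1-inverse strongly monotone
    x = [0.0] * len(c)
    for k in range(num_iters):
        # Variable step size in [1, 1.5], a subset of (0, 2*beta),
        # with summable increments |gamma_{k+1} - gamma_k|.
        gamma = 1.0 + 0.5 / (k + 1) ** 2
        # Upper bound 1/alpha_k with alpha_k = 2*beta / (4*beta - gamma_k).
        inv_alpha = (4.0 * beta - gamma) / (2.0 * beta)
        lam = inv_alpha - tau       # over-relaxation: lam > 1 here
        # Forward step: y_k = x_k - gamma_k * B x_k.
        y = [xi - gamma * (xi - ci) for xi, ci in zip(x, c)]
        # Backward step: the resolvent of N_C is the projection onto the box.
        p = [min(1.0, max(0.0, yi)) for yi in y]
        # Relaxation: x_{k+1} = x_k + lam_k * (p - x_k).
        x = [xi + lam * (pi - xi) for xi, pi in zip(x, p)]
    return x


c = [1.5, -0.3, 0.4]
solution = forward_backward_box(c)
expected = [min(1.0, max(0.0, ci)) for ci in c]   # projection of c onto the box
print(solution, expected)
```

Note that the relaxation parameters used here exceed 1, which is allowed because \(1/\alpha _{k} > 1\); the iterates may temporarily leave the box, but they still converge to the solution.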

As an application of Theorems 3.1 and 3.2, we have the following convergence results for solving the convex minimization problem (1.5).

Corollary 3.4

Let H be a real Hilbert space. Let \(g:H\rightarrow (-\infty ,+ \infty ]\) be a proper, lower semi-continuous, convex function. Let \(f:H\rightarrow \mathbb{R}\) be convex and differentiable with a \(1/\beta \)-Lipschitz continuous gradient. Assume that Ω is the set of solutions of problem (1.5) and \(\varOmega \neq \emptyset \). Let \(x_{0}\in H\), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k} (\nabla f (x_{k})+b_{k} ), \\ x_{k+1} = x_{k}+\lambda _{k} ( \operatorname{prox}_{\gamma _{k}g}^{U_{k}^{-1}}(y_{k})+a_{k}-x _{k} ), \end{cases} $$
(3.39)

where \(\{U_{k}\}\), \(\{\gamma _{k}\}\), \(\{\lambda _{k}\}\), \(\{a_{k}\}\), and \(\{b_{k}\}\) satisfy the same conditions as in Theorem 3.1 or Theorem 3.2.

Then the following hold:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty }\|x_{k}-x ^{*}\|_{U_{k}^{-1}}\) exists;

  2. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - \operatorname{prox}_{\gamma _{k} g}^{U _{k}^{-1}}(x_{k} - \gamma _{k} U_{k} \nabla f(x_{k})) \| =0 \);

  3. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω;

  4. (iv)

    \(\nabla f(x_{k}) \rightarrow \nabla f(x^{*})\) as \(k\rightarrow + \infty \), where \(x^{*}\in \varOmega \).

Proof

Because f is convex and differentiable with a \(1/\beta \)-Lipschitz continuous gradient, the Baillon–Haddad theorem implies that \(\nabla f\) is β-inverse strongly monotone. From the definition of the proximity operator on the Hilbert space \(H_{U_{k}^{-1}}\), we know that

$$ \operatorname{prox}_{\gamma _{k}g}^{U_{k}^{-1}}(u)=J_{\gamma _{k}U_{k}\partial g}(u). $$
(3.40)

Setting \(A=\partial g\) and \(B=\nabla f\) in Theorems 3.1 and 3.2 yields the conclusions of Corollary 3.4. □
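
To make iteration (3.39) concrete, the following minimal Python sketch specializes it to \(f(x)=\frac{1}{2}\|Mx-c\|_{2}^{2}\) and \(g=\tau \|\cdot \|_{1}\) on \(R^{n}\), with a fixed diagonal metric \(U=\operatorname{diag}(u)\) and no error terms (\(a_{k}=b_{k}=0\)). The symbols M, c, τ, u, the function names, and the toy data are assumptions made for illustration only; whether a particular choice of u, γ, and λ meets the hypotheses of Theorem 3.1 or Theorem 3.2 has to be checked separately. For a diagonal metric, \(\operatorname{prox}_{\gamma g}^{U^{-1}}\) reduces to componentwise soft-thresholding with thresholds \(\gamma \tau u_{i}\).

```python
def matvec(M, x):
    """Return M x for a matrix stored as a list of rows."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]


def rmatvec(M, r):
    """Return M^T r."""
    return [sum(M[i][j] * r[i] for i in range(len(M))) for j in range(len(M[0]))]


def soft_threshold(v, thr):
    """Scalar soft-thresholding: the prox of thr * |.| at v."""
    if v > thr:
        return v - thr
    if v < -thr:
        return v + thr
    return 0.0


def vm_forward_backward(M, c, tau, u, gamma, lam, iters=500):
    """Iteration (3.39) with a_k = b_k = 0 and the fixed metric U = diag(u)."""
    x = [0.0] * len(M[0])
    for _ in range(iters):
        residual = [ri - ci for ri, ci in zip(matvec(M, x), c)]
        grad = rmatvec(M, residual)  # grad f(x) = M^T (M x - c)
        # Forward (gradient) step scaled by the metric U.
        y = [xi - gamma * ui * gi for xi, ui, gi in zip(x, u, grad)]
        # Backward (proximal) step in the metric U^{-1}.
        p = [soft_threshold(yi, gamma * tau * ui) for yi, ui in zip(y, u)]
        # Relaxation with parameter lam.
        x = [xi + lam * (pi - xi) for xi, pi in zip(x, p)]
    return x


# Hypothetical toy data: M = I, so the minimizer is soft-thresholding of c by tau.
x = vm_forward_backward([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5],
                        tau=1.0, u=[1.0, 0.5], gamma=1.0, lam=1.2)
print(x)  # approximately [2.0, 0.0]
```

In this toy case the limit does not depend on u, which reflects the fact that the fixed points of the iteration coincide with the solutions of the inclusion for any admissible metric.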

In the following, we employ the variable metric forward–backward splitting algorithm investigated above for solving several classes of nonlinear optimization problems. First, we consider the variational inequality problem (VIP):

$$ \text{find } x^{*} \in C \text{ such that } \bigl\langle Bx ^{*}, y-x^{*} \bigr\rangle \geq 0, \quad \forall y \in C, $$
(3.41)

where C is a nonempty closed convex subset of H, and \(B:H\rightarrow H\) is a nonlinear operator.

Recall the indicator function \(\delta _{C}\), which is defined as

$$ \delta _{C}(x)= \textstyle\begin{cases} 0, & x\in C, \\ +\infty , &\text{otherwise}. \end{cases} $$
(3.42)

The proximal operator of \(\delta _{C}\) is well known to be the metric projection on C, which is defined by

$$ P_{C}(x) = \operatorname{prox}_{\delta _{C}}(x) = \arg \min _{y\in C} \Vert x-y \Vert . $$

The normal cone operator of C is \(N_{C}\), which is defined by

$$ N_{C}(x)= \textstyle\begin{cases} \{ w| \langle w, y-x \rangle \leq 0, \forall y\in C \}, & x\in C, \\ \emptyset , &\text{otherwise}. \end{cases} $$
(3.43)

Then VIP (3.41) is equivalent to the following monotone inclusion problem:

$$ 0\in Bx + N_{C}(x). $$
(3.44)
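
The equivalence can be read off from definition (3.43). As a short sketch, for \(x^{*}\in C\) we have

$$ \begin{aligned} \bigl\langle Bx^{*}, y-x^{*} \bigr\rangle \geq 0, \quad \forall y\in C &\quad \Longleftrightarrow \quad \bigl\langle -Bx^{*}, y-x^{*} \bigr\rangle \leq 0, \quad \forall y\in C \\ &\quad \Longleftrightarrow \quad -Bx^{*}\in N_{C}\bigl(x^{*}\bigr) \\ &\quad \Longleftrightarrow \quad 0\in Bx^{*}+N_{C}\bigl(x^{*}\bigr), \end{aligned} $$

while for \(x^{*}\notin C\) the set \(N_{C}(x^{*})\) is empty, so neither problem is solved by such a point.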

Assuming that B is β-inverse strongly monotone, (3.44) is a special case of the monotone inclusion problem (1.1). Let \(A=N_{C}\), then we know that \(J_{\gamma U A} = P_{C}^{U^{-1}}\) for any \(\gamma >0\) and \(U\in \mathcal{P}_{\alpha }(H)\). The operator \(P_{C}^{U^{-1}}\) denotes the projector onto a nonempty closed convex subset C of H relative to the norm \(\|\cdot \|_{U^{-1}}\). More precisely,

$$ P_{C}^{U^{-1}}(x) = \arg \min_{y\in C} \Vert x-y \Vert _{U^{-1}}. $$
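
As an illustration of how the metric changes the projection, the following Python sketch treats the special case of a half-space \(C=\{y : \langle a,y\rangle \leq b\}\) in \(R^{n}\) with a diagonal metric \(U=\operatorname{diag}(u)\); the set, the metric, and the function name are assumptions chosen for the example. In this case the optimality conditions give \(P_{C}^{U^{-1}}(x)=x-\frac{\max \{0,\langle a,x\rangle -b\}}{\langle a,Ua\rangle }Ua\), so the correction moves along Ua rather than along a.

```python
def project_halfspace_metric(x, a, b, u):
    """Project x onto {y : <a, y> <= b} in the norm ||.||_{U^{-1}}, U = diag(u)."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0.0:
        return list(x)  # x already lies in the half-space
    Ua = [ui * ai for ui, ai in zip(u, a)]
    step = violation / sum(ai * uai for ai, uai in zip(a, Ua))
    return [xi - step * uai for xi, uai in zip(x, Ua)]


# Same point and half-space, two hypothetical metrics.
print(project_halfspace_metric([2.0, 2.0], [1.0, 1.0], 2.0, [1.0, 1.0]))  # [1.0, 1.0]
print(project_halfspace_metric([2.0, 2.0], [1.0, 1.0], 2.0, [3.0, 1.0]))  # [0.5, 1.5]
```

Both results lie on the boundary \(\langle a,y\rangle =b\), but only the first coincides with the usual metric projection \(P_{C}\).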

On the basis of Theorems 3.1 and 3.2, we obtain the following convergence theorem to solve VIP (3.41).

Theorem 3.5

Let H be a real Hilbert space. Let \(B:H\rightarrow H\) be a β-inverse strongly monotone operator. We denote by Ω the solution set of VIP (3.41) and assume that \(\varOmega \neq \emptyset \). Let \(x_{0}\in H\), set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k}(Bx_{k}+b_{k}), \\ x_{k+1} = x_{k}+ \lambda _{k} (P_{C}^{U_{k}^{-1}}(y_{k})+a_{k}-x_{k} ), \end{cases} $$
(3.45)

where \(\{U_{k}\}\), \(\{\gamma _{k}\}\), \(\{\lambda _{k}\}\), \(\{a_{k}\}\), and \(\{b_{k}\}\) satisfy the same conditions as in Theorem 3.1 or Theorem 3.2.

Then the following hold:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty }\|x_{k}-x ^{*}\|_{U_{k}^{-1}}\) exists;

  2. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - P_{C}^{U_{k}^{-1}}(x _{k} - \gamma _{k} U_{k} Bx_{k}) \| =0 \);

  3. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω;

  4. (iv)

    \(Bx_{k} \rightarrow Bx^{*}\) as \(k\rightarrow +\infty \), where \(x^{*}\in \varOmega \).
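
Iteration (3.45) can be sketched in Python for a small example. The sketch below assumes an affine operator \(Bx=Mx+q\) with M symmetric positive semidefinite (so that B is \(1/\|M\|\)-inverse strongly monotone), a box \(C=[lo,hi]^{n}\), a fixed diagonal metric \(U=\operatorname{diag}(u)\), and no error terms; all of these, together with the function names and data, are illustrative assumptions. For a box and a diagonal metric the weighted projection is separable, so \(P_{C}^{U^{-1}}\) reduces to componentwise clipping.

```python
def clip(v, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return min(max(v, lo), hi)


def solve_box_vip(M, q, lo, hi, u, gamma, lam, iters=200):
    """Iteration (3.45) with a_k = b_k = 0, B x = M x + q, C = [lo, hi]^n, U = diag(u)."""
    n = len(q)
    x = [0.0] * n
    for _ in range(iters):
        Bx = [sum(M[i][j] * x[j] for j in range(n)) + q[i] for i in range(n)]
        y = [x[i] - gamma * u[i] * Bx[i] for i in range(n)]
        p = [clip(y[i], lo, hi) for i in range(n)]  # P_C in the metric U^{-1}
        x = [x[i] + lam * (p[i] - x[i]) for i in range(n)]
    return x


# Hypothetical data: the eigenvalues of M are 1 and 3, so beta = 1/3 and gamma < 2/3.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-3.0, 0.0]
x = solve_box_vip(M, q, 0.0, 1.0, u=[1.0, 0.8], gamma=0.5, lam=1.1)
print(x)  # approximately [1.0, 0.0]
```

At the point (1, 0) we have \(Bx=(-1,1)\), and \(\langle Bx,y-x\rangle =(1-y_{1})+y_{2}\geq 0\) for every y in the box, so the computed limit indeed solves this instance of (3.41).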

Second, we consider the following constrained convex minimization problem:

$$ \begin{aligned} &\min f(x) \\ &\quad \mbox{s.t. } x\in C, \end{aligned} $$
(3.46)

where C is a nonempty closed convex subset of H, and \(f:H\rightarrow R\) is a proper closed convex differentiable function with a Lipschitz continuous gradient.

It follows from the definition of the indicator function that constrained convex minimization problem (3.46) is equivalent to the following unconstrained minimization problem:

$$ \min_{x\in H} f(x) + \delta _{C}(x). $$
(3.47)

It is obvious that problem (3.47) is a special case of (1.5). Therefore, by taking \(g(x) = \delta _{C}(x)\), we obtain the following convergence theorem for solving constrained convex minimization problem (3.46).

Theorem 3.6

Let H be a real Hilbert space. Let \(f:H\rightarrow R\) be a proper, closed convex function such that f is differentiable with an L-Lipschitz continuous gradient. We denote by Ω the solution set of the constrained convex minimization problem (3.46) and assume that \(\varOmega \neq \emptyset \). Let \(x_{0}\in H\), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k} (\nabla f (x_{k})+b_{k} ), \\ x_{k+1} = x_{k}+\lambda _{k} (P_{C}^{U_{k}^{-1}}(y_{k})+a_{k}-x_{k} ), \end{cases} $$
(3.48)

where \(\{U_{k}\}\), \(\{\gamma _{k}\}\), \(\{\lambda _{k}\}\), \(\{a_{k}\}\), and \(\{b_{k}\}\) satisfy the same conditions as in Theorem 3.1 or Theorem 3.2.

Then the following hold:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty }\|x_{k}-x ^{*}\|_{U_{k}^{-1}}\) exists;

  2. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - P_{C}^{U_{k}^{-1}}(x _{k} - \gamma _{k} U_{k} \nabla f(x_{k})) \| =0 \);

  3. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω;

  4. (iv)

    \(\nabla f(x_{k}) \rightarrow \nabla f(x^{*})\) as \(k\rightarrow + \infty \), where \(x^{*}\in \varOmega \).

Finally, we consider the split feasibility problem (SFP) as follows:

$$ \text{find } x\in C \text{ such that } Lx \in Q, $$
(3.49)

where C and Q are nonempty, closed convex subsets of Hilbert spaces H and G, respectively. \(L:H \rightarrow G\) is a bounded linear operator. SFP (3.49) was first introduced by Censor and Elfving [37] in a finite dimensional Hilbert space and has been extensively studied by many authors; see, for example, [38, 39] and the references therein.

SFP (3.49) is closely related to the constrained convex minimization problem (3.46). More precisely, the corresponding constrained convex minimization problem of SFP (3.49) is

$$ \begin{aligned} &\min_{x} \frac{1}{2} \bigl\Vert Lx-P_{Q}(Lx) \bigr\Vert ^{2} \\ &\quad \mbox{s.t. } x\in C. \end{aligned} $$
(3.50)

Let \(x^{*}\) be a solution of SFP (3.49); then \(x^{*}\) is a solution of (3.50). Conversely, if \(x^{*}\) is a solution of (3.50) and \(f(x^{*})=0\), where \(f(x):=\frac{1}{2}\| Lx-P_{Q}(Lx) \|^{2}\), then \(x^{*}\) is a solution of SFP (3.49). Under the assumption that the solution set of SFP (3.49) is nonempty, SFP (3.49) and constrained convex minimization problem (3.50) are equivalent.

The function \(f(x) = \frac{1}{2}\| Lx-P_{Q}(Lx) \|^{2}\) is convex and differentiable, and its gradient \(\nabla f(x) = L^{*}(Lx - P _{Q}(Lx))\) is \(\frac{1}{\|L\|^{2}}\)-inverse strongly monotone, because \(I-P_{Q}\) is firmly nonexpansive. Therefore, we obtain the following theorem for solving SFP (3.49).

Theorem 3.7

Let H and G be real Hilbert spaces. Let \(L:H\rightarrow G\) be a bounded linear operator. Let C and Q be nonempty closed and convex subsets of H and G, respectively. We denote by Ω the solution set of SFP (3.49) and assume that \(\varOmega \neq \emptyset \). Let \(x_{0}\in H\), and set

$$ \textstyle\begin{cases} y_{k} = x_{k}-\gamma _{k}U_{k} (L^{*} (Lx_{k} - P_{Q}(Lx_{k}) )+b_{k} ), \\ x_{k+1} = x_{k}+\lambda _{k} (P_{C}^{U_{k}^{-1}}(y_{k})+a_{k}-x _{k} ), \end{cases} $$
(3.51)

where \(\{U_{k}\}\), \(\{\gamma _{k}\}\), \(\{\lambda _{k}\}\), \(\{a_{k}\}\), and \(\{b_{k}\}\) satisfy the same conditions as in Theorem 3.1 or Theorem 3.2.

Then the following hold:

  1. (i)

    For any \(x^{*}\in \varOmega \), \(\lim_{k\rightarrow +\infty }\|x_{k}-x ^{*}\|_{U_{k}^{-1}}\) exists;

  2. (ii)

    \(\lim_{k\rightarrow +\infty } \| x_{k} - P_{C}^{U_{k}^{-1}}(x _{k} - \gamma _{k} U_{k} L^{*}(Lx_{k} - P_{Q}(Lx_{k}))) \| =0 \);

  3. (iii)

    \(\{x_{k}\}\) converges weakly to a point in Ω;

  4. (iv)

    \(L^{*}(Lx_{k} - P_{Q}(Lx_{k})) \rightarrow L^{*}(Lx^{*} - P_{Q}(Lx ^{*}))\) as \(k\rightarrow +\infty \), where \(x^{*}\in \varOmega \).
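
As a hedged illustration of iteration (3.51), the following Python sketch assumes finite-dimensional spaces, boxes for both C and Q, a fixed diagonal metric \(U=\operatorname{diag}(u)\), and no error terms. The matrix name Lmat, the helper names, and the data are hypothetical. With a diagonal metric and a box C, the weighted projection \(P_{C}^{U^{-1}}\) is again componentwise clipping.

```python
def clip_vec(v, lo, hi):
    """Componentwise projection onto the box [lo, hi]^n."""
    return [min(max(vi, lo), hi) for vi in v]


def solve_sfp(Lmat, c_box, q_box, u, gamma, lam, iters=100):
    """Iteration (3.51) with a_k = b_k = 0 for boxes C and Q and U = diag(u)."""
    m, n = len(Lmat), len(Lmat[0])
    x = [0.0] * n
    for _ in range(iters):
        Lx = [sum(Lmat[i][j] * x[j] for j in range(n)) for i in range(m)]
        PQ = clip_vec(Lx, *q_box)
        resid = [Lx[i] - PQ[i] for i in range(m)]
        # Gradient L^*(Lx - P_Q(Lx)) of the SFP objective.
        grad = [sum(Lmat[i][j] * resid[i] for i in range(m)) for j in range(n)]
        y = [x[j] - gamma * u[j] * grad[j] for j in range(n)]
        p = clip_vec(y, *c_box)
        x = [x[j] + lam * (p[j] - x[j]) for j in range(n)]
    return x


# Hypothetical instance: C = [0, 1]^2, Q = [1, 2], L x = x_1 + x_2, so ||L||^2 = 2.
Lmat = [[1.0, 1.0]]
x = solve_sfp(Lmat, (0.0, 1.0), (1.0, 2.0), u=[1.0, 0.5], gamma=0.5, lam=1.2)
print(x, x[0] + x[1])  # a point of C whose image is (numerically) in Q
```

Here \(\gamma =0.5<2/\|L\|^{2}=1\), and the residual \(Lx_{k}-P_{Q}(Lx_{k})\) shrinks by a constant factor at each step in this instance.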

Remark 3.6

To the best of our knowledge, the proposed iterative algorithms (3.45), (3.48), and (3.51) are the most general ones for solving variational inequality problem (3.41), constrained convex minimization problem (3.46), and split feasibility problem (3.49), respectively. Most of the existing algorithms [7, 35, 39–41] are special cases of ours.

Numerical experiments

In this section, we apply the proposed iterative algorithm (3.39) to solve the well-known LASSO problem [42]. All the experiments were performed on a standard Lenovo laptop with an Intel(R) Core(TM) i7-4712MQ 2.3 GHz CPU and 4 GB RAM, using MATLAB 2014a.

Let us recall the LASSO problem:

$$ \begin{aligned} &\min_{x\in R^{n}} \frac{1}{2} \Vert Ax-b \Vert _{2}^{2} \\ &\quad \mbox{s.t. } \Vert x \Vert _{1} \leq t, \end{aligned} $$
(4.1)

where \(A\in R^{m\times n}\), \(b\in R^{m}\), and \(t>0\). Defining \(C:=\{ x | \|x\|_{1} \leq t \}\) and using the indicator function, we see that (4.1) is equivalent to the following unconstrained optimization problem:

$$ \min_{x} \frac{1}{2} \Vert Ax-b \Vert _{2}^{2} + \delta _{C}(x), $$
(4.2)

which is a special case of the general optimization problem (1.5). Let \(f(x) = \frac{1}{2}\|Ax-b\|_{2}^{2}\) and \(g(x) = \delta _{C}(x)\); then we can apply iterative algorithm (3.39) to solve (4.2). Notice that the gradient of f is \(\nabla f(x) = A^{T}(Ax-b)\), which is Lipschitz continuous with constant \(L:=\|A\|^{2}\). Moreover, the proximity operator of the indicator function \(\delta _{C}\) is the orthogonal projection onto the closed convex set C. Although this projection has no closed-form expression, it can be computed in polynomial time.
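
One standard way to compute this projection is the sort-based method, which costs \(O(n\log n)\) operations. The Python sketch below shows that method for the Euclidean case \(U_{k}=I\); it is an illustration, not necessarily the routine used in the experiments, and the function name is hypothetical. With a non-identity metric the corresponding weighted projection would need a different routine.

```python
def project_l1_ball(v, t):
    """Euclidean projection of v onto {x : ||x||_1 <= t}, for t > 0."""
    if sum(abs(vi) for vi in v) <= t:
        return list(v)  # already inside the ball
    mags = sorted((abs(vi) for vi in v), reverse=True)
    running, theta = 0.0, 0.0
    for j, mag in enumerate(mags, start=1):
        running += mag
        candidate = (running - t) / j
        if mag - candidate > 0.0:
            theta = candidate  # the j largest entries stay nonzero
        else:
            break
    # Soft-threshold every entry by theta, keeping its sign.
    return [max(abs(vi) - theta, 0.0) * (1.0 if vi >= 0 else -1.0) for vi in v]


print(project_l1_ball([3.0, 1.0], 2.0))   # [2.0, 0.0]
print(project_l1_ball([1.0, -1.0], 1.0))  # [0.5, -0.5]
```

The projection is a soft-thresholding of v with a data-dependent threshold θ, chosen so that the result has \(\ell _{1}\)-norm exactly t whenever v lies outside the ball.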

In the tests, the true signal \(x\in R^{n}\) has k non-zero elements, which are drawn from the uniform distribution on the interval \([-2,2]\). The entries of the system matrix \(A\in R^{m\times n}\) are drawn from the standard Gaussian distribution. The observed signal b is given by \(b=Ax\). In the experiment, we set \(m=240\), \(n=1024\), and \(k=40\). The stopping criterion is defined as

$$ \frac{ \Vert x_{k+1}-x_{k} \Vert _{2}}{ \Vert x_{k} \Vert _{2}} \leq \varepsilon , $$
(4.3)

where \(\varepsilon >0\) is a small constant. We test the performance of the proposed iterative algorithm with different choices of the step size \(\gamma _{k}\) and the relaxation parameter \(\lambda _{k}\). For simplicity, we keep both constant during the iteration process. According to Corollary 3.4, we require \(\gamma _{k} \in (0,\frac{2}{L})\) and \(\lambda _{k} \in (0,\frac{4-\gamma _{k} L}{2})\). The obtained numerical results are listed in Table 1, in which we report the number of iterations (“Iter”), the objective function value (“Obj”), and the error between the recovered signal and the true signal (“Err”). We can see from Table 1 that, when the step size \(\gamma _{k}\) is fixed, a larger relaxation parameter \(\lambda _{k}\) leads to faster convergence. At the same time, the larger the step size, the faster the algorithm converges.
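
The parameter range and the stopping rule (4.3) are simple enough to check numerically. The Python helpers below are a small sketch with hypothetical names and a hypothetical value of L; the stopping test is written in multiplied-out form so that a zero iterate does not cause a division by zero.

```python
import math


def relaxation_upper_bound(gamma, lipschitz):
    """Right endpoint (4 - gamma * L) / 2 of the admissible range for lambda_k."""
    if not 0.0 < gamma * lipschitz < 2.0:
        raise ValueError("step size must satisfy 0 < gamma < 2 / L")
    return (4.0 - gamma * lipschitz) / 2.0


def should_stop(x_new, x_old, eps):
    """Relative-change test (4.3): ||x_new - x_old|| <= eps * ||x_old||."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_new, x_old)))
    return diff <= eps * math.sqrt(sum(b * b for b in x_old))


L = 4.0  # hypothetical Lipschitz constant, for illustration only
for c in (0.5, 1.0, 1.9):
    # Step sizes gamma = c / L for the three multiples shown in Fig. 1.
    print(c, relaxation_upper_bound(c / L, L))
```

For \(\gamma _{k}=\frac{1}{2L}\), \(\frac{1}{L}\), and \(\frac{1.9}{L}\) the endpoint evaluates to 1.75, 1.5, and 1.05, respectively, so a relaxation value of 1.05 combined with \(\gamma _{k}=\frac{1.9}{L}\) sits at the boundary of the stated open interval rather than strictly inside it.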

Table 1 Numerical results for different choices of \(\gamma _{k}\) and \(\lambda _{k}\) for solving the LASSO problem (4.1)

To visualize the effect of the iteration parameters on the objective function value more clearly, Fig. 1 plots the objective function value against the number of iterations. Further, we plot the true signal and the recovered signal in Fig. 2 for the parameters \(\gamma _{k} = \frac{1.9}{L}\), \(\lambda _{k} = 1.05\) and the stopping tolerance \(\varepsilon = 10^{-8}\). We can see from Fig. 2 that the true signal is successfully reconstructed.

Figure 1

The objective function value against the number of iterations for the LASSO problem. (a) \(\gamma _{k} = \frac{1}{2L}\), (b) \(\gamma _{k} = \frac{1}{L}\), and (c) \(\gamma _{k} = \frac{1.9}{L}\)

Figure 2

The recovered sparse signal versus the true k-sparse signal

Conclusions

In this paper, we presented a new convergence analysis of the variable metric forward–backward splitting algorithm (1.7) with extended relaxation parameters. Based on the averaged operator \(J_{\gamma _{k} U_{k} A}(I-\gamma _{k} U _{k} B)\) and the firmly nonexpansive operator \(J_{\gamma _{k} U_{k} A}\) on the Hilbert spaces \(H_{U_{k}^{-1}}\), we proved the weak convergence of this algorithm. Compared to existing work, we imposed a slightly weaker condition on the relaxation parameters to ensure the convergence of the forward–backward splitting algorithm when using the variable metric and variable step sizes. Our results complemented and extended the corresponding results of Combettes and Yamada [8]. Furthermore, we obtained several general iterative algorithms for solving the variational inequality problem, the constrained convex minimization problem, and the split feasibility problem. These results generalized and improved the known results in the literature. Numerical experimental results on the LASSO problem showed that the step size \(\gamma _{k}\) and the relaxation parameter \(\lambda _{k}\) had a strong impact on the convergence speed of the proposed iterative algorithm. The larger the step size, the faster the algorithm converged. Over-relaxation (\(\lambda _{k} >1\)) performed better than under-relaxation (\(\lambda _{k} \leq 1\)).

References

  1. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)

  2. Chen, G.H.G., Rockafellar, R.T.: Convergence rates in forward–backward splitting. SIAM J. Optim. 7(2), 421–444 (1997)

  3. Tseng, P.: A modified forward–backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2), 431–446 (2000)

  4. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)

  5. Lopez, G., Martin-Marquez, V., Wang, F., Xu, H.K.: Forward–backward splitting method for accretive operators in Banach spaces. Abstr. Appl. Anal. 2012, Article ID 109236 (2012)

  6. Zong, C.X., Tang, Y.C., Cho, Y.J.: Convergence analysis of an inexact three-operator splitting algorithm. Symmetry 10(11), 563 (2018)

  7. Jiao, H.W., Wang, F.H.: On an iterative method for finding a zero to the sum of two maximal monotone operators. J. Appl. Math. 2014, Article ID 414031 (2014)

  8. Combettes, P.L., Yamada, I.: Compositions and convex combinations of averaged nonexpansive operators. J. Math. Anal. Appl. 425, 55–70 (2015)

  9. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, Berlin (2017)

  10. Qin, X., Yao, J.C.: Projection splitting algorithms for nonself operators. J. Nonlinear Convex Anal. 18, 925–935 (2017)

  11. Shang, M.: A descent-like method for fixed points and split conclusion problems. J. Appl. Numer. Optim. 1, 91–101 (2019)

  12. Qin, X., Yao, J.C.: Weak convergence of a Mann-like algorithm for nonexpansive and accretive operators. J. Inequal. Appl. 2016, 232 (2016)

  13. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)

  14. Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)

  15. Cai, J.F., Candes, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20, 1956–1982 (2010)

  16. Zhao, X., Ng, K.F., Li, C., Yao, J.C.: Linear regularity and linear convergence of projection-based methods for solving convex feasibility problems. Appl. Math. Optim. 78, 613–641 (2018)

  17. Zhang, L., Zhao, H., Lv, Y.: A modified inertial projection and contraction algorithms for quasi-variational inequalities. Appl. Set-Valued Anal. Optim. 1, 63–76 (2019)

  18. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward–backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)

  19. Combettes, P.L., Pesquet, J.C.: Primal–dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set-Valued Var. Anal. 20(2), 307–330 (2012)

  20. Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38, 667–681 (2013)

  21. Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal–dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010)

  22. Chambolle, A., Pock, T.: A first-order primal–dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)

  23. Burke, J.V., Qian, M.J.: A variable metric proximal point algorithm for monotone operators. SIAM J. Control Optim. 37(2), 353–375 (1998)

  24. Parente, L.A., Lotito, P.A., Solodov, M.V.: A class of inexact variable metric proximal point algorithms. SIAM J. Optim. 19(1), 240–260 (2008)

  25. He, B.S., Yuan, X.M.: Convergence analysis of primal–dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imaging Sci. 5(1), 119–149 (2012)

  26. Vũ, B.C.: A variable metric extension of the forward–backward–forward algorithm for monotone operators. Numer. Funct. Anal. Optim. 34(9), 1050–1065 (2013)

  27. Liang, J.: Convergence rates of first-order operator splitting methods. PhD thesis (2016)

  28. Bonettini, S., Porta, F., Ruggiero, V.: A variable metric forward–backward method with extrapolation. SIAM J. Sci. Comput. 38(4), 2558–2584 (2016)

  29. Lotito, P.A., Parente, L.A., Solodov, M.V.: A class of variable metric decomposition methods for monotone variational inclusions. J. Convex Anal. 16, 857–880 (2009)

  30. Combettes, P.L., Vũ, B.C.: Variable metric forward–backward splitting with applications to monotone inclusions in duality. Optimization 63(9), 1289–1318 (2014)

  31. Simoes, M.: On some aspects of inverse problems in image processing. PhD thesis (2017)

  32. Moreau, J.J.: Fonctions convexes duales et points proximaux dans un espace hilbertien. C. R. Acad. Sci., Paris Ser. A Math. 255, 2897–2899 (1962)

  33. Ogura, N., Yamada, I.: Non-strictly convex minimization over the fixed point set of the asymptotically shrinking nonexpansive mapping. Numer. Funct. Anal. Optim. 23, 113–137 (2002)

  34. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, London (2011)

  35. Byrne, C.: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20(1), 103–120 (2004)

  36. Combettes, P.L., Vũ, B.C.: Variable metric quasi-Fejér monotonicity. Nonlinear Anal. 78, 17–31 (2013)

  37. Censor, Y., Elfving, T.: A multiprojection algorithm using Bregman projections in a product space. Numer. Algorithms 8, 221–239 (1994)

  38. Xu, H.K.: A variable Krasnoselskii–Mann algorithm and the multiple-set split feasibility problem. Inverse Probl. 22, 2021–2034 (2006)

  39. Xu, H.K.: Iterative methods for the split feasibility problem in infinite dimensional Hilbert spaces. Inverse Probl. 26, 105018 (2010)

  40. Yang, Q., Zhao, J.: Generalized KM theorems and their applications. Inverse Probl. 22, 833–844 (2006)

  41. Xu, H.K.: Averaged mappings and the gradient-projection algorithm. J. Optim. Theory Appl. 150, 360–378 (2011)

  42. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 58, 267–288 (1996)

Acknowledgements

We would like to thank the anonymous referees and the associate editor for their valuable suggestions and comments, which improved this paper greatly.

Availability of data and materials

The datasets used during the current study are available from the corresponding author on reasonable request.

Authors’ information

F.Y. Cui, master student; Y.C. Tang, Ph.D., associate professor; C.X. Zhu, Ph.D., professor.

Funding

This work was supported by the National Natural Science Foundation of China (11661056, 11771198, 11401293), the Postdoctoral Research Foundation of China (2015M571989), and the Postdoctoral Science Foundation of Jiangxi Province (2015KY51).

Author information

Contributions

YCT carried out the idea of this paper. FYC wrote the original draft manuscript. YCT revised the manuscript. CXZ checked the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yuchao Tang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Cui, F., Tang, Y. & Zhu, C. Convergence analysis of a variable metric forward–backward splitting algorithm with applications. J Inequal Appl 2019, 141 (2019). https://doi.org/10.1186/s13660-019-2097-4

  • DOI: https://doi.org/10.1186/s13660-019-2097-4

MSC

  • 90C25
  • 47H05
  • 65K05

Keywords

  • Forward–backward splitting algorithm
  • Monotone inclusion
  • Variable metric
  • Split feasibility problem