New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications

Promkam, Ratthaprom; Sunthrayuth, Pongsakorn; Kesornprom, Suparat; Tanprayoon, Ekapak

doi:10.1186/s13660-023-03049-2

Research
Open access
Published: 25 October 2023

New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications

Ratthaprom Promkam¹,
Pongsakorn Sunthrayuth¹,
Suparat Kesornprom² &
…
Ekapak Tanprayoon¹

Journal of Inequalities and Applications volume 2023, Article number: 136 (2023) Cite this article

566 Accesses
Metrics details

Abstract

In this paper, we propose two inertial algorithms with a new self-adaptive step size for approximating a solution of the split common null-point problem in the framework of Banach spaces. The step sizes are adaptively updated over each iteration by a simple process without the prior knowledge of the operator norm of the bounded linear operator. Under suitable conditions, we prove the weak-convergence results for the proposed algorithms in p-uniformly convex and uniformly smooth Banach spaces. Finally, we give several numerical results in both finite- and infinite-dimensional spaces to illustrate the efficiency and advantage of the proposed methods over some existing methods. Also, data classifications of heart diseases and diabetes mellitus are presented as the applications of our methods.

1 Introduction

In this paper, we consider the following split common null-point problem [13] (see also [29]): find $z\in H_{1}$ such that

$$\begin{aligned} z\in A^{-1}0\cap T^{-1} \bigl(B^{-1}0\bigr), \end{aligned}$$

(1.1)

where $A:H_{1}\rightarrow 2^{H_{1}}$ and $B:H_{2}\rightarrow 2^{H_{2}}$ are set-valued maximal monotone operators, $T:H_{1}\rightarrow H_{2}$ is a bounded linear operator, and $H_{1}$ and $H_{2}$ are real Hilbert spaces. We denote the solution set of the split common null-point problem (1.1) by Ω. The split common null-point problem can be applied to solving many real-life problems, for instance, in practices as a model in intensity-modulated radiation-therapy treatment planning [14, 15] and in sensor networks in computerized tomography and data compression [19]. In addition, the split common null-point problem also generalizes several split-type problems that is the core the modeling of many inverse problems such as the split feasibility problem, the split equilibrium problem, and the split minimization problem as special cases.

Byrne et al. [13] introduced the following iterative scheme for solving the split common null-point problem: for given $x_{0}\in H_{1}$ and the sequence $\{x_{n}\}$ generated iteratively by

$$\begin{aligned} x_{n+1}=R_{\tau}\bigl(x_{n}- \lambda T^{*}(I-Q_{\mu})Tx_{n}\bigr), \end{aligned}$$

(1.2)

where $R_{\tau}=(I+\tau A)^{-1}$ and $Q_{\mu}=(I+\mu B)^{-1}$ are the resolvent operators of A for $\tau >0$ and of B for $\mu >0$, respectively. They proved the weak-convergence theorem for solving the split common null-point problem provided the step size $\lambda \in (0,\frac{2}{\|T\|^{2}} )$.

Alofi et al. [5] introduced the following iterative scheme based on a modified Halpern’s iteration for solving the split common null-point problem in the case that $H_{1}$ is a Hilbert space and F is a uniformly convex and smooth Banach space: for given $x_{1}\in H_{1}$ and the sequence $\{x_{n}\}$ generated iteratively by

$$\begin{aligned} x_{n+1}=\beta _{n}x_{n}+(1- \beta _{n}) \bigl(\alpha _{n}u_{n}+(1- \alpha _{n})R_{ \tau _{n}}\bigl(x_{n}-\tau _{n} T^{*}J(I-Q_{\mu _{n}})Tx_{n} \bigr)\bigr), \end{aligned}$$

(1.3)

where $R_{\tau}$ is the resolvent of A for $\tau >0$ and $Q_{\mu}$ is the metric resolvent of B for $\mu >0$, $\{\tau _{n}\}, \{\mu _{n}\}\subset (0,\infty )$, $\{\alpha _{n}\}\subset (0,1)$, and $\{\beta _{n}\}\subset (0,1)$ that satisfies some appropriate assumptions on the parameters, J is the duality mapping on F, T is the bounded linear operator from $H_{1}$ to F, and $\{u_{n}\}$ is the sequence in $H_{1}$ such that $u_{n}\rightarrow u$. They proved that the sequence $\{x_{n}\}$ generated by (1.3) converges strongly to a point of Ω provided $\tau _{n}$ satisfies the following inequality:

$$\begin{aligned} 0< a\leq \tau _{n} \Vert T \Vert ^{2}\leq b< 2 \end{aligned}$$

(1.4)

for some $a,b>0$.

Later, Suantai et al. [39] generalized the result of Alofi et al. [5] in the case that E is a p-uniformly convex and uniformly smooth Banach space, and F is a uniformly convex and smooth Banach space. To be more precise, they introduced the following scheme: for given $x_{1}\in E$ and the sequence $\{x_{n}\}$ generated iteratively by

$$ \textstyle\begin{cases} z_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{ \mu _{n}})Tx_{n}), \\ y_{n}=J_{q}^{E^{*}}(\alpha _{n}J_{p}^{E}(u_{n})+(1-\alpha _{n})J_{p}^{E}(R_{ \tau _{n}}z_{n})), \\ x_{n+1}=J_{q}^{E^{*}}(\beta _{n}J_{p}^{E}(x_{n})+(1-\beta _{n})J_{p}^{E}(y_{n})), \end{cases} $$

(1.5)

where $J_{p}^{E}$ and $J_{q}^{E^{*}}$ are the generalized duality mapping of E into $E^{*}$ and the duality mapping of $E^{*}$ into E, respectively, where $1< q\leq 2\leq p<\infty $ with $\frac{1}{p}+\frac{1}{q}=1$, and T is the bounded linear operator from E to F. They also proved the strong convergence of the sequence $\{x_{n}\}$ generated by (1.3) to a point of Ω provided $\tau _{n}$ satisfies the following inequality:

$$\begin{aligned} 0< a\leq \tau _{n}\leq b< \biggl( \frac{q}{\kappa _{q} \Vert T \Vert ^{q}} \biggr)^{ \frac{1}{q-1}} \end{aligned}$$

(1.6)

for some $a,b>0$.

However, several iterative methods involve a step size that requires to compute the norm of the bounded linear operator $\|T\|$ prior to choosing $\tau _{n}$. In general, it may not be easy to compute $\|T\|$. In particular, it makes the algorithms not easily implemented when the computation of $\|T\|$ is complicated. To overcome this drawback, a new step-size strategy without the prior knowledge of the operator norm of the bounded linear operator has been proposed by López et al. [28]. This method is known as a self-adaptive method that was first used to solve the split feasibility problem when the step-size criterion is independent of the operator norm of the bounded linear operator.

In optimization theory, the inertial technique has been widely used to accelerate the rate of convergence of the algorithms. This technique was motivated by the implicit time discretization of second-order dynamical systems (or a heavy ball with friction). Based on the inertial technique, Alvarez and Attouch [7] proposed the following so-called inertial proximal point algorithm for finding a zero point of a set-valued maximal monotone operator A: for given $x_{0},x_{1}\in H$ and the sequence $\{x_{n}\}$ generated iteratively by

$$ x_{n+1}=R_{\tau _{n}}\bigl(x_{n}+ \theta _{n}(x_{n}-x_{n-1})\bigr), $$

(1.7)

where $R_{\tau _{n}}$ is the resolvent of A and $x_{n}+\theta _{n}(x_{n}-x_{n-1})$ is called the inertial term. They also proved that the sequence $\{x_{n}\}$ generated by (1.7) converges weakly to a zero point of A provided $\{\tau _{n}\}$ is increasing and $\theta _{n}\in [0,1)$ is chosen so that $\sum_{n=1}^{\infty}\theta _{n}\|x_{n}-x_{n-1}\|^{2}<\infty $. In recent years, the inertial method was further studied intensively and it also has been used to solve some other optimization problems (see, for example, [16, 21, 32, 37, 38, 41, 45]).

In 2019, Tang [43] proposed the following inertial algorithm for solving the split common null-point problem in the case that $H_{1}$ is a Hilbert space, and F is a 2-uniformly convex and smooth Banach space: for given $x_{1}\in H_{1}$ and $\alpha \in [0,1)$, choose $\theta _{n}$ such that $0<\theta _{n}<\bar{\theta}_{n}$, where

$$\begin{aligned} \begin{aligned} \bar{\theta}_{n}= \textstyle\begin{cases} \min \{\alpha , \epsilon _{n}(\max \{ \Vert x_{n}-x_{n-1} \Vert ^{2}, \Vert x_{n}-x_{n-1} \Vert \})^{-1}\}, & x_{n}\neq x_{n-1}, \\ \alpha ,\quad &\text{otherwise,} \end{cases}\displaystyle \end{aligned} \end{aligned}$$

(1.8)

where $\{\epsilon _{n}\}\subset (0,\infty )$ such that $\sum_{n=1}^{\infty}\epsilon _{n}<\infty $. Compute the sequence $\{x_{n}\}$ generated iteratively by

$$ \textstyle\begin{cases} u_{n}=x_{n}+\theta _{n}(x_{n}-x_{n-1}), \\ x_{n+1}=R_{r}(I-\tau _{n}T^{*}J(I-Q_{\mu})T)u_{n} \end{cases} $$

(1.9)

with the step size

$$\begin{aligned} \tau _{n}=\rho _{n} \frac{f(u_{n})}{ \Vert F(w_{n}) \Vert ^{2}+ \Vert H(u_{n}) \Vert ^{2}}, \end{aligned}$$

(1.10)

where $\{\rho _{n}\}\subset (0,4)$, $f(u_{n})=\frac{1}{2}\|J(I-Q_{\mu})Tu_{n}\|^{2}$, $F(u_{n})=T^{*}J(I-Q_{\mu})Tw_{n}$, and $H(u_{n})=(I-R_{r})u_{n}$. The weak convergence of the sequence $\{x_{n}\}$ is established without the prior knowledge of the operator norm of the bounded linear operator.

Recently, several inertial algorithms for solving the split common null-point problem in Hilbert spaces have been studied by many authors (see, for example, [8, 17, 24, 30, 51]). However, such methods have been studied in Banach spaces by a few authors (see, for example, [43, 44]).

Inspired and motivated by the works mentioned above, in this paper, we introduce two new inertial self-adaptive algorithms that are based on the classical inertial method and relaxed inertial method for finding a solution of the split common null-point problem in Banach spaces. The weak-convergence theorems are proved without the prior knowledge of the operator norm of the bounded linear operator. We provide numerical implementations to show that our algorithms are efficient and competitive with some related algorithms. Our results are new and complement some previous results in the literature.

The contributions of this paper can be summarized as follows:

(1)
The weak-convergence result of iterative scheme (1.9) of Tang [43] is proved in a Hilbert space and a 2-uniformly convex smooth Banach space where this result can only be implemented in $\ell _{p}$ for $p\in (1,2]$ exclude the case of $p>2$. This is limited in practical applications of such a method. In this paper, our results generalize the weak-convergence result of Tang [43] from between two of those spaces to p-uniformly convex and uniformly smooth Banach spaces, and as a result our results can be implemented in $\ell _{p}$ for $p>1$.
(2)
Even though the step size of the iterative scheme (1.9) of Tang [43] is computed without the prior knowledge of the operator norms it requires calculation of $\|T^{*}J(I-Q_{\mu})u_{n}\|^{2}$ and $\|(I-R_{\tau})u_{n}\|^{2}$ in order to choose the step size $\tau _{n}$. This could be computationally expensive during implementations, especially in the case where the resolvent of A and the metric resolvent of B are difficult to compute. In this paper, our step size $\tau _{n}$ defined by (3.3), is adaptively updated by a cheap computation without the prior knowledge of the operator norms and only requires us to compute one metric resolvent of B.
(3)
For the iterative scheme (1.5) of Suantai et al. [39], the choice of the sequence of step size depends on the bounded linear operator $\|T\|$, which is a difficult task during the implementation of the algorithm. In this paper, the choice of the sequence of our step size $\tau _{n}$ defined by (3.3), is independent of the operator norm of the bounded linear operator. As a result, we do not require to calculate the norm $\|T\|$ in order to choose the step size $\tau _{n}$, which is easier to implement than such a method.
(4)
We use the inertial and relaxed inertial techniques to improve the rate of convergence of our algorithms that makes the algorithms converge faster and computationally more efficient for solving the split common null-point problem in Banach spaces. Note that these inertial techniques in this paper are studied outside Hilbert spaces for solving such a problem.
(5)
We present numerical results of our algorithms in Banach spaces to illustrate the efficiency and advantage over iterative scheme (1.5) of Suantai et al. [39] that gives the strong convergence and we also present several numerical results of our algorithms in finite-dimensional spaces. Moreover, we apply our results to data classifications for two datasets of heart diseases and diabetes mellitus.

Our paper is organized as the following four parts. In Sect. 2, we give some of the basic facts and notation that will be used in the paper. In Sect. 3, we propose two new inertial self-adaptive algorithms and prove our convergence results, and finally, in Sect. 4, we present several numerical results to verify the advantages and efficiency of the proposed algorithms.

2 Preliminaries

In this section, we give some definitions and preliminary results that will be used in proving the main results.Throughout this paper, we denote the set of real numbers and the set of positive integers by $\mathbb{R}$ and $\mathbb{N}$, respectively. Let E be a real Banach space with norm $\|\cdot \|$ with its the dual space $E^{*}$. We denote $\langle u,j\rangle $ by the value of a functional j in $E^{*}$ at $u\in E$, that is, $\langle u,j\rangle =j(u)$ for all $u\in E$. We write $u_{n}\rightarrow x$ to indicate that a sequence $\{u_{n}\}$ converges strongly to u. Similarly, $u_{n}\rightharpoonup u$ and $u_{n}\rightharpoonup ^{*} u$ will symbolize the weak and weak^∗ convergence, respectively. Let $S_{E}=\{u\in E:\|u\|=1\}$ and $B_{E}=\{u\in E:\|u\|\leq 1\}$ be a unit sphere and unit ball of E, respectively.

Let $1< q\leq 2\leq p<\infty $ with $\frac{1}{p}+\frac{1}{q}=1$. The modulus of convexity of E is the function $\delta _{E}:[0,2]\rightarrow [0,1]$ defined by

$$\begin{aligned} \delta _{E}(\epsilon )=\inf \biggl\{ 1-\frac{ \Vert u+v \Vert }{2}:u,v\in B_{E}, \Vert u-v \Vert \geq \epsilon \biggr\} . \end{aligned}$$

The modulus of smoothness of E is the function $\rho _{E}:[0,\infty )\rightarrow [0,\infty )$ defined by

$$\begin{aligned} \rho _{E}(t)=\sup \biggl\{ \frac{ \Vert u+tv \Vert + \Vert u-tv \Vert }{2}-1:u,v\in S_{E} \biggr\} . \end{aligned}$$

Definition 2.1

A Banach space E is said to be:

(1)
strictly convex if $\frac{\|u+v\|}{2}<1$ for all $u,v\in S_{E}$ and $u\neq v$;
(2)
smooth if $\lim_{t\rightarrow 0}\frac{\|u+tv\|-\|u\|}{t}$ exists for each $u,v\in S_{E}$;
(3)
uniformly convex if $\delta _{E}(\epsilon )>0$ for all $\epsilon \in (0,2]$;
(4)
p-uniformly convex if there is a $\kappa _{p} > 0$ such that $\delta _{E}(\epsilon )\geq \kappa _{p}\epsilon ^{p}$ for all $\epsilon \in (0,2]$;
(5)
uniformly smooth if $\lim_{t\rightarrow 0}\frac{\rho _{E}(t)}{t}=0$;
(6)
q-uniformly smooth if there exists a $\kappa _{q} > 0$ such that $\rho _{E}(t)\leq \kappa _{q} t^{q}$ for all $t>0$.

Remark 2.2

It is known that if E is uniformly convex, then E is reflexive and strictly convex; if E is uniformly smooth, then E is reflexive and smooth (see [2]). From the Definition 2.1, one can see that every p-uniformly convex (q-uniformly smooth) space is a uniformly convex (uniformly smooth) space. Moreover, it is also known that E is p-uniformly convex (q-uniformly smooth) if and only if $E^{*}$ is q-uniformly smooth (p-uniformly convex) (see [2, 49]).

For the Lebesgue spaces $L_{p}$, sequence spaces $l_{p}$, and Sobolev spaces $W_{p}^{m}$, it is also known that [23, 50]

$$\begin{aligned} \begin{aligned} L_{p} (l_{p}) \text{ or } W_{p}^{m} \text{ is } \textstyle\begin{cases} p\text{-uniformly smooth}, 2\text{-uniformly convex} & \text{for } 1< p \leq 2, \\ p\text{-uniformly convex}, 2\text{-uniformly smooth} & \text{for } 2\leq p< \infty . \end{cases}\displaystyle \end{aligned} \end{aligned}$$

(2.1)

For $p>1$. The mapping $J_{p}:E\rightarrow 2^{E^{*}}$ defined by

$$\begin{aligned} J_{p}(u)=\bigl\{ u^{*}\in E^{*}:\bigl\langle u,u^{*}\bigr\rangle = \Vert u \Vert ^{p}, \bigl\Vert u^{*} \bigr\Vert = \Vert u \Vert ^{p-1}\bigr\} \end{aligned}$$

is called the generalized duality mapping of E. In particular, $J_{2}=J$ is called the normalized duality mapping and if E is a Hilbert space, then $J_{p}=I$, where I is the identity mapping. The duality mapping $J_{p}$ of a smooth Banach space E is said to be weakly sequentially continuous if for any sequence $\{u_{n}\}\subset E$ such that $u_{n}\rightharpoonup u$ implies $J_{p}(u_{n})\rightharpoonup ^{*} J_{p}(u)$. For the generalized duality mapping, the following facts are known [2, 18, 34]:

(i)
$J_{p}$ is homogeneous degree $p-1$, that is, $J_{p}(\alpha u)=|\alpha |^{p-1}\operatorname{sign}(\alpha )J_{p}(u)$ for all $u\in E$, $\alpha \in \mathbb{R}$. In particular, $J_{p}(-u)=-J_{p}(u)$ for all $u\in E$.
(ii)
If E is smooth, then $J_{p}$ is monotone, that is, $\langle u-v,J_{p}(u)-J_{p}(v)\rangle \geq 0$ for all $u,v\in E$. Moreover, if E is strictly convex, then $J_{p}$ is strictly monotone.
(iii)
If E is uniformly smooth, then $J_{p}$ is single valued from E into $E^{*}$ and it is uniformly continuous on bounded subsets of E.
(iv)
If E is reflexive, smooth, and strictly convex, then the inverse $J_{p}^{-1}=J_{q}^{*}$ is single valued, one-to-one, and surjective, where $J_{q}^{*}$ is the duality mapping from $E^{*}$ into E.

Lemma 2.3

([49]) If E is a q-uniformly smooth Banach space, then there is a constant $\kappa _{q}>0$ such that

$$\begin{aligned} \Vert u-v \Vert ^{q}\leq \Vert u \Vert ^{q}-q \bigl\langle v,J_{q}(u)\bigr\rangle +\kappa _{q} \Vert v \Vert ^{q},\quad \forall u,v\in E, \end{aligned}$$

where $\kappa _{q}$ is called the q-uniform smoothness coefficient of E.

Remark 2.4

The exact values of the constant $\kappa _{q}$ can be found in [35, 50].

We next recall the definition of Bregman distance. Let E be a real smooth Banach space and f be a convex and Gâteaux differentiable function on E. The bifunction $D_{f}:E\times E\rightarrow [0,\infty )$ defined by

$$\begin{aligned} D_{f}(u,v)=f(u)-f(v)-\bigl\langle u-v,\nabla f(v)\bigr\rangle \end{aligned}$$

is called the Bregman distance with respect to f. Note that the Bregman distance is not a metric due to its lack of symmetry and failure to satisfy the triangle inequality. If $f_{p}(x)=\frac{1}{p}\|x\|^{p}$ for $p>1$, then $\nabla f=J_{p}$. Hence, we have the Bregman distance with respect to $f=f_{p}$ given by

$$\begin{aligned} D_{f_{p}}(u,v) =&\frac{1}{p} \Vert u \Vert ^{p}-\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle u-v,J_{p}(v) \bigr\rangle \\ =&\frac{1}{p} \Vert u \Vert ^{p}+ \frac{1}{q} \Vert v \Vert ^{p}-\bigl\langle u,J_{p}(v) \bigr\rangle . \end{aligned}$$

Moreover, if $p=2$, then $2D_{f_{2}}(u,v)=\|u\|^{2}-\|v\|^{2}-2\langle u,J(v)\rangle =\phi (u,v)$, where ϕ is called the Lyapunov function studied in [4, 31]. Also, if E is a Hilbert space, then $\phi (u,v)=\|u-v\|^{2}$. The following properties of the Bregman distance are well known: for each $u,v,w\in E$,

$$\begin{aligned} D_{f_{p}}(u,v)=D_{f_{p}}(u,w)-D_{f_{p}}(v,w)+ \bigl\langle u-v,J_{p}(w)-J_{p}(v) \bigr\rangle \end{aligned}$$

(2.2)

and

$$\begin{aligned} D_{f_{p}}(u,v)+D_{f_{p}}(v,u)=\bigl\langle u-v,J_{p}(u)-J_{p}(v)\bigr\rangle . \end{aligned}$$

(2.3)

For a p-uniformly convex space, it holds that [36]

$$\begin{aligned} \tau \Vert u-v \Vert ^{p}\leq D_{f_{p}}(u,v)\leq \bigl\langle u-v,J_{p}(u)-J_{p}(v) \bigr\rangle , \end{aligned}$$

(2.4)

where $\tau >0$ is some fixed number.

Also, we define a function $V_{f_{p}}: E\times E^{*}\rightarrow [0,\infty )$ by

$$ V_{f_{p}}\bigl(u,u^{*}\bigr)=\frac{1}{p} \Vert u \Vert ^{p}-\bigl\langle u,u^{*}\bigr\rangle + \frac{1}{q} \bigl\Vert u^{*} \bigr\Vert ^{q} $$

for all $u\in E$ and $u^{*}\in E^{*}$. Note that $V_{f_{p}}$ is nonnegative, convex in the second variable and $V_{f_{p}}(u,u^{*})=D_{f_{p}}(u,J_{q}(u^{*}))$ for all $u\in E$ and $u^{*}\in E^{*}$. Moreover, the following property is known:

$$\begin{aligned} V_{f_{p}}\bigl(u,u^{*}\bigr)+\bigl\langle J_{p}^{-1}\bigl(u^{*}\bigr)-u,v^{*} \bigr\rangle \leq V_{f_{p}}\bigl(u,u^{*}+v^{*} \bigr) \end{aligned}$$

(2.5)

for all $u\in E$ and $u^{*},v^{*}\in E^{*}$.

Let C be a nonempty, closed, and convex subset of a smooth, strictly convex, and reflexive Banach space. Then, for any $u\in E$, there exists a unique element $w\in C$ such that

$$\begin{aligned} \Vert u-w \Vert =\min_{v\in C} \Vert u-v \Vert . \end{aligned}$$

The mapping $P_{C}$ defined by $w=P_{C}(u)$ is called the metric projection of E onto C. We know the following property [36]:

$$\begin{aligned} \bigl\langle v-P_{C}(u),J_{p}\bigl(u-P_{C}(u) \bigr)\bigr\rangle \leq 0,\quad \forall v\in C. \end{aligned}$$

Recall that the Bregman projection with respect to $f_{p}$ is defined by

$$\begin{aligned} \Pi _{C}^{f_{p}}(u)=\mathop{\operatorname{argmin}}_{v\in C}D_{f_{p}}(u,v), \quad \forall u \in E. \end{aligned}$$

If $p=2$, then $\Pi _{C}^{f_{p}}$ becomes the generalized projection and denoted by $\Pi _{C}$. Also, in this case, if E is a Hilbert space, then $\Pi _{C}$ becomes the metric projection denoted by $P_{C}$. We also know the following property [11]:

$$\begin{aligned} \bigl\langle v-\Pi _{C}^{f_{p}}(u),J_{p}(u)-J_{p} \bigl(\Pi _{C}^{f_{p}}(u)\bigr) \bigr\rangle \leq 0,\quad \forall v\in C. \end{aligned}$$

(2.6)

Let C be a nonempty subset of E and $T:C \rightarrow C$ be a mapping. We denote the fixed-point set of T by $F(T)=\{u\in C: u=Tu\}$. Let $A: E\rightarrow 2^{E^{*}}$ be a set-valued mapping. The domain of A is denoted by $\mathcal{D}(A)=\{u\in E : Au\neq \emptyset \}$ and the range of A is also denoted by $\mathcal{R}(A)=\bigcup \{Au:u\in \mathcal{D}(A)\}$. The set of zeros of A is defined by $A^{-1}0=\{u\in \mathcal{D}(A):0\in Au\}$. It is known that $A^{-1}0$ is closed and convex (see [40]). A set-valued mapping A is said to be monotone if

$$\begin{aligned} \langle x-y,u-v\rangle \geq 0,\quad \forall x,y\in \mathcal{D}(A), u\in Ax \text{ and } v\in Ay. \end{aligned}$$

A monotone operator A on E is said to be maximal if its graph is not properly contained in the graph of any other monotone operator on E.

Let E be a p-uniformly convex and uniformly smooth Banach space and $A: E\rightarrow 2^{E^{*}}$ be a maximal monotone operator. Following [10], for each $u\in E$ and $\tau >0$, we define the resolvent of A by

$$\begin{aligned} R_{\tau}(u)=(J_{p}+\tau A)^{-1}J_{p}(u), \quad \forall u\in E. \end{aligned}$$

One can see that $A^{-1}0=F(R_{\tau})$ for $\tau >0$. We also know the following property [26]:

$$\begin{aligned} D_{f_{p}} \bigl(v,R_{\tau}(u)\bigr)+ D_{f_{p}} \bigl(R_{\tau}(u),u\bigr)\leq D_{f_{p}} (v,u) \end{aligned}$$

(2.7)

for all $u\in E$ and $v\in A^{-1}0$.

For each $u\in E$ and $\mu >0$, we define the metric resolvent of A for $\mu >0$ by

$$\begin{aligned} Q_{\mu}(u)=\bigl(I+\mu J_{p}^{-1}A \bigr)^{-1}(u),\quad \forall u\in E. \end{aligned}$$

(2.8)

It is clear that in a Hilbert space, the metric resolvent operator is equivalent to the resolvent operator. From (2.8), one can see that $0\in J_{p}(Q_{\mu}(u)-u)+\mu AQ_{\mu}(u)$ and $A^{-1}0=F(Q_{\mu})$ for $\mu >0$. The monotonicity of A implies that

$$\begin{aligned} \bigl\langle Q_{\mu}(u)-Q_{\mu}(v),J_{p} \bigl(u-Q_{\mu}(u)\bigr)-J_{p}\bigl(v-Q_{\mu}(v) \bigr) \bigr\rangle \geq 0 \end{aligned}$$

(2.9)

for all $u, v \in E$. If $A^{-1}0\neq \emptyset $, then

$$\begin{aligned} \bigl\langle Q_{\mu}(u)-v,J_{p} \bigl(u-Q_{\mu}(u)\bigr)\bigr\rangle \geq 0 \end{aligned}$$

(2.10)

for all $u\in E$ and $v\in A^{-1}0$ (see [9]). For any sequence $\{x_{n}\}$ in E, we see that

$$\begin{aligned} \Vert x_{n}-v \Vert \bigl\Vert x_{n}-Q_{\mu}(x_{n}) \bigr\Vert ^{p-1} \geq &\bigl\langle x_{n}-v,J_{p} \bigl(x_{n}-Q_{ \mu}(x_{n})\bigr)\bigr\rangle \\ \geq &\bigl\langle x_{n}-Q_{\mu}(x_{n}),J_{p} \bigl(x_{n}-Q_{\mu}(x_{n})\bigr) \bigr\rangle \\ =& \bigl\Vert x_{n}-Q_{\mu}(x_{n}) \bigr\Vert ^{p}. \end{aligned}$$

(2.11)

This implies that $\|x_{n}-Q_{\mu}(x_{n})\|\leq \|x_{n}-v\|$. If $\{x_{n}\}$ is bounded, then $\{x_{n}-Q_{\mu}(x_{n})\}$ is also bounded.

Let E be a p-uniformly convex and uniformly smooth Banach space and $f:E\rightarrow \mathbb{R}\rightarrow (-\infty ,+\infty ]$ be a proper, convex, and lower semicontinuous function. The subdifferential of f at x is defined by

$$\begin{aligned} \partial f(x)=\bigl\{ z\in E^{*}: f(x)+\langle y-x,z\rangle \leq f(y), \forall y\in E\bigr\} . \end{aligned}$$

Let C be a closed and convex subset of E. The indicator function $\delta _{C}$ of C at x is defined by

$$ \delta _{C}(x)= \textstyle\begin{cases} 0,& \text{if }x\in C, \\ +\infty , &\text{if } x\notin C. \end{cases} $$

The subdifferentiable $\partial \delta _{C}$ is a maximal monotone operator since $\delta _{C}$ is a proper, convex, and lower semicontinuous function (see [33]). Moreover, we also know that

$$\begin{aligned} \partial \delta _{C}(x)=N_{C}(x)=\bigl\{ z\in E^{*}:\langle y-x,z\rangle \leq 0, \forall y\in C\bigr\} , \end{aligned}$$

where $N_{C}$ is the normal cone of C. In particular, if we define the resolvent of $\partial \delta _{C}$ for $\tau >0$ by $R_{\tau}(u)=(J_{p}+\tau \partial \delta _{C})^{-1}J_{p}(u)$ for all $u\in E$, then $R_{\tau}=\Pi _{C}^{f_{p}}$, where $\Pi _{C}^{f_{p}}$ is the Bregman projection with respect to $f_{p}$ (see [48]). Moreover, we also have $(\partial \delta _{C})^{-1}0=C$. Also, if we define the metric resolvent of $\partial \delta _{C}$ for $\mu >0$ by $Q_{\mu}(u)=(I+\mu J_{p}^{-1}\partial \delta _{C})^{-1}(u)$ for all $u\in E$, then

$$\begin{aligned} z=Q_{\mu}(u) \quad \Leftrightarrow\quad &0\in J_{p}(z-u)+\mu Az \\ \quad \Leftrightarrow \quad &\frac{J_{p}(u-z)}{\mu}\in Az=\partial \delta _{C}(z)=N_{C}(z) \\ \quad \Leftrightarrow \quad &\bigl\langle y-z,J_{p}(u-z)\bigr\rangle \leq 0,\quad \forall y\in C \\ \quad \Leftrightarrow \quad &z=P_{C}(u), \end{aligned}$$

where $P_{C}$ is the metric projection of E onto C.

Throughout this paper, we adopt the notation $[a]_{+}:=\max \{a,0\}$, where $a\in \mathbb{R}$.

Lemma 2.5

([6]) Let $\{\varphi _{n}\}$, $\{\alpha _{n}\}$, and $\{\beta _{n}\}$ be three nonnegative real sequences such that

$$ \varphi _{n+1}\leq \varphi _{n}+\alpha _{n}(\varphi _{n}-\varphi _{n-1})+ \beta _{n},\quad \forall n\geq 1, $$

with $\sum_{n=1}^{\infty}\beta _{n}<\infty $ and there exists a real number α such that $0\leq \alpha _{n}\leq \alpha <1$ for all $n\in \mathbb{N}$. Then, the following results hold:

(i)
$\sum_{n=1}^{\infty}[\varphi _{n}-\varphi _{n-1}]_{+}<\infty $;
(ii)
There exists $\varphi ^{*}\in [0,\infty )$ such that $\lim_{n\rightarrow \infty}\varphi _{n}=\varphi ^{*}$.

Lemma 2.6

([42]) Assume that $\{s_{n}\}$ and $\{t_{n}\}$ are two nonnegative real sequences such that $s_{n+1}\leq s_{n}+t_{n}$ for all $n\geq 1$. If $\sum_{n=1}^{\infty}t_{n}<\infty $, then $\lim_{n\rightarrow \infty}s_{n}$ exists.

3 Main results

In this paper, we propose two weakly convergent inertial self-adaptive algorithms to solve the split common null-point problem in Banach spaces. In what follows, we denote $J_{p}^{E}$ and $J_{q}^{E^{*}}$ by the generalized duality mapping of E into $E^{*}$ and the duality mapping of $E^{*}$ into E, respectively, where $1< q\leq 2\leq p<\infty $ with $\frac{1}{p}+\frac{1}{q}=1$.

In order to prove the results, the following assumptions are needed in the following.

(A1)
Let E be a p-uniformly convex and uniformly smooth Banach space and F be a uniformly convex and smooth Banach space.
(A2)
Let $A:E\rightarrow 2^{E^{*}}$ and $B:F\rightarrow 2^{F^{*}}$ be maximal monotone operators.
(A3)
Let $T : E \rightarrow F$ be a bounded linear operator with $T\neq 0$ and $T^{*} : F^{*} \rightarrow E^{*}$ be the adjoint operator of T.
(A4)
Let $R_{\tau}$ be a resolvent operator associated with A for $\tau >0$ and $Q_{\mu}$ be a metric resolvent associated with B for $\mu >0$.
(A5)
The set solution $\Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\neq \emptyset $.

The following conditions are also assumed:

(C1)
Let $\{\alpha _{n}\}\subset (0,1]$ with $\liminf_{n\rightarrow \infty}\alpha _{n}>0$;
(C2)
Let $\{\mu _{n}\}\subset (0,\infty )$ with $\liminf_{n\rightarrow \infty}\mu _{n}>0$.

The first algorithm is stated as follows:

Algorithm 1

(Inertial self-adaptive algorithm for the split common null-point problem)

Step 0. Given $\tau _{1}>0$, $\beta \in (0,1)$ and $\mu \in (0,\frac{q}{\kappa _{q}} )$. Choose $\{s_{n}\}\subset [0,\infty )$ such that $\sum_{n=1}^{\infty}s_{n}<\infty $ and $\{\beta _{n}\}\subset (0,\infty )$ such that $\sum_{n=1}^{\infty}\beta _{n}<\infty $. Let $x_{0},x_{1}\in E$ be arbitrary and calculate $x_{n+1}$ as follows:
Step 1. Given the iterates $x_{n-1}$ and $x_{n}$ ($n\geq 1$). Choose $\theta _{n}$ such that $0\leq \theta _{n}\leq \bar{\theta}_{n}$, where
$$\begin{aligned} \begin{aligned} \bar{\theta}_{n}= \textstyle\begin{cases} \min \{\beta ,\frac {\beta _{n}}{ \Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \Vert ^{q}},\frac {\beta _{n}}{D_{f_{p}}(x_{n},x_{n-1})} \}, & \text{if } x_{n}\neq x_{n-1},\\ \beta , &\text{otherwise.} \end{cases}\displaystyle \end{aligned} \end{aligned}$$
(3.1)
Step 2. Compute
$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}))),\\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}),\\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}(R_{\tau _{n}}y_{n})), \end{cases} $$
(3.2)
where
$$ \begin{aligned} \tau _{n+1}= \textstyle\begin{cases} \min \{ (\frac {\mu \Vert (I-Q_{\mu _{n}})Tu_{n}) \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \Vert ^{q}} )^{\frac{1}{q-1}},\tau _{n}+s_{n} \} ,& \text{if } T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}\neq 0,\\ \tau _{n}+s_{n}, &\text{otherwise}. \end{cases}\displaystyle \end{aligned} $$
(3.3)

Remark 3.1

If $x_{n+1}=y_{n}=u_{n}$ for some n, then $y_{n}$ is a solution in Ω. Indeed, if $y_{n}=u_{n}$, we see that $y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(y_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{ \mu _{n}})Tu_{n})$ for $\tau _{n}>0$. This implies that $(I-Q_{\mu _{n}})Ty_{n}=0$, that is, $Ty_{n}=Q_{\mu _{n}}Ty_{n}$. In addition, if $x_{n+1}=y_{n}$, then $y_{n}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(y_{n})+\alpha _{n}J_{p}^{E}(R_{ \tau _{n}}y_{n}))$. This implies that $y_{n}=R_{\tau _{n}}y_{n}$. Now, since $y_{n}=R_{\tau _{n}}y_{n}$ and $Ty_{n}=Q_{\mu _{n}}Ty_{n}$, we have $y_{n}\in A^{-1}0$ and $y_{n}\in T^{-1}(B^{-1}0)$. Therefore, $y_{n}\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)$.

Remark 3.2

From (3.1), we observe that $0\leq \theta _{n}\leq \beta <1$ for all $n\geq 1$. Also, we obtain $\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|^{q}\leq \beta _{n}$ and $\theta _{n}D_{f_{p}}(x_{n},x_{n-1})\leq \beta _{n}$ for all $n\geq 1$. Since $\sum_{n=1}^{\infty}\beta _{n}<\infty $, we have

$$\begin{aligned} \sum_{n=1}^{\infty}\theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q}< \infty \quad \text{and}\quad \sum _{n=1}^{\infty}\theta _{n}D_{f_{p}}(x_{n},x_{n-1})< \infty . \end{aligned}$$

(3.4)

Lemma 3.3

Let $\{\tau _{n}\}$ be a sequence generated by (3.3). Then, we have $\lim_{n\rightarrow \infty}\tau _{n}=\tau $, where

$$\begin{aligned} \tau \in \biggl[\min \biggl\{ \biggl(\frac{\mu}{ \Vert T \Vert ^{q}} \biggr)^{ \frac{1}{q-1}}, \tau _{1} \biggr\} ,\tau _{1}+s \biggr]\quad \textit{and}\quad s=\sum_{n=1}^{ \infty}s_{n}. \end{aligned}$$

Proof

In the case of $T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}\neq 0$, we see that

$$\begin{aligned} & \frac{\mu \Vert (I-Q_{\mu _{n}})Tu_{n} \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \Vert ^{q}} \\ &\quad \geq \frac{\mu \Vert (I-Q_{\mu _{n}})Tu_{n} \Vert ^{p}}{ \Vert T^{*} \Vert ^{q} \Vert J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \Vert ^{q}} \\ &\quad = \frac{\mu \Vert (I-Q_{\mu _{n}})Tu_{n} \Vert ^{p}}{ \Vert T \Vert ^{q} \Vert (I-Q_{\mu _{n}})Tu_{n} \Vert ^{(p-1)q}} \\ &\quad =\frac{\mu}{ \Vert T \Vert ^{q}}. \end{aligned}$$

(3.5)

By the definition of $\tau _{n}$ and induction, we have

$$ \tau _{n}\leq \tau _{n-1}+s_{n-1}\leq \tau _{n-2}+s_{n-2}+s_{n-1} \leq \cdots \leq \tau _{1}+\sum_{i=1}^{n-1}s_{i} \leq \tau _{1}+\sum_{n=1}^{ \infty}s_{n}. $$

Thus, $\tau _{n}\leq \tau _{1}+\sum_{n=1}^{\infty}s_{n}$ for all $n\geq 1$. From (3.5), we see that

$$\begin{aligned} \tau _{n+1} =&\min \biggl\{ \biggl( \frac {\mu \Vert (I-Q_{\mu _{n}})Tu_{n}) \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \Vert ^{q}} \biggr)^{\frac{1}{q-1}},\tau _{n}+s_{n} \biggr\} \\ \geq &\min \biggl\{ \biggl(\frac{\mu}{ \Vert T \Vert ^{q}} \biggr)^{\frac{1}{q-1}}, \tau _{n}+s_{n} \biggr\} \\ \geq &\min \biggl\{ \biggl(\frac{\mu}{ \Vert T \Vert ^{q}} \biggr)^{\frac{1}{q-1}}, \tau _{n} \biggr\} \\ \geq &\cdots \geq \min \biggl\{ \biggl(\frac{\mu}{ \Vert T \Vert ^{q}} \biggr)^{ \frac{1}{q-1}},\tau _{1} \biggr\} . \end{aligned}$$

Hence $\tau _{n}\geq \min \{ (\frac{\mu}{\|T\|^{q}} )^{ \frac{1}{q-1}},\tau _{1} \}$ for all $n\geq 1$. Therefore, $\min \{ (\frac{\mu}{\|T\|^{q}} )^{\frac{1}{q-1}},\tau _{1} \}\leq \tau _{n}\leq \tau _{1}+\sum_{n=1}^{\infty}s_{n}$ for all $n\geq 1$. Since $\tau _{n+1}\leq \tau _{n}+s_{n}$ for all $n\geq 1$, we have $\lim_{n\rightarrow \infty}\tau _{n}$ exists by Lemma 2.6. In this case, we denote $\tau =\lim_{n\rightarrow \infty}\tau _{n}$. Obviously, $\tau \in [\min \{ (\frac{\mu}{\|T\|^{q}} )^{ \frac{1}{q-1}},\tau _{1} \}, \tau _{1}+s ]$. □

Remark 3.4

The adaptive step size $\tau _{n}$ generated by (3.3) is different from many adaptive step sizes as studied in [43, 44]. Note that $\tau _{n}$ is allowed to increase when the iteration increases. Therefore, it reduces the dependence on the initial step size $\tau _{1}$. Since $\sum_{n=1}^{\infty}s_{n}<\infty $, we have $\lim_{n\rightarrow \infty}s_{n}=0$. As a result, $\tau _{n}$ may not increase when n is large.

Lemma 3.5

Let $\{x_{n}\}$ be a sequence generated by Algorithm 1. Then, for each $n\geq 1$, the following inequality holds for all $v\in \Omega $:

$$\begin{aligned} D_{f_{p}}(v,x_{n+1})\leq D_{f_{p}}(v,x_{n})+ \theta _{n}\bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \xi _{n}(p,q)-\delta _{n}(p,q), \end{aligned}$$

where $\xi _{n}(p,q):=\theta _{n}D_{f_{p}}(x_{n},x_{n-1})+ \frac{\kappa _{q}\theta _{n}^{q}}{q}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \|^{q}$, $\delta _{n}(p,q):=\alpha _{n}\tau _{n} (1- \frac{\kappa _{q}\mu}{q} (\frac{\tau _{n}}{\tau _{n+1}} )^{q-1} )\|w_{n}\|^{p}+\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n})$ and $w_{n}:=Tu_{n}-Q_{\mu _{n}}Tu_{n}$.

Proof

Let $v\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)$. From Lemma 2.3, we have

$$\begin{aligned} D_{f_{p}}(v,y_{n}) =&V_{f_{p}} \bigl(v,J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{ \mu _{n}})Tu_{n} \bigr) \\ =&\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(w_{n}) \bigr\rangle +\frac{1}{q} \bigl\Vert J_{p}^{E}(u_{n})- \tau _{n}T^{*}J_{p}^{F}(w_{n}) \bigr\Vert ^{q} \\ =&\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(u_{n})\bigr\rangle +\tau _{n} \bigl\langle Tv,J_{p}^{F}(w_{n}) \bigr\rangle +\frac{1}{q} \bigl\Vert J_{p}^{E}(u_{n})- \tau _{n}T^{*}J_{p}^{F}(w_{n}) \bigr\Vert ^{q} \\ \leq &\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(u_{n})\bigr\rangle +\tau _{n} \bigl\langle Tv,J_{p}^{F}(w_{n}) \bigr\rangle +\frac{1}{q} \bigl\Vert J_{p}^{E}(u_{n}) \bigr\Vert ^{q}- \tau _{n}\bigl\langle u_{n},T^{*}J_{p}^{F}(w_{n}) \bigr\rangle \\ &{}+ \frac{\kappa _{q}\tau _{n}^{q}}{q} \bigl\Vert T^{*}J_{p}^{F}(w_{n}) \bigr\Vert ^{q} \\ =&\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(u_{n})\bigr\rangle + \frac{1}{q} \Vert u_{n} \Vert ^{q}+\tau _{n}\bigl\langle Tv-Tu_{n},J_{p}^{F}(w_{n}) \bigr\rangle +\frac{\kappa _{q}\tau _{n}^{q}}{q} \bigl\Vert T^{*}J_{p}^{F}(w_{n}) \bigr\Vert ^{q} \\ =&D_{f_{p}}(v,u_{n})+\tau _{n}\bigl\langle Tv-Tu_{n},J_{p}^{F}(w_{n}) \bigr\rangle +\frac{\kappa _{q}\tau _{n}^{q}}{q} \bigl\Vert T^{*}J_{p}^{F}(w_{n}) \bigr\Vert ^{q}. \end{aligned}$$

(3.6)

Note that $w_{n}:=Tu_{n}-Q_{\mu _{n}}Tu_{n}$ and $v\in A^{-1}0\cap T^{-1}(B^{-1}0)$, we have $v\in A^{-1}0$ and $Tv\in B^{-1}0$. It then follows from (2.10) that

$$\begin{aligned} \bigl\langle Tv-Tu_{n},J_{p}^{F}(w_{n}) \bigr\rangle =& \underbrace{\bigl\langle Tv-Q_{\mu _{n}}Tu_{n},J_{p}^{F}(w_{n}) \bigr\rangle }_{ \leq 0}-\bigl\langle Tu_{n}-Q_{\mu _{n}}Tu_{n},J_{p}^{F}(w_{n}) \bigr\rangle \\ \leq &- \Vert Tu_{n}-Q_{\mu _{n}}Tu_{n} \Vert ^{p}=- \Vert w_{n} \Vert ^{p}. \end{aligned}$$

(3.7)

From the definition of $\tau _{n+1}$, we have

$$\begin{aligned} \bigl\Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \bigr\Vert ^{q}\leq \frac{\mu}{\tau _{n+1}^{q-1}} \bigl\Vert (I-Q_{\mu _{n}})Tu_{n} \bigr\Vert ^{p}. \end{aligned}$$

(3.8)

Combining (3.6), (3.7), and (3.8), we obtain

$$\begin{aligned} D_{f_{p}}(v,y_{n})\leq D_{f_{p}}(v,u_{n})-\tau _{n} \biggl(1- \frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}. \end{aligned}$$

(3.9)

Now, we estimate $D_{f_{p}}(v,u_{n})$. From Lemma 2.3, we have

$$\begin{aligned} D_{f_{p}}(v,u_{n}) =&V_{f_{p}} \bigl(v,J_{p}^{E}(x_{n})+\theta _{n}\bigl(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr)\bigr) \\ =&\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(x_{n})+\theta _{n} \bigl(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr) \bigr\rangle \\ &{}+\frac{1}{q} \bigl\Vert J_{p}^{E}(x_{n})+ \theta _{n}\bigl(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr) \bigr\Vert ^{q} \\ \leq &\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(x_{n})\bigr\rangle - \theta _{n}\bigl\langle v,J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle \\ &{}+ \frac{1}{q} \bigl\Vert J_{p}^{E}(x_{n}) \bigr\Vert ^{q}+\theta _{n}\bigl\langle x_{n},J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle +\frac{\kappa _{q}}{q} \bigl\Vert \theta _{n} \bigl(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr) \bigr\Vert ^{q} \\ =&\frac{1}{p} \Vert v \Vert ^{p}-\bigl\langle v,J_{p}^{E}(x_{n})\bigr\rangle + \frac{1}{q} \Vert x_{n} \Vert ^{p}+ \frac{\kappa _{q}\theta _{n}^{q}}{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q} \\ &{}+\theta _{n}\bigl\langle x_{n}-v,J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle \\ =&D_{f_{p}}(v,x_{n})+\frac{\kappa _{q}\theta _{n}^{q}}{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q}+\theta _{n}\bigl\langle x_{n}-v,J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle . \end{aligned}$$

(3.10)

We observe that

$$\begin{aligned} \theta _{n}\bigl\langle x_{n}-v,J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle =\theta _{n}D_{f_{p}}(v,x_{n})- \theta _{n}D_{f_{p}}(v,x_{n-1})+ \theta _{n}D_{f_{p}}(x_{n},x_{n-1}). \end{aligned}$$

(3.11)

Combining (3.10) and (3.11), we obtain

$$\begin{aligned} D_{f_{p}}(v,u_{n}) \leq &D_{f_{p}}(v,x_{n})+\theta _{n} \bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \theta _{n}D_{f_{p}}(x_{n},x_{n-1}) \\ &{}+\frac{\kappa _{q}\theta _{n}^{q}}{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q}. \end{aligned}$$

(3.12)

Then, from (2.7) and (3.10), we obtain

$$\begin{aligned} D_{f_{p}}(v,x_{n+1}) \leq &(1-\alpha _{n})D_{f_{p}}(v,u_{n})+\alpha _{n}D_{f_{p}}(v,R_{ \tau _{n}}y_{n}) \\ \leq &(1-\alpha _{n})D_{f_{p}}(v,u_{n})+ \alpha _{n} D_{f_{p}}(v,y_{n})- \alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ \leq &(1-\alpha _{n})D_{f_{p}}(v,u_{n})+ \alpha _{n} \biggl[D_{f_{p}}(v,u_{n})- \tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p} \biggr] \\ &{}- \alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ =&D_{f_{p}}(v,u_{n})-\alpha _{n}\tau _{n} \biggl(1- \frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ \leq &D_{f_{p}}(v,x_{n})+\theta _{n} \bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \theta _{n}D_{f_{p}}(x_{n},x_{n-1}) \\ &{}+ \frac{\kappa _{q}\theta _{n}^{q}}{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q} \\ &{}-\alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}). \end{aligned}$$

(3.13)

From the definitions of $\xi _{n}(p,q)$ and $\delta _{n}(p,q)$, then (3.13) can be written in a short form as follows:

$$\begin{aligned} D_{f_{p}}(v,x_{n+1})\leq D_{f_{p}}(v,x_{n})+ \theta _{n}\bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \xi _{n}(p,q)-\delta _{n}(p,q). \end{aligned}$$

Thus, this lemma is proved. □

Theorem 3.6

Let $\{x_{n}\}$ be a sequence generated by Algorithm 1. Suppose, in addition, that $J_{p}^{E}$ is weakly sequentially continuous on E. Then, $\{x_{n}\}$ converges weakly to a point in Ω.

Proof

Using the fact that $\lim_{n\rightarrow \infty}\tau _{n}$ exists and $\mu \in (0,\frac{q}{\kappa _{q}} )$, we have

$$\begin{aligned} \lim_{n\rightarrow \infty} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr)=\frac{\mu}{q} \biggl( \frac{q}{\mu}-\kappa _{q} \biggr)>0. \end{aligned}$$

Then, there exists $n_{0}\in \mathbb{N}$ such that

$$\begin{aligned} 1-\frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1}>0, \quad \forall n\geq n_{0}, \end{aligned}$$

and, in consequence,

$$\begin{aligned} \delta _{n}(p,q):=\alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}+ \alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n})>0, \quad \forall n\geq n_{0}. \end{aligned}$$

Then, from Lemma 3.5, we can deduce that

$$\begin{aligned} D_{f_{p}}(v,x_{n+1})\leq D_{f_{p}}(v,x_{n})+ \theta _{n}\bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \xi _{n}(p,q). \end{aligned}$$

Since $\xi _{n}(p,q):=\theta _{n}D_{f_{p}}(x_{n},x_{n-1})+ \frac{\kappa _{q}\theta _{n}^{q}}{q}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \|^{q}$, it follows from (3.4) that

$$\begin{aligned} \sum_{n=1}^{\infty}\xi _{n}(p,q)\leq \sum_{n=1}^{\infty} \biggl( \theta _{n}D_{f_{p}}(x_{n},x_{n-1})+ \frac{\kappa _{q}\theta _{n}}{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q} \biggr)< \infty . \end{aligned}$$

(3.14)

From (3.14), we also have $\lim_{n\rightarrow \infty}\xi _{n}(p,q)=0$. From Lemma 2.5, we can conclude that $\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})$ exists and

$$\begin{aligned} \sum_{n=1}^{\infty}\bigl[D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr]_{+}< \infty . \end{aligned}$$

Thus, we have $\{D_{f_{p}}(v,x_{n})\}$ is bounded and so $\{x_{n}\}$ is also bounded by (2.4). Moreover, we obtain

$$\begin{aligned} \lim_{n\rightarrow \infty}\bigl[D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr]_{+}=0. \end{aligned}$$

(3.15)

From Lemma 3.5, we see that

$$\begin{aligned} \delta _{n}(p,q)\leq D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n+1})+ \theta _{n}\bigl(D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n-1}) \bigr)+ \xi _{n}(p,q). \end{aligned}$$

Since $\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})$ exists and $\lim_{n\rightarrow \infty}\xi _{n}(p,q)=0$, we have

$$\begin{aligned} \lim_{n\rightarrow \infty}\delta _{n}(p,q)=\lim _{n\rightarrow \infty} \biggl[\alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}+\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \biggr]=0. \end{aligned}$$

Consequently,

$$\begin{aligned} \lim_{n\rightarrow \infty} \Vert w_{n} \Vert =\lim _{n\rightarrow \infty} \Vert Tu_{n}-Q_{ \mu _{n}}Tu_{n} \Vert =0\quad \text{and}\quad \lim_{n\rightarrow \infty}D_{f_{p}}(y_{n},R_{ \tau _{n}}y_{n})=0. \end{aligned}$$

By the continuity of $J_{p}^{F}$, we have

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{F}(w_{n}) \bigr\Vert =\lim _{n\rightarrow \infty} \bigl\Vert J_{p}^{F}(Tu_{n}-Q_{\mu _{n}}Tu_{n}) \bigr\Vert =0. \end{aligned}$$

(3.16)

Moreover, we have

$$\begin{aligned} \lim_{n\rightarrow \infty} \Vert y_{n}-R_{\tau _{n}}y_{n} \Vert =0. \end{aligned}$$

(3.17)

By the definition of $y_{n}$, the continuity of $T^{*}$, and from (3.16), we have

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{E}(y_{n})-J_{p}^{E}(u_{n}) \bigr\Vert = \lim_{n\rightarrow \infty}\tau _{n} \bigl\Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \bigr\Vert =0. \end{aligned}$$

(3.18)

On the other hand, by the definition of $u_{n}$ and from (3.4), we obtain

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(x_{n}) \bigr\Vert ^{q} =& \lim_{n\rightarrow \infty}\theta _{n}^{q} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q} \\ \leq &\lim_{n\rightarrow \infty}\beta ^{q-1}\theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert ^{q} \\ =&0. \end{aligned}$$

Thus,

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(x_{n}) \bigr\Vert =0. \end{aligned}$$

(3.19)

From (3.19), we also have $\lim_{n\rightarrow \infty}\|u_{n}-x_{n}\|=0$ by the uniform continuity of $J_{q}^{E^{*}}$. Since $\{x_{n}\}$ is bounded, there exists a subsequence $\{x_{n_{k}}\}$ of $\{x_{n}\}$ such that $x_{n_{k}}\rightharpoonup w\in E$ and so $u_{n_{k}}\rightharpoonup w$. Put $v_{n}=R_{\tau _{n}}y_{n}$ for all $n\in \mathbb{N}$. From (3.17) and (3.18), we see that

$$\begin{aligned} \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(v_{n}) \bigr\Vert =& \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(R_{ \tau _{n}}y_{n}) \bigr\Vert \\ \leq & \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(y_{n}) \bigr\Vert + \bigl\Vert J_{p}^{E}(y_{n})-J_{p}^{E}(R_{ \tau _{n}}y_{n}) \bigr\Vert \\ \rightarrow &0. \end{aligned}$$

(3.20)

Consequently, $\lim_{n\rightarrow \infty}\|u_{n}-v_{n}\|=0$. Thus,

$$\begin{aligned} \Vert v_{n}-x_{n} \Vert \leq \Vert v_{n}-u_{n} \Vert + \Vert u_{n}-x_{n} \Vert \rightarrow 0. \end{aligned}$$

Using the above inequality, we also obtain $v_{n_{k}}\rightharpoonup w$. Since $R_{\tau _{n}}$ is the resolvent of A for $\tau _{n}>0$, we have

$$\begin{aligned} v_{n}=R_{\tau _{n}}\bigl(J_{q}^{E^{*}} \bigl(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(u_{n}) \bigr)\bigr) \quad \Leftrightarrow \quad &\bigl(J_{p}^{E}+\tau _{n}A\bigr)^{-1}\bigl(J_{p}^{E}(u_{n})- \tau _{n}T^{*}J_{p}^{F}(u_{n}) \bigr)) \\ \quad \Leftrightarrow\quad &J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(u_{n}) \in J_{p}^{E}(v_{n})+ \tau _{n} Av_{n} \\ \quad \Leftrightarrow\quad &\frac{1}{\tau _{n}} \bigl(J_{p}^{E}(u_{n})-J_{p}^{E}(v_{n})- \tau _{n}T^{*}J_{p}^{F}(w_{n}) \bigr)\in Av_{n}. \end{aligned}$$

Replacing n by $n_{k}$ and using the fact that A is monotone, thus

$$\begin{aligned} \biggl\langle s-v_{n_{k}},s^{*}-\frac{1}{\tau _{n_{k}}} \bigl(J_{p}^{E}(u_{n_{k}})-J_{p}^{E}(v_{n_{k}})- \tau _{n_{k}}T^{*}J_{p}^{F}(w_{n_{k}}) \bigr) \biggr\rangle \geq 0 \end{aligned}$$

for all $(s,s^{*})\in A$. Now, $T^{*}$ is continuous, which is due to the fact that $T^{*}$ is a bounded and linear operator. Then, from (3.16), (3.20), and $\lim_{k\rightarrow \infty}\tau _{n_{k}}=\tau >0$, we obtain $\langle s-w,s^{*}-0\rangle \geq 0$ for all $(s,s^{*})\in A$. Note that A is maximal monotone, thus $w\in A^{-1}0$. On the other hand, we know that T is also continuous. This fact, together with $\|u_{n_{k}}-x_{n_{k}}\|\rightarrow 0$ and $\|Tu_{n_{k}}-Q_{\mu _{n_{k}}}Tu_{n_{k}}\|\rightarrow 0$, means that we have $Tu_{n_{k}}\rightharpoonup Tw$ and $Q_{\mu _{n_{k}}}u_{n_{k}}\rightharpoonup Tw$. Since $Q_{\mu _{n}}$ is the metric resolvent of B for $\mu _{n}>0$, we have

$$\begin{aligned} \frac{J_{p}^{F}(Tu_{n}-Q_{\mu _{n}}Tu_{n})}{\mu _{n}}\in BQ_{\mu _{n}}Tu_{n} \end{aligned}$$

for all $n\in \mathbb{N}$. Replacing n by $n_{k}$, it then follows from the monotonicity of B that

$$\begin{aligned} \biggl\langle u-Q_{\mu _{n_{k}}}Tu_{n_{k}},u^{*}- \frac{J_{p}^{F}(Tu_{n_{k}}-Q_{\mu _{n_{k}}}Tu_{n_{k}})}{\mu _{n_{k}}} \biggr\rangle \geq 0 \end{aligned}$$

for all $(u,u^{*})\in B$. Then, from (3.16) and $\liminf_{k\rightarrow \infty}\mu _{n_{k}}>0$, we obtain $\langle u-Tw,u^{*}-0\rangle \geq 0$ for all $(u,u^{*})\in B$. Note that B is maximal monotone, thus $Tw\in B^{-1}0$ and so $w\in T^{-1}(B^{-1}0)$. We thus obtain $w\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)$. In order to prove the weak convergence of the sequence $\{x_{n}\}$, it is sufficient to show that $\{x_{n}\}$ has a unique weak limit point in Ω. In this case, we can assume that $\{x_{m_{k}}\}$ is another subsequence of $\{x_{n}\}$ such that $x_{m_{k}}\rightharpoonup w'\in \Omega $. Note that $x_{n_{k}}\rightharpoonup w\in \Omega $. Indeed, suppose by contradiction that with $w'\neq w$. Since $\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})$ exists for any $v\in \Omega $, it then follows from (2.2) and the weak sequential continuity of $J_{p}^{E}$ that

$$\begin{aligned} \lim_{n\rightarrow \infty}D_{f_{p}}(w,x_{n}) =& \lim_{k\rightarrow \infty}D_{f_{p}}(w,x_{m_{k}})=\liminf _{k\rightarrow \infty}D_{f_{p}}(w,x_{m_{k}}) \\ =&\liminf_{k\rightarrow \infty} \bigl(D_{f_{p}} \bigl(w,w'\bigr)+D_{f_{p}}\bigl(w',x_{m_{k}} \bigr)+ \bigl\langle w-w',J_{p}^{E} \bigl(w'\bigr)-J_{p}^{E}(x_{m_{k}}) \bigr\rangle \bigr) \\ \geq &\liminf_{k\rightarrow \infty}D_{f_{p}}\bigl(w,w' \bigr)+\liminf_{k \rightarrow \infty}D_{f_{p}}\bigl(w',x_{m_{k}} \bigr) \\ &{}+\liminf_{k\rightarrow \infty}\bigl\langle w-w',J_{p}^{E} \bigl(w'\bigr)-J_{p}^{E}(x_{m_{k}}) \bigr\rangle \\ >&\liminf_{k\rightarrow \infty}D_{f_{p}}\bigl(w',x_{m_{k}} \bigr) \\ =&\lim_{n\rightarrow \infty}D_{f_{p}}\bigl(w',x_{n} \bigr). \end{aligned}$$

(3.21)

In the same way as above, we can show that

$$\begin{aligned} \lim_{n\rightarrow \infty}D_{f_{p}}\bigl(w',x_{n} \bigr)>\lim_{n\rightarrow \infty}D_{f_{p}}(w,x_{n}), \end{aligned}$$

which is a contradiction with (3.21). Hence, $w=w'$ and therefore, the sequence $\{x_{n}\}$ converges weakly to a point in Ω. This finishes the proof. □

Next, we present a second algorithm that is slightly different from the first proposed algorithm.

Algorithm 2

(Relaxed inertial self-adaptive algorithm for the split common null-point problem)

Step 0. Given $\tau _{1}>0$, $\beta \in (0,1)$ and $\mu \in (0,\frac{q}{\kappa _{q}} )$. Choose $\{s_{n}\}\subset [0,\infty )$ such that $\sum_{n=1}^{\infty}s_{n}<\infty $ and $\{\beta _{n}\}\subset (0,\infty )$ such that $\sum_{n=1}^{\infty}\beta _{n}<\infty $. Let $x_{0},x_{1}\in E$ be arbitrary and calculate $x_{n+1}$ as follows:
Step 1. Given the iterates $x_{n-1}$ and $x_{n}$ ($n\geq 1$). Choose $\theta _{n}$ such that $0\leq \theta _{n}\leq \bar{\theta}_{n}$, where
$$\begin{aligned} \begin{aligned} \bar{\theta}_{n}= \textstyle\begin{cases} \min \{\beta ,\frac {\beta _{n}}{ \Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \Vert } \}, & \text{if } x_{n}\neq x_{n-1},\\ \beta , &\text{otherwise.} \end{cases}\displaystyle \end{aligned} \end{aligned}$$
(3.22)
Step 2. Compute
$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n-1})-J_{p}^{E}(x_{n}))),\\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}),\\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}(R_{\tau _{n}}y_{n})), \end{cases} $$
(3.23)
where $\tau _{n}$ is defined the same as in (3.3).

Remark 3.7

It should be noted that Algorithm 2 is slightly different from Algorithm 1 but $\bar{\theta}_{n}$ of this algorithm is simpler to compute than $\bar{\theta}_{n}$ of Algorithm 1, that is, it is chosen without any prior knowledge of the Bregman distance $D_{f_{p}}$ at two points $x_{n}$ and $x_{n-1}$, which is flexible and easy to implement in solving the problem. This is why we call the technique proposed in this case a “relaxed inertial algorithm”.

Remark 3.8

From (3.22), we observe that $0\leq \theta _{n}\leq \beta <1$ for all $n\geq 1$. Also, we obtain $\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|\leq \beta _{n}$ for all $n\geq 1$. Since $\sum_{n=1}^{\infty}\beta _{n}<\infty $, we have $\sum_{n=1}^{\infty}\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \|<\infty $ and so

$$\begin{aligned} \lim_{n\rightarrow \infty}\theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert =0. \end{aligned}$$

Theorem 3.9

Let $\{x_{n}\}$ be a sequence generated by Algorithm 2. Suppose, in addition, that $J_{p}^{E}$ is weakly sequentially continuous on E. Then, $\{x_{n}\}$ converges weakly to a point in Ω.

Proof

Let $v\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)$. Similarly, by using the same argument as in the proof of Theorem 3.6, we have

$$\begin{aligned} D_{f_{p}}(v,y_{n})\leq D_{f_{p}}(v,u_{n})-\tau _{n} \biggl(1- \frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}. \end{aligned}$$

(3.24)

By the definition of $u_{n}$ in (3.23), we see that

$$\begin{aligned} D_{f_{p}}(v,u_{n}) =&D_{f_{p}} \bigl(v,J_{q}^{E^{*}}\bigl((1-\theta _{n})J_{p}^{E}(x_{n})+ \theta _{n} J_{p}^{E}(x_{n-1}) \bigr)\bigr) \\ \leq &(1-\theta _{n})D_{f_{p}}(v,x_{n})+ \theta _{n}D_{f_{p}}(v,x_{n-1}). \end{aligned}$$

(3.25)

Thus, we have

$$\begin{aligned} D_{f_{p}}(v,x_{n+1}) \leq &(1-\alpha _{n})D_{f_{p}}(v,u_{n})+\alpha _{n} D_{f_{p}}(v,y_{n})-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ \leq &(1-\alpha _{n})D_{f_{p}}(v,u_{n})+ \alpha _{n} \biggl[D_{f_{p}}(v,u_{n})- \tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p} \biggr] \\ &{}- \alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ =&D_{f_{p}}(v,u_{n})-\alpha _{n}\tau _{n} \biggl(1- \frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ \leq &(1-\theta _{n})D_{f_{p}}(v,x_{n})+ \theta _{n}D_{f_{p}}(v,x_{n-1})- \alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p} \\ &-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}). \end{aligned}$$

(3.26)

From Theorem 3.6, we know that

$$\begin{aligned} \delta _{n}(p,q):=\alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl(\frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}+ \alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n})>0, \quad \forall n\geq n_{0}. \end{aligned}$$

Thus, we can deduce that

$$\begin{aligned} D_{f_{p}}(v,x_{n+1}) \leq &(1-\theta _{n})D_{f_{p}}(v,x_{n})+ \theta _{n}D_{f_{p}}(v,x_{n-1}) \\ \leq &\max \bigl\{ D_{f_{p}}(v,x_{n}),D_{f_{p}}(v,x_{n-1}) \bigr\} \\ \leq &\cdots \max \bigl\{ D_{f_{p}}(v,x_{n_{0}}),D_{f_{p}}(v,x_{n_{0}-1}) \bigr\} . \end{aligned}$$

Hence, $\{D_{f_{p}}(v,x_{n})\}$ is bounded. From (2.4), we also obtain $\{x_{n}\}$ is bounded. From (3.26), we have

$$\begin{aligned} D_{f_{p}}(v,x_{n+1}) \leq &D_{f_{p}}(v,x_{n})+\theta _{n} \bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+} \\ &{}- \alpha _{n}\tau _{n} \biggl(1- \frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p} \\ &{}-\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}), \end{aligned}$$

(3.27)

which implies that

$$\begin{aligned} D_{f_{p}}(v,x_{n+1})\leq D_{f_{p}}(v,x_{n})+\theta _{n} \bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+} \end{aligned}$$

(3.28)

for all $n\geq n_{0}$. Using (2.2), we see that

$$\begin{aligned} \theta _{n}\bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+} =&-\theta _{n}D_{f_{p}}(x_{n-1},x_{n})+ \theta _{n}\bigl\langle v-x_{n-1},J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle \\ \leq &\theta _{n}\bigl\langle v-x_{n-1},J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\rangle \\ \leq &\theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert M, \end{aligned}$$

where $M:=\sup_{n\geq n_{0}}\{\|x_{n-1}-v\|\}$. From Remark 3.8, we can deduce that

$$\begin{aligned} \sum_{n=n_{0}}^{\infty}\theta _{n}\bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+} \leq \sum_{n=n_{0}}^{\infty} \theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert M< \infty . \end{aligned}$$

(3.29)

Thus, from (3.28) with Lemma 2.6, we have the limit of $\{D_{f_{p}}(v,x_{n})\}$ exists. Note that (3.29) implies that

$$\begin{aligned} \lim_{n\rightarrow \infty}\theta _{n} \bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+}=0. \end{aligned}$$

(3.30)

From (3.27), we have

$$\begin{aligned} &\alpha _{n}\tau _{n} \biggl(1-\frac{\kappa _{q}\mu}{q} \biggl( \frac{\tau _{n}}{\tau _{n+1}} \biggr)^{q-1} \biggr) \Vert w_{n} \Vert ^{p}+\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n}) \\ &\quad \leq D_{f_{p}}(v,x_{n})-D_{f_{p}}(v,x_{n+1})+ \theta _{n}\bigl[D_{f_{p}}(v,x_{n-1})-D_{f_{p}}(v,x_{n}) \bigr]_{+}. \end{aligned}$$

This implies by (3.30) that

$$\begin{aligned} \lim_{n\rightarrow \infty} \Vert Tu_{n}-Q_{\mu _{n}}Tu_{n} \Vert =0\quad \text{and}\quad \lim_{n\rightarrow \infty}D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n})=0. \end{aligned}$$

Thus,

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{F}(Tu_{n}-Q_{\mu _{n}}Tu_{n}) \bigr\Vert =0\quad \text{and}\quad \lim_{n\rightarrow \infty} \Vert y_{n}-R_{\tau _{n}}y_{n} \Vert =0. \end{aligned}$$

By the definition of $y_{n}$, we can show that $\lim_{n\rightarrow \infty}\|J_{p}^{E}(y_{n})-J_{p}^{E}(u_{n})\|=0$. Also, by the definition of $u_{n}$, we have

$$\begin{aligned} \lim_{n\rightarrow \infty} \bigl\Vert J_{p}^{E}(u_{n})-J_{p}^{E}(x_{n}) \bigr\Vert = \lim_{n\rightarrow \infty}\theta _{n} \bigl\Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \bigr\Vert =0. \end{aligned}$$

Consequently, $\lim_{n\rightarrow \infty}\|u_{n}-x_{n}\|=0$. Since the rest of the proof is the same as the proof of Theorem 3.6, we omit the details here. □

Next, we apply our algorithms to solve the split feasibility problem in Banach spaces.

Let C and Q be nonempty, closed, and convex subsets of E and F, respectively. Let T be a bounded linear operator with its adjoint operator $T^{*}$ and $T\neq 0$. We consider the split feasibility problem (SFP):

$$\begin{aligned} \text{find}\quad w\in C\quad \text{such that}\quad Tw\in Q. \end{aligned}$$

(3.31)

We denote the set of solutions of SFP by $\Gamma :=C\cap T^{-1}(Q)$. The SFP was first introduced in 1994 by Censor and Elfving [15] for inverse problems of intensity-modulated radiation therapy (IMRT) in the field of medical care (see [12, 14]).

Setting $A:=\partial \delta _{C}$ and $B:=\partial \delta _{Q}$ in Theorems 3.6 and 3.9, we obtain the following results.

Corollary 3.10

Let E, F, C, Q, T, and $T^{*}$ be the same as mentioned above. Let $\tau _{1}$, β, μ, $\{s_{n}\}$, $\{\beta _{n}\}$, $\{\theta _{n}\}$, and $\{\bar{\theta}_{n}\}$ be the same as in Algorithm 1, where $\{\bar{\theta}_{n}\}$ is defined the same as in (3.1). Suppose that $\Gamma \neq \emptyset $. Let $x_{0},x_{1}\in E$ and $\{x_{n}\}$ be a sequence generated by

$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}))), \\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-P_{Q})Tu_{n}), \\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}( \Pi _{C}^{f_{p}}y_{n})), \\ \tau _{n+1}= \textstyle\begin{cases} \min \{ ( \frac {\mu \Vert (I-P_{Q})Tu_{n}) \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-P_{Q})Tu_{n} \Vert ^{q}} )^{\frac{1}{q-1}},\tau _{n}+s_{n} \} ,& \textit{if } T^{*}J_{p}^{F}(I-P_{Q})Tu_{n}\neq 0, \\ \tau _{n}+s_{n},& \textit{otherwise}. \end{cases}\displaystyle \end{cases} $$

Suppose, in addition, that $J_{p}^{E}$ is weakly sequentially continuous on E. Then, the sequence $\{x_{n}\}$ converges weakly to a point in Γ.

Corollary 3.11

Let E, F, C, Q, T, and $T^{*}$ be the same as mentioned above. Let $\tau _{1}$, β, μ, $\{s_{n}\}$, $\{\beta _{n}\}$, $\{\theta _{n}\}$, and $\{\bar{\theta}_{n}\}$ be the same as in Algorithm 2, where $\{\bar{\theta}_{n}\}$ is defined the same as in (3.22). Suppose that $\Gamma \neq \emptyset $. Let $x_{0},x_{1}\in E$ and $\{x_{n}\}$ be a sequence generated by

$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n-1})-J_{p}^{E}(x_{n}))), \\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-P_{Q})Tu_{n}), \\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}( \Pi _{C}^{f_{p}}y_{n})), \\ \tau _{n+1}= \textstyle\begin{cases} \min \{ ( \frac {\mu \Vert (I-P_{Q})Tu_{n}) \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-P_{Q})Tu_{n} \Vert ^{q}} )^{\frac{1}{q-1}},\tau _{n}+s_{n} \} ,& \textit{if } T^{*}J_{p}^{F}(I-P_{Q})Tu_{n}\neq 0, \\ \tau _{n}+s_{n} ,& \textit{otherwise}. \end{cases}\displaystyle \end{cases} $$

Suppose, in addition, that $J_{p}^{E}$ is weakly sequentially continuous on E. Then, the sequence $\{x_{n}\}$ converges weakly to a point in Γ.

4 Numerical and experiments results

In this section, we apply our Algorithms 1 and 2 to numerically solve some problems in science and engineering and we also compare the numerical performances with the iterative scheme (1.8) proposed by Tang [43] (namely, Tang Algorithm) and the iterative scheme (1.5) proposed by Suantai et al. [39] (namely, Suantai et al. Algorithm).

Problem 4.1

Split feasibility problem in infinite-dimensional Banach spaces

Let $E=F=\ell _{p}^{0}(\mathbb{R})$ ($1< p<\infty ,p\neq 2$), where $\ell _{p}^{0}(\mathbb{R})$ is the subspace of $\ell _{p}(\mathbb{R})$, that is,

$$\begin{aligned} \ell _{p}^{0}(\mathbb{R})= \Biggl\{ x=(x_{1},x_{2}, \dots ,x_{i}, 0, 0, 0, \dots ), x_{i}\in \mathbb{R} \text{ and } \sum_{i=1}^{\infty} \vert x_{i} \vert ^{p}< \infty \Biggr\} \end{aligned}$$

with norm $\|x\|_{\ell _{p}}= (\sum_{i=1}^{\infty}|x_{i}|^{p} )^{1/p}$ and duality pairing $\langle x,y\rangle =\sum_{i=1}^{\infty}x_{i}y_{i}$ for all $x=(x_{1},x_{2},\dots , x_{i},\dots )\in l_{p}(\mathbb{R})$ and $y=(y_{1},y_{2},\dots ,y_{i},\dots )\in l_{q}(\mathbb{R})$, where $\frac{1}{p}+\frac{1}{q}=1$. The generalized duality mapping $J_{p}^{\ell _{p}(\mathbb{R})}$ is computed by the following explicit formula (see [3]):

$$\begin{aligned} J_{p}^{\ell _{p}(\mathbb{R})}(x)= \bigl( \vert x_{1} \vert ^{p-2}x_{1}, \vert x_{2} \vert ^{p-2}x_{2}, \ldots , \vert x_{i} \vert ^{p-2}x_{i},\ldots \bigr),\quad \forall x\in \ell _{p}( \mathbb{R}). \end{aligned}$$

In this example, let $p=3$, we have $q=\frac{3}{2}$. Then, the smoothness constant $\kappa _{q}\approx 1.3065$. Let $C=\{x\in \ell _{3}^{0}(\mathbb{R}):\|x\|_{\ell _{3}}\leq 1\}$ and $Q=\{x\in \ell _{3}^{0}(\mathbb{R}):\langle x,a\rangle \leq 1\}$, where $a:=(1,1,\dots ,1, 0, 0, 0,\dots )\in \ell _{3/2}^{0}(\mathbb{R})$. Define an operator $Tx=\frac{x}{2}$ with its adjoint $T^{*}=T$ and $\|T\|=\frac{1}{2}$. In this experiment, we only perform the numerical tests of our Algorithms 1, 2, and Suantai et al. Algorithm [39] since Tang Algorithm [43] cannot be implemented in $\ell _{3}(\mathbb{R})$. For Algorithms 1 and 2, we set $\tau _{1}=1.99$, $\beta =0.75$, $\mu =10^{-5}$, $s_{n}=\frac{1}{(n+1)^{4}}$, $\alpha _{n}=0.1$, and $\beta _{n}=\frac{1}{(n+10)^{5}}$. For Suantai et al. Algorithm [39], we set $\lambda _{n}=10^{-5} $, $\alpha _{n}=\frac{1}{n+1}$, $\beta _{n}=\frac{n}{n+1}$, and $u_{n}= (\frac{1}{n^{2}}, \frac{1}{n^{2}}, \frac{1}{n^{2}}, 0,0,0, \dots )^{\mathsf{T}}$. The initial points $x_{0}$ and $x_{1}$ are generated randomly in $\ell _{p}^{0}(\mathbb{R})$. We use $E_{n}=\|x_{n+1}-x_{n}\|_{l_{3}}< 10^{-6}$ to terminate iterations for all algorithms. To test the robustness of each algorithm, we run the experiment several times and choose the best four tests of sequences generated by each algorithm. The numerical results are presented in Figs. 1 and 2.

Problem 4.2

[39] Split minimization problem in finite-dimensional spaces

Let $E=F=\mathbb{R}^{3}$. For each $x\in \mathbb{R}^{3}$, let $f,g: \mathbb{R}^{3} \rightarrow (-\infty ,+\infty ]$ be defined by

$$\begin{aligned} f(x)=20 \Vert x \Vert ^{2}+(33,14,-95)x+40 \end{aligned}$$

and

$$\begin{aligned} g(x)=\frac{1}{2} \Vert Lx-y \Vert , \end{aligned}$$

where $L = (\begin{array}{c} 1 & 0 & 2 \\ - 1 & 3 & 0 \\ 2 & - 2 & 4 \end{array})$ and $y = (\begin{array}{c} 3 \\ 2 \\ 0 \end{array})$ . Let $T = (\begin{array}{c} 1 & - 2 & 3 \\ 0 & 5 & 2 \\ 2 & 1 & 0 \end{array})$ . In this case, the split common null-point problem becomes the split minimization problem, that is, find $w\in (\partial f)^{-1}0\cap T^{-1}(\partial g)^{-1}0$. Note that $T^{*} = (\begin{array}{c} 1 & 0 & 2 \\ - 2 & 5 & 1 \\ 3 & 2 & 0 \end{array})$ and $\|T\|^{2}$ is the largest eigenvalue of $T^{*}T$. In this experiment, we compare the numerical performances of our Algorithms 1 and 2 with Tang Algorithm [43] and Suantai et al. Algorithm [39]. For Algorithms 1 and 2, we set $\tau _{1}=1.99$, $\beta =0.75$, $\mu =0.1$, $\alpha _{n}=0.1$, $s_{n}= \frac{1}{(n+1)^{4}}$, $\beta _{n}=\frac{1}{(n+1)^{1.1}}$, and $\tau _{n}=\mu _{n}=0.01$. For Tang Algorithm [43], we set $\alpha =0.75$, $\rho _{n}=3-\frac{1}{n+1}$, $\epsilon _{n}=\frac{1}{(n+1)^{1.1}}$, and $r=\mu =10^{-5}$. For Suantai et al. Algorithm [39], we set $\alpha _{n}=\frac{1}{n+1}$, $\beta _{n}=0.5$, $r_{n}=1$, $\lambda _{n}=0.01$, and $u_{n}=(\frac{1}{n^{2}}, \frac{1}{n^{2}}, \frac{1}{n^{2}})^{ \mathsf{T}}$. The initial points $x_{0}$ and $x_{1}$ are generated randomly in $\mathbb{R}^{3}$. We use $E_{n}=\|x_{n+1}-x_{n}\|< 10^{-5}$ to terminate iterations for all algorithms. To test the robustness of each algorithm, we run the experiment several times and choose the best four testes of sequences generated by each algorithm. The numerical results are presented in Figs. 3 and 4.

Problem 4.3

Signal-recovery problem

In signal processing, compressed sensing involves the recovery of a “sparse signal” from measured data, aiming to reconstruct the original signal using fewer measurements (see, e.g [27, 47]). In this context, we can model the compressed sensing as the following uncertain linear system:

$$\begin{aligned} y=Tx+b, \end{aligned}$$

(4.1)

where $x\in \mathbb{R}^{N}$ is a K-sparse signal K ($K\ll N$), to be recovered, $y\in \mathbb{R}^{M}$ is the observed or measured data with noisy b and $T : \mathbb{R}^{N}\rightarrow \mathbb{R}^{M}$ is a bounded linear operator. It is known that the above problem can be seen as the following LASSO problem [46]:

$$\begin{aligned} \min_{x\in \mathbb{R}^{N}}\frac{1}{2} \Vert Tx-y \Vert ^{2}\quad \text{subject to}\quad \Vert x \Vert _{1}\leq t, \end{aligned}$$

(4.2)

where $t>0$ is a given constant and $\|\cdot \|_{1}$ is the $\ell _{1}$ norm. If $C=\{x\in \mathbb{R}^{N}:\|x\|_{1}\leq t\}$ and $Q=\{y\}$, then (4.2) is a particular case of the SFP (3.31) in the finite-dimensional spaces.

We generated a sparse signal $x\in \mathbb{R}^{N}$ with K nonzero entries having a of length $N=2048$ and made $M=1024$ observations. The values of the sparse signal are sampled uniformly from the interval $[-1,1]$. The observation y is generated from Gaussian noise of variance 10⁻⁴. The matrix $T\in \mathbb{R}^{M\times N}$ is generated from a normal distribution with mean zero and one variance. Additionally, the initial signals $x_{0}=x_{1}=T^{*}(Tx-y)$. In this experiment, we set $\tau _{1}=1.99$, $\beta =0.75$, $\mu =10^{-5}$, $\alpha _{n}=0.1$, $s_{n}= \frac{1}{(n+1)^{4}}$, $\beta _{n}=\frac{1}{(n+1)^{1.1}}$, and $\tau _{n}=\mu _{n}=0.01$ in Algorithms 1 and 2, we set $\alpha =0.75$, $\rho _{n}=0.5$, $\epsilon _{n}=\frac{1}{(n+1)^{1.1}}$, and $r=\mu =0.001$ in Tang Algorithm [43] and we set $\alpha _{n}=\beta _{n}=\frac{1}{n+1}$, $u_{n}=Tx-y$, and $\lambda _{n}=0.001$ in Suantai et al. Algorithm [39]. We consider five different tests for the spikes $K\in \{10,20,30,40,50\}$. Our stopping criterion is $E_{n}=\|x_{n+1}-x_{n}\|<10^{-7}$. The results of the numerical simulations are presented in Figs. 5–9.

Remark 4.4

We observe from the numerical simulations presented in Figs. 5–9 that our proposed Algorithms 1 and 2 outperform Tang Algorithm [43] and Suantai et al. Algorithm [39] in the sense that they satisfy the stopping criteria in fewer iterations and less computational time in the signal-recovery tests. Furthermore, we observe that while Tang Algorithm [43] requires fewer iterations to satisfy the stopping criteria compared to Suantai et al. Algorithm [39], the reconstructed signal by Tang Algorithm [43] is NOT very close to the original signal compared to that reconstructed via Suantai et al. Algorithm [39].

Problem 4.5

Data classifications

In this example, we apply our algorithms to data-classification problems, which are based on a learning technique called the extreme learning machine (ELM). Let $\mathcal{U}=\{(x_{n}, y_{n}): x_{n}\in \mathbb{R}^{N}, y_{n}\in \mathbb{R}^{M}, n=1,2,3, \ldots, K\}$ be a training set of K distinct samples, $x_{n}$ is an input training data and $y_{n}$ is a training target. For the output of ELM with a single hidden layer at the ith hidden node is $h_{i}(x)=U(\langle a_{i},x_{n}\rangle +b_{i})$, where U is an activation function, $a_{i}$ is the weight at the ith hidden node, and $b_{i}$ is the bias at the ith hidden node. The output function with L hidden nodes is the single hidden-layer feedforward neural networks (SLFNs)

$$\begin{aligned} O_{n}=\sum_{i=1}^{L} \omega _{i}h_{i}(x_{n}), \end{aligned}$$

where $\omega _{i}$ is the optimal output weight at the ith hidden node. The hidden-layer output matrix T is defined by

$$\begin{aligned} T= \begin{bmatrix} U(\langle a_{1},x_{1}\rangle +b_{1}) &\cdots & U(\langle a_{L},x_{1} \rangle +b_{L}) \\ \vdots & \ddots & \vdots \\ U(\langle a_{1},x_{K}\rangle +b_{1}) & \cdots & U(\langle a_{L},x_{K} \rangle +b_{L}) \end{bmatrix}. \end{aligned}$$

The main aim of ELM is to calculate an optimal weight $\omega =(\omega _{1},\omega _{2},\ldots,\omega _{L})^{\mathsf{T}}$ such that $T\omega =b$, where $b=(t_{1},t_{2},\ldots,t_{K})^{\mathsf{T}}$ is the training target data. A successful model used to find the solution ω can be translated into the following convex constraint minimization problem:

$$\begin{aligned} \min_{\omega \in \mathbb{R}^{L}}\frac{1}{2} \Vert T \omega -b \Vert ^{2}\quad \text{subject to}\quad \Vert \omega \Vert _{1}\leq \xi , \end{aligned}$$

(4.3)

where $\xi >0$ is a given constant. If $C=\{\omega \in \mathbb{R}^{L}: \|\omega \|_{1}\leq \xi \}$ and $Q=\{b\}$, then (4.3) is a particular case of the SFP (3.31) in the finite-dimensional spaces.

The binary crossentropy loss function along with sigmoid activation function for binary classification calculates the loss of an example by computing the following average:

$$\begin{aligned} \mathrm{Loss}=-\frac{1}{J}\sum _{j=1}^{J} \bigl( y_{j}\log \hat{y}_{j}+(1-y_{j}) \log (1-\hat{y}_{j}) \bigr), \end{aligned}$$

(4.4)

where $\hat{y}_{j}$ is the jth scalar value in the model output, $y_{j}$ is the corresponding target value, and J is the number of scalar values in the model output.

The performance evaluation in classification can be justified by precision and recall. The Recall/True Positive Rate can be defined as the level of accuracy of predictions in positive classes and the percentage of the number of predictions that are correct on the positive observations. Then, calculate the accuracy, prediction, and F1-score using the following standard criteria [22]:

(1)
Precision = $\frac{{\mathrm{TP}}}{{\mathrm{TP}+\mathrm{FP}}}\times 100\%$;
(2)
Recall = $\frac{{\mathrm{TP}}}{{\mathrm{TP}+\mathrm{FN}}}\times 100\%$;
(3)
Accuracy = $\frac{{\mathrm{TP}+\mathrm{TN}}}{{\mathrm{TP}+\mathrm{FP}+\mathrm{TN}+\mathrm{FN}}}\times 100\%$;
(4)
F1-score = $\frac{2\times \text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}$,

where a confusion matrix for original and predicted classes is shown in terms of TP := True Positive, TN := True Negative, FP := False Positive, and FN := False Negative.

Next, we consider the following two datasets:

Dataset 1

UCI Machine Learning Heart Disease dataset [20]. This dataset contains 14 attributes and 303 records. This dataset contains the attributes: Age, Gender, CP, Trestbps, Chol, Fbs, Restecg, Thalach, Exang, Oldpeak, Slope, Ca, Thal, and Num (the predicted attribute). The dataset consists of 138 normal instances versus 165 abnormal instances.

Dataset 2

PIMA Indians diabetes dataset [1]. The dataset contains 768 pregnant female patients of which 500 were nondiabetics and 268 that were diabetics. This dataset contains 9 attributes: Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, Age, and Outcome (the predicted attribute).

In particular, we apply our algorithms to the optimized weight parameter in training data for machine learning by using 5-fold crossvalidation [25] in the extreme learning machine (ELM).

For Dataset 1, we start computation by setting the activation function as a sigmoid, hidden nodes $L = 80$, regularization parameter $\xi =10$, $x_{0}={\mathbf{{1}}}:=(\underbrace{1,1,\ldots ,1)}_{N}\in \mathbb{R}^{N}$, $x_{1}={\mathbf{{0}}}:=(\underbrace{0,0,\ldots ,0)}_{N}\in \mathbb{R}^{N}$, $\tau _{1}=1$, $s_{n}=0$, $\beta _{n}=\frac{1}{(n+1)^{10}}$, and $\alpha _{n}=\frac{n+2}{2n+1}$. The stopping criteria is the number of iterations 300. We compare the performance of Algorithms 1 and 2 with different parameters β for Dataset 1, as seen in Tables 1 and 2.

Table 1 Numerical results of β for Algorithm 1 when $\mu =0.9$ for Dataset 1

Full size table

Table 2 Numerical results of β for Algorithm 2 when $\mu =0.9$ for Dataset 1

Full size table

From Table 1, we see that β increases from 0.1 to 0.9. The training loss and test loss decrease, it appears that $\beta =0.9$ performs better for Algorithm 1.

From Table 2, we see that the training loss and test loss increase when β increases, it appears that $\beta =0.1$ performs better for Algorithm 2. Also, we compare the performance of Algorithms 1 and 2 with different parameters μ for Dataset 1, as seen in Tables 3 and 4.

Table 3 Numerical results of μ for Algorithm 1 when $\beta =0.9$ for Dataset 1

Full size table

Table 4 Numerical results of μ for Algorithm 2 when $\beta =0.1$ for Dataset 1

Full size table

From Tables 3 and 4, we see that the training loss and test loss decrease when μ increases, it show that $\mu =0.9$ gives the highly improved the performance of Algorithm 1 and Algorithm 2 for Dataset 1.

Next, we compare the performance of our algorithms with Tang Algorithm [43] and Suantai et al. Algorithm [39] for Dataset 1. For Algorithms 1 and 2, we set $\tau _{1}=1$, $\mu =0.9$, $s_{n}=0$, and $\beta _{n}=\frac{1}{(n+1)^{10}}$. Moreover, we set $\beta =0.9$ and $\beta =0.1$ for Algorithms 1 and 2, respectively. For Tang Algorithm [43], we set $\alpha =0.6$, $\rho _{n}=3.5$, and $\epsilon _{n}=\frac{1}{n+1}$. For Suantai et al. Algorithm [39], we set $\alpha _{n}=\frac{1}{n+1}$, $\beta _{n}=\frac{n-1}{2n}$, $u_{n}={\mathbf{{1}}}\in \mathbb{R}^{N}$, and $\lambda _{n}=\frac{1}{\|T\|^{2}}$.

The comparison of all algorithms is presented in Table 5.

Table 5 The performance of each algorithm for Dataset 1

Full size table

From Table 5, we observe that our Algorithms 1 and 2 have fewer iterations than Tang Algorithm [43] and Suantai et al. Algorithm [39] with the same precision, recall, F1-score, and accuracy. This shows that our algorithms have the highest probability of correctly classifying heart disease compared to other algorithms.

Next, we present graphs of the accuracy and loss of training data and testing data for overfitting of Algorithms 1 and 2 to show that our algorithms have no overfitting in the training Dataset 1.

From Figs. 10 and 11, we see that our Algorithms 1 and 2 have suitably learned the training dataset for Dataset 1.

For Dataset 2, we present the comparison of our algorithms with Tang Algorithm [43] and Suantai et al. Algorithm [39]. We start computation by setting the activation function as sigmoid, hidden nodes $L = 160$. For Algorithms 1 and 2, we set $\tau _{1}=1$, $\mu =0.9$, $s_{n}=0$, and $\beta _{n}=\frac{1}{(n+1)^{10}}$. Moreover, we set $\beta =0.9$ and $\beta =0.1$ for Algorithms 1 and 2, respectively. For Tang Algorithm [43], we set $\alpha =0.6$, $\rho _{n}=3.5$, and $\epsilon _{n}=\frac{1}{(n+1)^{10}}$. For Suantai et al. Algorithm [39], we set $\alpha _{n}=\frac{1}{n+1}$, $\beta _{n}=\frac{n-1}{2n}$, $u_{n}={\mathbf{{1}}}\in \mathbb{R}^{N}$, and $\lambda _{n}=\frac{1}{\|T\|^{2}}$.

The comparison of all algorithms is presented in Table 6.

Table 6 The performance of each algorithm for Dataset 2

Full size table

From Table 6, we see that Algorithms 1 and 2 have the most efficiency in precision, recall, F1-score, and accuracy for Dataset 2. This mean that our algorithms have the highest probability of correctly classifying the PIMA Indians diabetes dataset (Dataset 2) compared to Tang Algorithm [43] and Suantai et al. Algorithm [39].

Next, we present graphs of the accuracy and loss of training data and testing data for overfitting of Algorithms 1 and 2 to show that our algorithms have no overfitting in the training Dataset 2.

From Figs. 12 and 13, we see that Algorithms 1 and 2 have suitably learned the training dataset for Dataset 2.

5 Conclusions

In this paper, we have proposed two inertial self-adaptive algorithms to solve the split common null-point problem for two set-valued mappings in Banach spaces. The step sizes used in our proposed algorithms are adaptively updated without the prior knowledge of the operator norm of the bounded linear operator. We have proved the weak-convergence theorems of the proposed algorithms under suitable conditions in p-uniformly convex, real Banach spaces that are also uniformly smooth. Finally, we have performed experiments to numerically solve some problems in science and engineering, such as, the split feasibility problem, the split minimization problem, signal recovery, and data classifications, and also have compared them with some existing methods to demonstrate the implementability and efficiency of our methods.

Availability of data and materials

Not applicable.

References

https://www.kaggle.com/datasets/whenamancodes/predict-diabities
Agarwal, R.P., O’Regan, D., Sahu, D.R.: Fixed Point Theory for Lipschitzian-Type Mappings with Applications. Springer, Berlin (2009)
MATH Google Scholar
Alber, Y., Ryazantseva, I.: Nonlinear Ill-Posed Problems of Monotone Type. Springer, Dordrecht (2006)
MATH Google Scholar
Alber, Y.I.: Metric and generalized projection operators in Banach spaces: properties and applications. In: Kartsatos, A.G. (ed.) Theory and Applications of Nonlinear Operator of Accretive and Monotone Type, pp. 15–50. Dekker, New York (1996)
Google Scholar
Alofi, A.S., Alsulami, S.M., Takahashi, W.: Strongly convergent iterative method for the split common null point problem in Banach spaces. J. Nonlinear Convex Anal. 17, 311–324 (2016)
MathSciNet MATH Google Scholar
Alvarez, F.: Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert spaces. SIAM J. Optim. 14, 773–782 (2004)
MathSciNet MATH Google Scholar
Alvarez, F., Attouch, H.: An inertial proximal method for monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Var. Anal. 9, 3–11 (2001)
MathSciNet MATH Google Scholar
Anh, P.K., Thong, D.V., Dung, V.T.: A strongly convergent Mann-type inertial algorithm for solving split variational inclusion problems. Optim. Eng. 22, 159–185 (2021)
MathSciNet MATH Google Scholar
Aoyama, K., Kohsaka, F., Takahashi, W.: Three generalizations of firmly nonexpansive mappings: their relations and continuity properties. J. Nonlinear Convex Anal. 10, 131–147 (2009)
MathSciNet MATH Google Scholar
Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)
MathSciNet MATH Google Scholar
Butnariu, D., Resmerita, E.: Bregman distances, totally convex functions and a method for solving operator equations in Banach spaces. Abstr. Appl. Anal. 2006, Article ID 084919 (2006)
MathSciNet MATH Google Scholar
Byrne, C.: Iterative oblique projection onto convex sets and the split feasibility problem. Inverse Probl. 18, 441–453 (2002)
MathSciNet MATH Google Scholar
Byrne, C., Censor, Y., Gibali, A., Reich, S.: The split common null point problem. J. Nonlinear Convex Anal. 13, 759–775 (2012)
MathSciNet MATH Google Scholar
Censor, Y., Bortfeld, T., Martin, B., Trofimov, A.: A unified approach for inversion problems in intensity modulated radiation therapy. Phys. Med. Biol. 51, 2353–2365 (2006)
Google Scholar
Censor, Y., Elfving, T.: A multiprojection algorithm using Bregman projections in product space. Numer. Algorithms 8, 221–239 (1994)
MathSciNet MATH Google Scholar
Chen, H.Y.: Weak and strong convergence of inertial algorithms for solving split common fixed point problems. J. Inequal. Appl. 2021, 26 (2021)
MathSciNet MATH Google Scholar
Chuang, C.S.: Hybrid inertial proximal algorithm for the split variational inclusion problem in Hilbert spaces with applications. Optimization 66, 777–792 (2017)
MathSciNet MATH Google Scholar
Cioranescu, I.: Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems. Kluwer Academic, Dordrecht (1990)
MATH Google Scholar
Combette, P.L.: The convex feasibility problem in image recovery. Adv. Imaging Electron Phys. 95, 155–270 (1996)
Google Scholar
Dua, D., Graff, C.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine (2019). http://archive.ics.uci.edu/ml
Google Scholar
Duan, P., Zhang, Y., Bu, Q.: New inertial proximal gradient methods for unconstrained convex optimization problems. J. Inequal. Appl. 2020, 255 (2020)
MathSciNet MATH Google Scholar
Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, vol. 10, 978-1. Kaufman, Waltham (2012)
MATH Google Scholar
Hanner, O.: On the uniform convexity of $L_{p}$ and $l_{p}$. Ark. Mat. 3, 239–244 (1956)
MathSciNet MATH Google Scholar
Kesornprom, S., Cholamjiak, K.: Proximal type algorithms involving linesearch and inertial technique for split variational inclusion problem in Hilbert spaces with applications. Optimization 68, 2369–2395 (2019)
MathSciNet MATH Google Scholar
Kumari, V.A., Chitra, R.: Classification of diabetes disease using support vector machine. Int. J. Eng. Res. Appl. 3, 1797–1801 (2013)
Google Scholar
Kuo, L.W., Sahu, D.R.: Bregman distance and strong convergence of proximal-type algorithms. Abstr. Appl. Anal. 2013, Article ID 590519 (2006)
MathSciNet MATH Google Scholar
Kutyniok, G.: Theory and applications of compressed sensing. GAMM-Mitt. 36, 79–101 (2013)
MathSciNet MATH Google Scholar
López, G., Martin-Marquez, V., Wang, F., Xu, H.K.: Solving the split feasibility problem without prior knowledge of matrix norms. Inverse Probl. 28, 085004 (2012)
MathSciNet MATH Google Scholar
Moudafi, M.: Split monotone variational inclusions. J. Optim. Theory Appl. 150, 275–283 (2011)
MathSciNet MATH Google Scholar
Ogbuisi, F.U., Shehu, Y., Yao, J.C.: Convergence analysis of new inertial method for the split common null point problem. Optimization 71, 3767–3795 (2022)
MathSciNet MATH Google Scholar
Reich, S.: A weak convergence theorem for the alternating method with Bregman distances. In: Theory and Applications of Nonlinear Operators of Accretive and Monotone Type, pp. 313–318. Dekker, New York (1996)
Google Scholar
Reich, S., Tuyen, T.M., Sunthrayuth, P., Cholamjiak, P.: Two new inertial algorithms for solving variational inequalities in reflexive Banach spaces. Numer. Funct. Anal. Optim. 42, 1954–1984 (2021)
MathSciNet MATH Google Scholar
Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pac. J. Math. 33, 209–216 (1970)
MathSciNet MATH Google Scholar
Schöpfer, F.: Iterative regularization method for the solution of the split feasibility problem in Banach spaces. Ph.D. thesis, Saabrücken (2007)
Schöpfer, F., Louis, A.K., Schuster, T.: Nonlinear iterative methods for linear ill-posed problems in Banach spaces. Inverse Probl. 22, 311–329 (2006)
MathSciNet MATH Google Scholar
Schöpfer, F., Schuster, T., Louis, A.K.: An iterative regularization method for the solution of the split feasibility problem in Banach spaces. Inverse Probl. 24, 055008 (2008)
MathSciNet MATH Google Scholar
Shehu, Y., Iyiola, O.S., Ogbuisi, F.U.: Iterative method with inertial terms for nonexpansive mappings: applications to compressed sensing. Numer. Algorithms 83, 1321–1347 (2020)
MathSciNet MATH Google Scholar
Suantai, S., Pholasa, N., Cholamjiak, P.: Relaxed CQ algorithms involving the inertial technique for multiple-sets split feasibility problems. Rev. R. Acad. Cienc. Exactas Fís. Nat., Ser. A Mat. 113, 1081–1099 (2019)
MathSciNet MATH Google Scholar
Suantai, S., Shehu, Y., Cholamjiak, P.: Nonlinear iterative methods for solving the split common null point problem in Banach spaces. Optim. Methods Softw. 34, 853–874 (2019)
MathSciNet MATH Google Scholar
Takahashi, E.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)
MATH Google Scholar
Tan, B., Sunthrayuth, P., Cholamjiak, P., Cho, Y.J.: Modified inertial extragradient methods for finding minimum-norm solution of the variational inequality problem with applications to optimal control problem. Int. J. Comput. Math. 100, 525–545 (2023)
MathSciNet MATH Google Scholar
Tan, K.K., Xu, H.K.: Approximating fixed points of nonexpensive mappings by the Ishikawa iteration process. J. Math. Anal. Appl. 178, 301–308 (1993)
MathSciNet MATH Google Scholar
Tang, Y.: New inertial algorithm for solving split common null point problem in Banach spaces. J. Inequal. Appl. 2019, 17 (2019)
MathSciNet MATH Google Scholar
Tang, Y., Sunthrayuth, P.: An iterative algorithm with inertial technique for solving the split common null point problem in Banach spaces. Asian-Eur. J. Math. 15, 2250120 (2022)
MathSciNet MATH Google Scholar
Tian, M., Jiang, B.N.: Inertial hybrid algorithm for variational inequality problems in Hilbert spaces. J. Inequal. Appl. 2020, 12 (2020)
MathSciNet MATH Google Scholar
Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
MathSciNet MATH Google Scholar
Tropp, J.A.: A mathematical introduction to compressive sensing. Bull. Am. Math. Soc. 54, 151–165 (2017)
MATH Google Scholar
Tuyen, T.M., Cholamjiak, P., Sunthrayuth, P.: A new self-adaptive method for the multiple-sets split common null point problem in Banach spaces. Vietnam J. Math. (2022). https://doi.org/10.1007/s10013-022-00574-3
Article Google Scholar
Xu, H.K.: Inequalities in Banach spaces with applications. Nonlinear Anal., Theory Methods Appl. 16, 1127–1138 (1991)
MathSciNet MATH Google Scholar
Xu, Z.B., Roach, G.F.: Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces. J. Math. Anal. Appl. 157, 189–210 (1991)
MathSciNet MATH Google Scholar
Zhou, Z., Tan, B., Li, S.: Inertial algorithms with adaptive stepsizes for split variational inclusion problems and their applications to signal recovery problem. Math. Methods Appl. Sci. (2023). https://doi.org/10.1002/mma.9436
Article Google Scholar

Download references

Acknowledgements

The authors wish to thank the anonymous referees for their valuable comments and valuable suggestions that have led to considerable improvement of this paper.

Funding

This research was supported by The Science, Research and Innovation Promotion Funding (TSRI) (Grant no. FRB660012/0168). This research block grants was managed under Rajamangala University of Technology Thanyaburi (FRB66E0616I.5).

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Faculty of Science and Technology, Rajamangala University of Technology Thanyaburi (RMUTT), Thanyaburi, Pathumthani, 12110, Thailand
Ratthaprom Promkam, Pongsakorn Sunthrayuth & Ekapak Tanprayoon
Office of Research Administration, Chiang Mai University, Chiang Mai, 50200, Thailand
Suparat Kesornprom

Authors

Ratthaprom Promkam
View author publications
You can also search for this author in PubMed Google Scholar
Pongsakorn Sunthrayuth
View author publications
You can also search for this author in PubMed Google Scholar
Suparat Kesornprom
View author publications
You can also search for this author in PubMed Google Scholar
Ekapak Tanprayoon
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Conceptualization: P.S.; Writing-original draft: R.P., P.S.; Formal analysis: E.K., S.K.; Investigation: R.P., P.S.; Software: S.K., R.P.; Review and editing: P.S., R.P.; Project administration: P.S.; Funding: R.P., E.T. All authors have read and approved final version of the manuscript.

Corresponding author

Correspondence to Pongsakorn Sunthrayuth.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Promkam, R., Sunthrayuth, P., Kesornprom, S. et al. New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications. J Inequal Appl 2023, 136 (2023). https://doi.org/10.1186/s13660-023-03049-2

Download citation

Received: 07 July 2023
Accepted: 11 October 2023
Published: 25 October 2023
DOI: https://doi.org/10.1186/s13660-023-03049-2

New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications

Abstract

1 Introduction

2 Preliminaries

Definition 2.1

Remark 2.2

Lemma 2.3

Remark 2.4

Lemma 2.5

Lemma 2.6

3 Main results

Algorithm 1

Remark 3.1

Remark 3.2

Lemma 3.3

Proof

Remark 3.4

Lemma 3.5

Proof

Theorem 3.6

Proof

Algorithm 2

Remark 3.7

Remark 3.8

Theorem 3.9

Proof

Corollary 3.10

Corollary 3.11

4 Numerical and experiments results

Problem 4.1

Problem 4.2

Problem 4.3

Remark 4.4

Problem 4.5

Dataset 1

Dataset 2

5 Conclusions

Availability of data and materials

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Mathematics Subject Classification

Keywords