New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications
Journal of Inequalities and Applications volume 2023, Article number: 136 (2023)
Abstract
In this paper, we propose two inertial algorithms with a new self-adaptive step size for approximating a solution of the split common null-point problem in the framework of Banach spaces. The step sizes are adaptively updated over each iteration by a simple process without the prior knowledge of the operator norm of the bounded linear operator. Under suitable conditions, we prove weak-convergence results for the proposed algorithms in p-uniformly convex and uniformly smooth Banach spaces. Finally, we give several numerical results in both finite- and infinite-dimensional spaces to illustrate the efficiency and advantage of the proposed methods over some existing methods. Also, data classifications of heart diseases and diabetes mellitus are presented as applications of our methods.
1 Introduction
In this paper, we consider the following split common null-point problem [13] (see also [29]): find \(z\in H_{1}\) such that
where \(A:H_{1}\rightarrow 2^{H_{1}}\) and \(B:H_{2}\rightarrow 2^{H_{2}}\) are set-valued maximal monotone operators, \(T:H_{1}\rightarrow H_{2}\) is a bounded linear operator, and \(H_{1}\) and \(H_{2}\) are real Hilbert spaces. We denote the solution set of the split common null-point problem (1.1) by Ω. The split common null-point problem can be applied to solving many real-life problems, for instance, as a model in intensity-modulated radiation-therapy treatment planning [14, 15] and in sensor networks in computerized tomography and data compression [19]. In addition, the split common null-point problem also generalizes several split-type problems that are at the core of the modeling of many inverse problems, such as the split feasibility problem, the split equilibrium problem, and the split minimization problem, as special cases.
Byrne et al. [13] introduced the following iterative scheme for solving the split common null-point problem: for given \(x_{0}\in H_{1}\), the sequence \(\{x_{n}\}\) is generated iteratively by
where \(R_{\tau}=(I+\tau A)^{-1}\) and \(Q_{\mu}=(I+\mu B)^{-1}\) are the resolvent operators of A for \(\tau >0\) and of B for \(\mu >0\), respectively. They proved a weak-convergence theorem for solving the split common null-point problem provided the step size \(\lambda \in (0,\frac{2}{\|T\|^{2}} )\).
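To make a scheme of this type concrete, one can take \(H_{1}=H_{2}=\mathbb{R}^{2}\) with \(A=\partial \delta _{C}\) and \(B=\partial \delta _{Q}\) for closed convex sets C and Q, so that both resolvents reduce to metric projections and the iteration takes the form \(x_{n+1}=P_{C}(x_{n}-\lambda T^{*}(Tx_{n}-P_{Q}(Tx_{n})))\). The following sketch is purely illustrative; the boxes C and Q, the matrix T, and the iteration count are hypothetical choices, not taken from [13]:

```python
import numpy as np

# Illustrative Hilbert-space sketch: A = subdifferential of the indicator of C,
# B = subdifferential of the indicator of Q, so both resolvents are projections.
# The sets C, Q, the matrix T, and the tolerances are hypothetical choices.

T = np.array([[2.0, 1.0], [1.0, 3.0]])           # bounded linear operator
proj_C = lambda x: np.clip(x, -1.0, 1.0)         # resolvent of A (box C = [-1,1]^2)
proj_Q = lambda y: np.clip(y, 0.0, 2.0)          # metric resolvent of B (box Q = [0,2]^2)

lam = 1.0 / np.linalg.norm(T, 2) ** 2            # step size in (0, 2/||T||^2)
x = np.array([5.0, -4.0])
for _ in range(500):
    x = proj_C(x - lam * T.T @ (T @ x - proj_Q(T @ x)))

# x now (approximately) satisfies x in C and T x in Q
assert np.allclose(x, proj_C(x))
assert np.allclose(T @ x, proj_Q(T @ x), atol=1e-6)
```

Note that the step-size requirement \(\lambda \in (0,\frac{2}{\|T\|^{2}})\) forces the explicit computation of \(\|T\|\) here, which is exactly the drawback the self-adaptive rules discussed below remove.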
Alofi et al. [5] introduced the following iterative scheme, based on a modified Halpern iteration, for solving the split common null-point problem in the case that \(H_{1}\) is a Hilbert space and F is a uniformly convex and smooth Banach space: for given \(x_{1}\in H_{1}\), the sequence \(\{x_{n}\}\) is generated iteratively by
where \(R_{\tau}\) is the resolvent of A for \(\tau >0\) and \(Q_{\mu}\) is the metric resolvent of B for \(\mu >0\), \(\{\tau _{n}\}, \{\mu _{n}\}\subset (0,\infty )\), and \(\{\alpha _{n}\},\{\beta _{n}\}\subset (0,1)\) satisfy some appropriate assumptions, J is the duality mapping on F, T is a bounded linear operator from \(H_{1}\) to F, and \(\{u_{n}\}\) is a sequence in \(H_{1}\) such that \(u_{n}\rightarrow u\). They proved that the sequence \(\{x_{n}\}\) generated by (1.3) converges strongly to a point of Ω provided \(\tau _{n}\) satisfies the following inequality:
for some \(a,b>0\).
Later, Suantai et al. [39] generalized the result of Alofi et al. [5] to the case that E is a p-uniformly convex and uniformly smooth Banach space and F is a uniformly convex and smooth Banach space. To be more precise, they introduced the following scheme: for given \(x_{1}\in E\), the sequence \(\{x_{n}\}\) is generated iteratively by
where \(J_{p}^{E}\) and \(J_{q}^{E^{*}}\) are the generalized duality mapping of E into \(E^{*}\) and the duality mapping of \(E^{*}\) into E, respectively, where \(1< q\leq 2\leq p<\infty \) with \(\frac{1}{p}+\frac{1}{q}=1\), and T is a bounded linear operator from E to F. They also proved the strong convergence of the sequence \(\{x_{n}\}\) generated by (1.5) to a point of Ω provided \(\tau _{n}\) satisfies the following inequality:
for some \(a,b>0\).
However, several iterative methods involve a step size that requires computing the norm \(\|T\|\) of the bounded linear operator prior to choosing \(\tau _{n}\). In general, it may not be easy to compute \(\|T\|\). In particular, this makes the algorithms not easily implemented when the computation of \(\|T\|\) is complicated. To overcome this drawback, a new step-size strategy without prior knowledge of the operator norm of the bounded linear operator was proposed by López et al. [28]. This method is known as a self-adaptive method and was first used to solve the split feasibility problem with a step-size criterion independent of the operator norm of the bounded linear operator.
In optimization theory, the inertial technique has been widely used to accelerate the rate of convergence of algorithms. This technique was motivated by the implicit time discretization of second-order dynamical systems (or a heavy ball with friction). Based on the inertial technique, Alvarez and Attouch [7] proposed the following so-called inertial proximal point algorithm for finding a zero point of a set-valued maximal monotone operator A: for given \(x_{0},x_{1}\in H\), the sequence \(\{x_{n}\}\) is generated iteratively by
where \(R_{\tau _{n}}\) is the resolvent of A and \(x_{n}+\theta _{n}(x_{n}-x_{n-1})\) is called the inertial term. They also proved that the sequence \(\{x_{n}\}\) generated by (1.7) converges weakly to a zero point of A provided \(\{\tau _{n}\}\) is increasing and \(\theta _{n}\in [0,1)\) is chosen so that \(\sum_{n=1}^{\infty}\theta _{n}\|x_{n}-x_{n-1}\|^{2}<\infty \). In recent years, the inertial method has been studied intensively and has also been used to solve other optimization problems (see, for example, [16, 21, 32, 37, 38, 41, 45]).
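The inertial proximal point iteration (1.7) can be sketched for the toy operator \(A=\partial \|\cdot \|_{1}\) on \(\mathbb{R}^{2}\), whose resolvent is the componentwise soft-threshold; the starting points, the weight rule \(\theta _{n}=\min \{0.3,1/n^{2}\}\) (which keeps \(\sum \theta _{n}\|x_{n}-x_{n-1}\|^{2}\) finite), and the iteration count are arbitrary choices for illustration:

```python
import numpy as np

# Minimal sketch of the inertial proximal point iteration (1.7) for
# A = subdifferential of the l1 norm, whose resolvent is the soft-threshold.
# Operator, inertial weights, and iteration count are hypothetical choices.

def soft_threshold(x, tau):
    # resolvent (I + tau * d||.||_1)^{-1}
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x_prev = np.array([3.0, -2.0])
x = np.array([2.5, -1.5])
tau = 0.5
for n in range(1, 50):
    theta = min(0.3, 1.0 / n ** 2)      # summable inertial weights
    w = x + theta * (x - x_prev)        # inertial term x_n + theta_n (x_n - x_{n-1})
    x_prev, x = x, soft_threshold(w, tau)

# the unique zero of the subdifferential of ||.||_1 is the origin
assert np.allclose(x, 0.0)
```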
In 2019, Tang [43] proposed the following inertial algorithm for solving the split common null-point problem in the case that \(H_{1}\) is a Hilbert space and F is a 2-uniformly convex and smooth Banach space: for given \(x_{1}\in H_{1}\) and \(\alpha \in [0,1)\), choose \(\theta _{n}\) such that \(0<\theta _{n}<\bar{\theta}_{n}\), where
where \(\{\epsilon _{n}\}\subset (0,\infty )\) such that \(\sum_{n=1}^{\infty}\epsilon _{n}<\infty \). Compute the sequence \(\{x_{n}\}\) generated iteratively by
with the step size
where \(\{\rho _{n}\}\subset (0,4)\), \(f(u_{n})=\frac{1}{2}\|J(I-Q_{\mu})Tu_{n}\|^{2}\), \(F(u_{n})=T^{*}J(I-Q_{\mu})Tu_{n}\), and \(H(u_{n})=(I-R_{r})u_{n}\). The weak convergence of the sequence \(\{x_{n}\}\) is established without the prior knowledge of the operator norm of the bounded linear operator.
Recently, several inertial algorithms for solving the split common null-point problem in Hilbert spaces have been studied by many authors (see, for example, [8, 17, 24, 30, 51]). However, such methods have been studied in Banach spaces by only a few authors (see, for example, [43, 44]).
Inspired and motivated by the works mentioned above, in this paper we introduce two new inertial self-adaptive algorithms, based on the classical inertial method and the relaxed inertial method, for finding a solution of the split common null-point problem in Banach spaces. The weak-convergence theorems are proved without the prior knowledge of the operator norm of the bounded linear operator. We provide numerical implementations to show that our algorithms are efficient and competitive with some related algorithms. Our results are new and complement some previous results in the literature.
The contributions of this paper can be summarized as follows:

(1)
The weak-convergence result for iterative scheme (1.9) of Tang [43] is proved in the setting of a Hilbert space and a 2-uniformly convex and smooth Banach space, so that result can only be implemented in \(\ell _{p}\) for \(p\in (1,2]\), excluding the case \(p>2\). This limits the practical applications of such a method. In this paper, our results generalize the weak-convergence result of Tang [43] from those two settings to p-uniformly convex and uniformly smooth Banach spaces; as a result, our results can be implemented in \(\ell _{p}\) for \(p>1\).

(2)
Even though the step size of the iterative scheme (1.9) of Tang [43] is computed without the prior knowledge of the operator norm, it requires the calculation of \(\|T^{*}J(I-Q_{\mu})Tu_{n}\|^{2}\) and \(\|(I-R_{\tau})u_{n}\|^{2}\) in order to choose the step size \(\tau _{n}\). This could be computationally expensive during implementation, especially when the resolvent of A and the metric resolvent of B are difficult to compute. In this paper, our step size \(\tau _{n}\), defined by (3.3), is adaptively updated by a cheap computation without the prior knowledge of the operator norms and only requires us to compute one metric resolvent of B.

(3)
For the iterative scheme (1.5) of Suantai et al. [39], the choice of the step-size sequence depends on the norm \(\|T\|\) of the bounded linear operator, which is a difficult task during the implementation of the algorithm. In this paper, the choice of our step size \(\tau _{n}\), defined by (3.3), is independent of the operator norm of the bounded linear operator. As a result, we do not need to calculate the norm \(\|T\|\) in order to choose the step size \(\tau _{n}\), which makes our method easier to implement.

(4)
We use the inertial and relaxed inertial techniques to improve the rate of convergence of our algorithms, which makes them converge faster and more computationally efficient for solving the split common null-point problem in Banach spaces. Note that in this paper these inertial techniques are studied outside Hilbert spaces for such a problem.

(5)
We present numerical results of our algorithms in Banach spaces to illustrate their efficiency and advantage over the iterative scheme (1.5) of Suantai et al. [39], which gives strong convergence, and we also present several numerical results of our algorithms in finite-dimensional spaces. Moreover, we apply our results to data classification on two datasets, of heart diseases and diabetes mellitus.
Our paper is organized as follows. In Sect. 2, we give some of the basic facts and notation that will be used in the paper. In Sect. 3, we propose two new inertial self-adaptive algorithms and prove our convergence results. Finally, in Sect. 4, we present several numerical results to verify the advantages and efficiency of the proposed algorithms.
2 Preliminaries
In this section, we give some definitions and preliminary results that will be used in proving the main results. Throughout this paper, we denote the set of real numbers and the set of positive integers by \(\mathbb{R}\) and \(\mathbb{N}\), respectively. Let E be a real Banach space with norm \(\|\cdot \|\) and dual space \(E^{*}\). We denote by \(\langle u,j\rangle \) the value of a functional j in \(E^{*}\) at \(u\in E\), that is, \(\langle u,j\rangle =j(u)\) for all \(u\in E\). We write \(u_{n}\rightarrow u\) to indicate that a sequence \(\{u_{n}\}\) converges strongly to u. Similarly, \(u_{n}\rightharpoonup u\) and \(u_{n}\rightharpoonup ^{*} u\) will symbolize weak and weak^{∗} convergence, respectively. Let \(S_{E}=\{u\in E:\|u\|=1\}\) and \(B_{E}=\{u\in E:\|u\|\leq 1\}\) be the unit sphere and the closed unit ball of E, respectively.
Let \(1< q\leq 2\leq p<\infty \) with \(\frac{1}{p}+\frac{1}{q}=1\). The modulus of convexity of E is the function \(\delta _{E}:[0,2]\rightarrow [0,1]\) defined by
The modulus of smoothness of E is the function \(\rho _{E}:[0,\infty )\rightarrow [0,\infty )\) defined by
Definition 2.1
A Banach space E is said to be:

(1)
strictly convex if \(\frac{\|u+v\|}{2}<1\) for all \(u,v\in S_{E}\) with \(u\neq v\);

(2)
smooth if \(\lim_{t\rightarrow 0}\frac{\|u+tv\|-\|u\|}{t}\) exists for each \(u,v\in S_{E}\);

(3)
uniformly convex if \(\delta _{E}(\epsilon )>0\) for all \(\epsilon \in (0,2]\);

(4)
p-uniformly convex if there is a \(\kappa _{p} > 0\) such that \(\delta _{E}(\epsilon )\geq \kappa _{p}\epsilon ^{p}\) for all \(\epsilon \in (0,2]\);

(5)
uniformly smooth if \(\lim_{t\rightarrow 0}\frac{\rho _{E}(t)}{t}=0\);

(6)
q-uniformly smooth if there exists a \(\kappa _{q} > 0\) such that \(\rho _{E}(t)\leq \kappa _{q} t^{q}\) for all \(t>0\).
Remark 2.2
It is known that if E is uniformly convex, then E is reflexive and strictly convex; if E is uniformly smooth, then E is reflexive and smooth (see [2]). From Definition 2.1, one can see that every p-uniformly convex (q-uniformly smooth) space is a uniformly convex (uniformly smooth) space. Moreover, it is also known that E is p-uniformly convex (q-uniformly smooth) if and only if \(E^{*}\) is q-uniformly smooth (p-uniformly convex) (see [2, 49]).
For the Lebesgue spaces \(L_{p}\), sequence spaces \(l_{p}\), and Sobolev spaces \(W_{p}^{m}\), it is also known that [23, 50]
For \(p>1\), the mapping \(J_{p}:E\rightarrow 2^{E^{*}}\) defined by
is called the generalized duality mapping of E. In particular, \(J_{2}=J\) is called the normalized duality mapping, and if E is a Hilbert space, then \(J=I\), where I is the identity mapping. The duality mapping \(J_{p}\) of a smooth Banach space E is said to be weakly sequentially continuous if for any sequence \(\{u_{n}\}\subset E\), \(u_{n}\rightharpoonup u\) implies \(J_{p}(u_{n})\rightharpoonup ^{*} J_{p}(u)\). For the generalized duality mapping, the following facts are known [2, 18, 34]:

(i)
\(J_{p}\) is homogeneous of degree \(p-1\), that is, \(J_{p}(\alpha u)=|\alpha |^{p-1}\operatorname{sign}(\alpha )J_{p}(u)\) for all \(u\in E\), \(\alpha \in \mathbb{R}\). In particular, \(J_{p}(-u)=-J_{p}(u)\) for all \(u\in E\).

(ii)
If E is smooth, then \(J_{p}\) is monotone, that is, \(\langle uv,J_{p}(u)J_{p}(v)\rangle \geq 0\) for all \(u,v\in E\). Moreover, if E is strictly convex, then \(J_{p}\) is strictly monotone.

(iii)
If E is uniformly smooth, then \(J_{p}\) is single valued from E into \(E^{*}\) and it is uniformly continuous on bounded subsets of E.

(iv)
If E is reflexive, smooth, and strictly convex, then the inverse \(J_{p}^{-1}=J_{q}^{*}\) is single valued, one-to-one, and surjective, where \(J_{q}^{*}\) is the duality mapping from \(E^{*}\) into E.
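In \(E=\ell _{p}\) (here truncated to \(\mathbb{R}^{3}\)), the generalized duality mapping has the explicit coordinatewise form \(J_{p}(u)_{i}=|u_{i}|^{p-1}\operatorname{sign}(u_{i})\), and the defining identities together with property (iv) can be checked numerically; the test vector and the exponent below are arbitrary choices:

```python
import numpy as np

# Numerical check of the generalized duality mapping in R^3 with the l_p norm,
# where J_p(u)_i = |u_i|^{p-1} sign(u_i) and its inverse is J_q on the dual l_q.
# Purely illustrative; u and p are arbitrary.

p = 3.0
q = p / (p - 1.0)                                   # conjugate exponent, 1/p + 1/q = 1

def J(u, r):
    return np.abs(u) ** (r - 1.0) * np.sign(u)

u = np.array([1.0, -2.0, 0.5])
ustar = J(u, p)

norm_p = np.sum(np.abs(u) ** p) ** (1.0 / p)
norm_q = np.sum(np.abs(ustar) ** q) ** (1.0 / q)

assert np.isclose(np.dot(u, ustar), norm_p ** p)    # <u, J_p u> = ||u||^p
assert np.isclose(norm_q, norm_p ** (p - 1.0))      # ||J_p u||_q = ||u||^{p-1}
assert np.allclose(J(ustar, q), u)                  # J_q(J_p u) = u, property (iv)
```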
Lemma 2.3
([49]) If E is a quniformly smooth Banach space, then there is a constant \(\kappa _{q}>0\) such that
where \(\kappa _{q}\) is called the quniform smoothness coefficient of E.
Remark 2.4
The exact values of the constant \(\kappa _{q}\) can be found in [35, 50].
We next recall the definition of Bregman distance. Let E be a real smooth Banach space and f be a convex and Gâteaux differentiable function on E. The bifunction \(D_{f}:E\times E\rightarrow [0,\infty )\) defined by
is called the Bregman distance with respect to f. Note that the Bregman distance is not a metric due to its lack of symmetry and failure to satisfy the triangle inequality. If \(f_{p}(x)=\frac{1}{p}\|x\|^{p}\) for \(p>1\), then \(\nabla f_{p}=J_{p}\). Hence, we have the Bregman distance with respect to \(f=f_{p}\) given by
Moreover, if \(p=2\), then \(2D_{f_{2}}(u,v)=\|u\|^{2}+\|v\|^{2}-2\langle u,J(v)\rangle =\phi (u,v)\), where ϕ is the Lyapunov function studied in [4, 31]. Also, if E is a Hilbert space, then \(\phi (u,v)=\|u-v\|^{2}\). The following properties of the Bregman distance are well known: for each \(u,v,w\in E\),
and
For a p-uniformly convex space, it holds that [36]
where \(\tau >0\) is some fixed number.
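The Bregman distance \(D_{f_{p}}(u,v)=\frac{1}{p}\|u\|^{p}-\frac{1}{p}\|v\|^{p}-\langle u-v,J_{p}(v)\rangle \) and its \(p=2\) relation to the Lyapunov function ϕ can be checked numerically in \(\mathbb{R}^{2}\); a small sketch with arbitrary test vectors:

```python
import numpy as np

# Sketch of the Bregman distance with respect to f_p(x) = ||x||^p / p in R^2
# with the l_p norm; u and v are arbitrary test vectors.

def Jp(v, p):
    return np.abs(v) ** (p - 1.0) * np.sign(v)      # duality mapping of l_p

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def bregman(u, v, p):
    return (lp_norm(u, p) ** p / p - lp_norm(v, p) ** p / p
            - np.dot(u - v, Jp(v, p)))

u = np.array([2.0, -1.0])
v = np.array([0.5, 1.0])

# p = 2 in a Hilbert space: 2 * D_{f_2}(u, v) = phi(u, v) = ||u - v||^2
assert np.isclose(2.0 * bregman(u, v, 2.0), np.linalg.norm(u - v) ** 2)

# nonnegativity and lack of symmetry for p = 3
assert bregman(u, v, 3.0) >= 0.0 and bregman(v, u, 3.0) >= 0.0
assert not np.isclose(bregman(u, v, 3.0), bregman(v, u, 3.0))
```

The failed symmetry check in the last line illustrates why \(D_{f_{p}}\) is not a metric.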
Also, we define a function \(V_{f_{p}}: E\times E^{*}\rightarrow [0,\infty )\) by
for all \(u\in E\) and \(u^{*}\in E^{*}\). Note that \(V_{f_{p}}\) is nonnegative, convex in the second variable and \(V_{f_{p}}(u,u^{*})=D_{f_{p}}(u,J_{q}(u^{*}))\) for all \(u\in E\) and \(u^{*}\in E^{*}\). Moreover, the following property is known:
for all \(u\in E\) and \(u^{*},v^{*}\in E^{*}\).
Let C be a nonempty, closed, and convex subset of a smooth, strictly convex, and reflexive Banach space. Then, for any \(u\in E\), there exists a unique element \(w\in C\) such that
The mapping \(P_{C}\) defined by \(w=P_{C}(u)\) is called the metric projection of E onto C. We know the following property [36]:
Recall that the Bregman projection with respect to \(f_{p}\) is defined by
If \(p=2\), then \(\Pi _{C}^{f_{p}}\) becomes the generalized projection, denoted by \(\Pi _{C}\). Also, in this case, if E is a Hilbert space, then \(\Pi _{C}\) becomes the metric projection, denoted by \(P_{C}\). We also know the following property [11]:
Let C be a nonempty subset of E and \(T:C \rightarrow C\) be a mapping. We denote the fixed-point set of T by \(F(T)=\{u\in C: u=Tu\}\). Let \(A: E\rightarrow 2^{E^{*}}\) be a set-valued mapping. The domain of A is denoted by \(\mathcal{D}(A)=\{u\in E : Au\neq \emptyset \}\) and the range of A by \(\mathcal{R}(A)=\bigcup \{Au:u\in \mathcal{D}(A)\}\). The set of zeros of A is defined by \(A^{-1}0=\{u\in \mathcal{D}(A):0\in Au\}\). It is known that \(A^{-1}0\) is closed and convex (see [40]). A set-valued mapping A is said to be monotone if
A monotone operator A on E is said to be maximal if its graph is not properly contained in the graph of any other monotone operator on E.
Let E be a puniformly convex and uniformly smooth Banach space and \(A: E\rightarrow 2^{E^{*}}\) be a maximal monotone operator. Following [10], for each \(u\in E\) and \(\tau >0\), we define the resolvent of A by
One can see that \(A^{-1}0=F(R_{\tau})\) for \(\tau >0\). We also know the following property [26]:
for all \(u\in E\) and \(v\in A^{-1}0\).
For each \(u\in E\) and \(\mu >0\), we define the metric resolvent of A for \(\mu >0\) by
It is clear that in a Hilbert space, the metric resolvent operator coincides with the resolvent operator. From (2.8), one can see that \(0\in J_{p}(Q_{\mu}(u)-u)+\mu AQ_{\mu}(u)\) and \(A^{-1}0=F(Q_{\mu})\) for \(\mu >0\). The monotonicity of A implies that
for all \(u, v \in E\). If \(A^{-1}0\neq \emptyset \), then
for all \(u\in E\) and \(v\in A^{-1}0\) (see [9]). For any sequence \(\{x_{n}\}\) in E, we see that
This implies that \(\|x_{n}-Q_{\mu}(x_{n})\|\leq \|x_{n}-v\|\). If \(\{x_{n}\}\) is bounded, then \(\{x_{n}-Q_{\mu}(x_{n})\}\) is also bounded.
Let E be a p-uniformly convex and uniformly smooth Banach space and \(f:E\rightarrow (-\infty ,+\infty ]\) be a proper, convex, and lower semicontinuous function. The subdifferential of f at x is defined by
Let C be a closed and convex subset of E. The indicator function \(\delta _{C}\) of C at x is defined by
The subdifferential \(\partial \delta _{C}\) is a maximal monotone operator since \(\delta _{C}\) is a proper, convex, and lower semicontinuous function (see [33]). Moreover, we also know that
where \(N_{C}\) is the normal cone of C. In particular, if we define the resolvent of \(\partial \delta _{C}\) for \(\tau >0\) by \(R_{\tau}(u)=(J_{p}+\tau \partial \delta _{C})^{-1}J_{p}(u)\) for all \(u\in E\), then \(R_{\tau}=\Pi _{C}^{f_{p}}\), where \(\Pi _{C}^{f_{p}}\) is the Bregman projection with respect to \(f_{p}\) (see [48]). Moreover, we also have \((\partial \delta _{C})^{-1}0=C\). Also, if we define the metric resolvent of \(\partial \delta _{C}\) for \(\mu >0\) by \(Q_{\mu}(u)=(I+\mu J_{p}^{-1}\partial \delta _{C})^{-1}(u)\) for all \(u\in E\), then
where \(P_{C}\) is the metric projection of E onto C.
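In the Hilbert-space special case, the metric resolvent of \(\partial \delta _{C}\) is exactly the metric projection \(P_{C}\), independently of μ, and it satisfies the projection inequality \(\|x-Q_{\mu}(x)\|\leq \|x-v\|\) for every \(v\in C\). A minimal sketch with a hypothetical box C:

```python
import numpy as np

# Hilbert-space sketch: the metric resolvent of the indicator's subdifferential
# is the metric projection P_C, for every mu > 0. C = [-1,1]^2 is hypothetical.

def P_C(x):
    return np.clip(x, -1.0, 1.0)            # metric projection onto the box C

x = np.array([3.0, -0.4])
v = np.array([0.2, 0.7])                    # some point of C = (d delta_C)^{-1} 0

q = P_C(x)                                  # Q_mu(x) = P_C(x) for every mu > 0
assert np.linalg.norm(x - q) <= np.linalg.norm(x - v)   # projection inequality
assert np.allclose(P_C(q), q)               # fixed-point set F(Q_mu) = C
```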
Throughout this paper, we adopt the notation \([a]_{+}:=\max \{a,0\}\), where \(a\in \mathbb{R}\).
Lemma 2.5
([6]) Let \(\{\varphi _{n}\}\), \(\{\alpha _{n}\}\), and \(\{\beta _{n}\}\) be three nonnegative real sequences such that
with \(\sum_{n=1}^{\infty}\beta _{n}<\infty \) and there exists a real number α such that \(0\leq \alpha _{n}\leq \alpha <1\) for all \(n\in \mathbb{N}\). Then, the following results hold:

(i)
\(\sum_{n=1}^{\infty}[\varphi _{n}-\varphi _{n-1}]_{+}<\infty \);

(ii)
There exists \(\varphi ^{*}\in [0,\infty )\) such that \(\lim_{n\rightarrow \infty}\varphi _{n}=\varphi ^{*}\).
Lemma 2.6
([42]) Assume that \(\{s_{n}\}\) and \(\{t_{n}\}\) are two nonnegative real sequences such that \(s_{n+1}\leq s_{n}+t_{n}\) for all \(n\geq 1\). If \(\sum_{n=1}^{\infty}t_{n}<\infty \), then \(\lim_{n\rightarrow \infty}s_{n}\) exists.
3 Main results
In this paper, we propose two weakly convergent inertial self-adaptive algorithms to solve the split common null-point problem in Banach spaces. In what follows, we denote by \(J_{p}^{E}\) and \(J_{q}^{E^{*}}\) the generalized duality mapping of E into \(E^{*}\) and the duality mapping of \(E^{*}\) into E, respectively, where \(1< q\leq 2\leq p<\infty \) with \(\frac{1}{p}+\frac{1}{q}=1\).
In order to prove our results, the following assumptions are needed.

(A1)
Let E be a p-uniformly convex and uniformly smooth Banach space and F be a uniformly convex and smooth Banach space.

(A2)
Let \(A:E\rightarrow 2^{E^{*}}\) and \(B:F\rightarrow 2^{F^{*}}\) be maximal monotone operators.

(A3)
Let \(T : E \rightarrow F\) be a bounded linear operator with \(T\neq 0\) and \(T^{*} : F^{*} \rightarrow E^{*}\) be the adjoint operator of T.

(A4)
Let \(R_{\tau}\) be a resolvent operator associated with A for \(\tau >0\) and \(Q_{\mu}\) be a metric resolvent associated with B for \(\mu >0\).

(A5)
The solution set \(\Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\neq \emptyset \).
The following conditions are also assumed:

(C1)
Let \(\{\alpha _{n}\}\subset (0,1]\) with \(\liminf_{n\rightarrow \infty}\alpha _{n}>0\);

(C2)
Let \(\{\mu _{n}\}\subset (0,\infty )\) with \(\liminf_{n\rightarrow \infty}\mu _{n}>0\).
The first algorithm is stated as follows:
Algorithm 1
(Inertial self-adaptive algorithm for the split common null-point problem)

Step 0. Given \(\tau _{1}>0\), \(\beta \in (0,1)\) and \(\mu \in (0,\frac{q}{\kappa _{q}} )\). Choose \(\{s_{n}\}\subset [0,\infty )\) such that \(\sum_{n=1}^{\infty}s_{n}<\infty \) and \(\{\beta _{n}\}\subset (0,\infty )\) such that \(\sum_{n=1}^{\infty}\beta _{n}<\infty \). Let \(x_{0},x_{1}\in E\) be arbitrary and calculate \(x_{n+1}\) as follows:

Step 1. Given the iterates \(x_{n-1}\) and \(x_{n}\) (\(n\geq 1\)), choose \(\theta _{n}\) such that \(0\leq \theta _{n}\leq \bar{\theta}_{n}\), where
$$\begin{aligned} \begin{aligned} \bar{\theta}_{n}= \textstyle\begin{cases} \min \{\beta ,\frac {\beta _{n}}{ \Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \Vert ^{q}},\frac {\beta _{n}}{D_{f_{p}}(x_{n},x_{n-1})} \}, & \text{if } x_{n}\neq x_{n-1},\\ \beta , &\text{otherwise.} \end{cases}\displaystyle \end{aligned} \end{aligned}$$(3.1)
Step 2. Compute
$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}))),\\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}),\\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}(R_{\tau _{n}}y_{n})), \end{cases} $$(3.2)where
$$ \begin{aligned} \tau _{n+1}= \textstyle\begin{cases} \min \{ (\frac {\mu \Vert (I-Q_{\mu _{n}})Tu_{n} \Vert ^{p}}{ \Vert T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n} \Vert ^{q}} )^{\frac{1}{q-1}},\tau _{n}+s_{n} \} ,& \text{if } T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}\neq 0,\\ \tau _{n}+s_{n}, &\text{otherwise}. \end{cases}\displaystyle \end{aligned} $$(3.3)
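For intuition, in the Hilbert-space case \(p=q=2\) (where \(J_{p}^{F}=I\)), the update (3.3) reads \(\tau _{n+1}=\min \{\mu \|(I-Q_{\mu _{n}})Tu_{n}\|^{2}/\|T^{*}(I-Q_{\mu _{n}})Tu_{n}\|^{2},\tau _{n}+s_{n}\}\) and can be sketched as follows; the operator T, the set defining \(Q_{\mu _{n}}\), μ, and the summable sequence \(s_{n}\) are hypothetical choices, and note that \(\|T\|\) is never computed:

```python
import numpy as np

# Sketch of the step-size update (3.3) for p = q = 2 in a Hilbert space.
# T, Q, mu, and the s_n sequence are hypothetical; no operator norm is needed.

T = np.array([[2.0, 1.0], [1.0, 3.0]])
Q = lambda y: np.clip(y, 0.0, 2.0)          # metric resolvent of the indicator of a box

def next_tau(tau, u, n, mu=0.9):
    s_n = 1.0 / n ** 2                      # summable relaxation, sum s_n < infinity
    r = T @ u - Q(T @ u)                    # (I - Q_{mu_n}) T u_n
    g = T.T @ r                             # T* J_p^F (I - Q_{mu_n}) T u_n, with J = I
    if np.linalg.norm(g) > 0:
        return min(mu * np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2, tau + s_n)
    return tau + s_n

tau = 1.0
u = np.array([5.0, -4.0])
tau = next_tau(tau, u, 1)
# Lemma 3.3 guarantees tau stays above min{mu/||T||^2, tau_1}; here ||T||^2 ~ 13.09
assert tau >= min(0.9 / 13.1, 1.0)
```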
Remark 3.1
If \(x_{n+1}=y_{n}=u_{n}\) for some n, then \(y_{n}\) is a solution in Ω. Indeed, if \(y_{n}=u_{n}\), we see that \(y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(y_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n})\) for \(\tau _{n}>0\). This implies that \((I-Q_{\mu _{n}})Ty_{n}=0\), that is, \(Ty_{n}=Q_{\mu _{n}}Ty_{n}\). In addition, if \(x_{n+1}=y_{n}\), then \(y_{n}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(y_{n})+\alpha _{n}J_{p}^{E}(R_{\tau _{n}}y_{n}))\). This implies that \(y_{n}=R_{\tau _{n}}y_{n}\). Now, since \(y_{n}=R_{\tau _{n}}y_{n}\) and \(Ty_{n}=Q_{\mu _{n}}Ty_{n}\), we have \(y_{n}\in A^{-1}0\) and \(y_{n}\in T^{-1}(B^{-1}0)\). Therefore, \(y_{n}\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\).
Remark 3.2
From (3.1), we observe that \(0\leq \theta _{n}\leq \beta <1\) for all \(n\geq 1\). Also, we obtain \(\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|^{q}\leq \beta _{n}\) and \(\theta _{n}D_{f_{p}}(x_{n},x_{n-1})\leq \beta _{n}\) for all \(n\geq 1\). Since \(\sum_{n=1}^{\infty}\beta _{n}<\infty \), we have
Lemma 3.3
Let \(\{\tau _{n}\}\) be a sequence generated by (3.3). Then, we have \(\lim_{n\rightarrow \infty}\tau _{n}=\tau \), where
Proof
In the case of \(T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}\neq 0\), we see that
By the definition of \(\tau _{n}\) and induction, we have
Thus, \(\tau _{n}\leq \tau _{1}+\sum_{n=1}^{\infty}s_{n}\) for all \(n\geq 1\). From (3.5), we see that
Hence \(\tau _{n}\geq \min \{ (\frac{\mu}{\|T\|^{q}} )^{\frac{1}{q-1}},\tau _{1} \}\) for all \(n\geq 1\). Therefore, \(\min \{ (\frac{\mu}{\|T\|^{q}} )^{\frac{1}{q-1}},\tau _{1} \}\leq \tau _{n}\leq \tau _{1}+\sum_{n=1}^{\infty}s_{n}\) for all \(n\geq 1\). Since \(\tau _{n+1}\leq \tau _{n}+s_{n}\) for all \(n\geq 1\), \(\lim_{n\rightarrow \infty}\tau _{n}\) exists by Lemma 2.6. In this case, we denote \(\tau =\lim_{n\rightarrow \infty}\tau _{n}\). Obviously, \(\tau \in [\min \{ (\frac{\mu}{\|T\|^{q}} )^{\frac{1}{q-1}},\tau _{1} \}, \tau _{1}+s ]\), where \(s=\sum_{n=1}^{\infty}s_{n}\). □
Remark 3.4
The adaptive step size \(\tau _{n}\) generated by (3.3) is different from many adaptive step sizes as studied in [43, 44]. Note that \(\tau _{n}\) is allowed to increase when the iteration increases. Therefore, it reduces the dependence on the initial step size \(\tau _{1}\). Since \(\sum_{n=1}^{\infty}s_{n}<\infty \), we have \(\lim_{n\rightarrow \infty}s_{n}=0\). As a result, \(\tau _{n}\) may not increase when n is large.
Lemma 3.5
Let \(\{x_{n}\}\) be a sequence generated by Algorithm 1. Then, for each \(n\geq 1\), the following inequality holds for all \(v\in \Omega \):
where \(\xi _{n}(p,q):=\theta _{n}D_{f_{p}}(x_{n},x_{n-1})+ \frac{\kappa _{q}\theta _{n}^{q}}{q}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|^{q}\), \(\delta _{n}(p,q):=\alpha _{n}\tau _{n} (1- \frac{\kappa _{q}\mu}{q} (\frac{\tau _{n}}{\tau _{n+1}} )^{q-1} )\|w_{n}\|^{p}+\alpha _{n} D_{f_{p}}(y_{n},R_{\tau _{n}}y_{n})\), and \(w_{n}:=Tu_{n}-Q_{\mu _{n}}Tu_{n}\).
Proof
Let \(v\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\). From Lemma 2.3, we have
Note that \(w_{n}:=Tu_{n}-Q_{\mu _{n}}Tu_{n}\) and \(v\in A^{-1}0\cap T^{-1}(B^{-1}0)\), so \(v\in A^{-1}0\) and \(Tv\in B^{-1}0\). It then follows from (2.10) that
From the definition of \(\tau _{n+1}\), we have
Combining (3.6), (3.7), and (3.8), we obtain
Now, we estimate \(D_{f_{p}}(v,u_{n})\). From Lemma 2.3, we have
We observe that
Combining (3.10) and (3.11), we obtain
Then, from (2.7) and (3.10), we obtain
From the definitions of \(\xi _{n}(p,q)\) and \(\delta _{n}(p,q)\), then (3.13) can be written in a short form as follows:
Thus, this lemma is proved. □
Theorem 3.6
Let \(\{x_{n}\}\) be a sequence generated by Algorithm 1. Suppose, in addition, that \(J_{p}^{E}\) is weakly sequentially continuous on E. Then, \(\{x_{n}\}\) converges weakly to a point in Ω.
Proof
Using the fact that \(\lim_{n\rightarrow \infty}\tau _{n}\) exists and \(\mu \in (0,\frac{q}{\kappa _{q}} )\), we have
Then, there exists \(n_{0}\in \mathbb{N}\) such that
and, in consequence,
Then, from Lemma 3.5, we can deduce that
Since \(\xi _{n}(p,q):=\theta _{n}D_{f_{p}}(x_{n},x_{n-1})+ \frac{\kappa _{q}\theta _{n}^{q}}{q}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|^{q}\), it follows from (3.4) that
From (3.14), we also have \(\lim_{n\rightarrow \infty}\xi _{n}(p,q)=0\). From Lemma 2.5, we can conclude that \(\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})\) exists and
Thus, \(\{D_{f_{p}}(v,x_{n})\}\) is bounded, and so \(\{x_{n}\}\) is also bounded by (2.4). Moreover, we obtain
From Lemma 3.5, we see that
Since \(\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})\) exists and \(\lim_{n\rightarrow \infty}\xi _{n}(p,q)=0\), we have
Consequently,
By the continuity of \(J_{p}^{F}\), we have
Moreover, we have
By the definition of \(y_{n}\), the continuity of \(T^{*}\), and from (3.16), we have
On the other hand, by the definition of \(u_{n}\) and from (3.4), we obtain
Thus,
From (3.19), we also have \(\lim_{n\rightarrow \infty}\|u_{n}-x_{n}\|=0\) by the uniform continuity of \(J_{q}^{E^{*}}\). Since \(\{x_{n}\}\) is bounded, there exists a subsequence \(\{x_{n_{k}}\}\) of \(\{x_{n}\}\) such that \(x_{n_{k}}\rightharpoonup w\in E\) and so \(u_{n_{k}}\rightharpoonup w\). Put \(v_{n}=R_{\tau _{n}}y_{n}\) for all \(n\in \mathbb{N}\). From (3.17) and (3.18), we see that
Consequently, \(\lim_{n\rightarrow \infty}\|u_{n}-v_{n}\|=0\). Thus,
Using the above inequality, we also obtain \(v_{n_{k}}\rightharpoonup w\). Since \(R_{\tau _{n}}\) is the resolvent of A for \(\tau _{n}>0\), we have
Replacing n by \(n_{k}\) and using the fact that A is monotone, we obtain
for all \((s,s^{*})\in A\). Now, \(T^{*}\) is continuous because \(T^{*}\) is a bounded linear operator. Then, from (3.16), (3.20), and \(\lim_{k\rightarrow \infty}\tau _{n_{k}}=\tau >0\), we obtain \(\langle s-w,s^{*}-0\rangle \geq 0\) for all \((s,s^{*})\in A\). Since A is maximal monotone, \(w\in A^{-1}0\). On the other hand, we know that T is also continuous. This fact, together with \(\|u_{n_{k}}-x_{n_{k}}\|\rightarrow 0\) and \(\|Tu_{n_{k}}-Q_{\mu _{n_{k}}}Tu_{n_{k}}\|\rightarrow 0\), implies that \(Tu_{n_{k}}\rightharpoonup Tw\) and \(Q_{\mu _{n_{k}}}Tu_{n_{k}}\rightharpoonup Tw\). Since \(Q_{\mu _{n}}\) is the metric resolvent of B for \(\mu _{n}>0\), we have
for all \(n\in \mathbb{N}\). Replacing n by \(n_{k}\), it then follows from the monotonicity of B that
for all \((u,u^{*})\in B\). Then, from (3.16) and \(\liminf_{k\rightarrow \infty}\mu _{n_{k}}>0\), we obtain \(\langle u-Tw,u^{*}-0\rangle \geq 0\) for all \((u,u^{*})\in B\). Since B is maximal monotone, \(Tw\in B^{-1}0\) and so \(w\in T^{-1}(B^{-1}0)\). We thus obtain \(w\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\). In order to prove the weak convergence of the sequence \(\{x_{n}\}\), it is sufficient to show that \(\{x_{n}\}\) has a unique weak limit point in Ω. To this end, assume that \(\{x_{m_{k}}\}\) is another subsequence of \(\{x_{n}\}\) such that \(x_{m_{k}}\rightharpoonup w'\in \Omega \), and note that \(x_{n_{k}}\rightharpoonup w\in \Omega \). Suppose by contradiction that \(w'\neq w\). Since \(\lim_{n\rightarrow \infty}D_{f_{p}}(v,x_{n})\) exists for any \(v\in \Omega \), it then follows from (2.2) and the weak sequential continuity of \(J_{p}^{E}\) that
In the same way as above, we can show that
which is a contradiction with (3.21). Hence, \(w=w'\) and therefore, the sequence \(\{x_{n}\}\) converges weakly to a point in Ω. This finishes the proof. □
Next, we present a second algorithm that is slightly different from the first proposed algorithm.
Algorithm 2
(Relaxed inertial self-adaptive algorithm for the split common null-point problem)

Step 0. Given \(\tau _{1}>0\), \(\beta \in (0,1)\) and \(\mu \in (0,\frac{q}{\kappa _{q}} )\). Choose \(\{s_{n}\}\subset [0,\infty )\) such that \(\sum_{n=1}^{\infty}s_{n}<\infty \) and \(\{\beta _{n}\}\subset (0,\infty )\) such that \(\sum_{n=1}^{\infty}\beta _{n}<\infty \). Let \(x_{0},x_{1}\in E\) be arbitrary and calculate \(x_{n+1}\) as follows:

Step 1. Given the iterates \(x_{n-1}\) and \(x_{n}\) (\(n\geq 1\)), choose \(\theta _{n}\) such that \(0\leq \theta _{n}\leq \bar{\theta}_{n}\), where
$$\begin{aligned} \begin{aligned} \bar{\theta}_{n}= \textstyle\begin{cases} \min \{\beta ,\frac {\beta _{n}}{ \Vert J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1}) \Vert } \}, & \text{if } x_{n}\neq x_{n-1},\\ \beta , &\text{otherwise.} \end{cases}\displaystyle \end{aligned} \end{aligned}$$(3.22)
Step 2. Compute
$$ \textstyle\begin{cases} u_{n}=J_{q}^{E^{*}}(J_{p}^{E}(x_{n})+\theta _{n}(J_{p}^{E}(x_{n-1})-J_{p}^{E}(x_{n}))),\\ y_{n}=J_{q}^{E^{*}}(J_{p}^{E}(u_{n})-\tau _{n}T^{*}J_{p}^{F}(I-Q_{\mu _{n}})Tu_{n}),\\ x_{n+1}=J_{q}^{E^{*}}((1-\alpha _{n})J_{p}^{E}(u_{n})+\alpha _{n}J_{p}^{E}(R_{\tau _{n}}y_{n})), \end{cases} $$(3.23)where \(\tau _{n}\) is defined the same as in (3.3).
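As an illustration, a minimal Hilbert-space instance of Algorithm 2 (with \(E=F=\mathbb{R}^{2}\), \(p=q=2\), \(A=\partial \delta _{C}\), and \(B=\partial \delta _{Q}\), so that \(R_{\tau}=P_{C}\), \(Q_{\mu}=P_{Q}\), and all duality mappings are the identity) can be sketched as follows; the sets, the operator T, and all parameter sequences below are hypothetical choices:

```python
import numpy as np

# End-to-end Hilbert-space sketch of Algorithm 2 with both resolvents equal to
# projections onto boxes. Sets, operator, and parameters are hypothetical.

T = np.array([[2.0, 1.0], [1.0, 3.0]])
P_C = lambda x: np.clip(x, -1.0, 1.0)       # R_tau = P_C
P_Q = lambda y: np.clip(y, 0.0, 2.0)        # Q_mu = P_Q

mu, beta, alpha = 0.9, 0.5, 0.8
tau = 1.0
x_prev = np.array([5.0, -4.0])
x = np.array([4.0, -3.0])
for n in range(1, 300):
    beta_n = 1.0 / n ** 2                   # summable sequence
    d = np.linalg.norm(x - x_prev)
    theta = beta if d == 0 else min(beta, beta_n / d)   # rule (3.22)
    u = x + theta * (x_prev - x)            # relaxed inertial step of (3.23)
    r = T @ u - P_Q(T @ u)                  # (I - Q_{mu_n}) T u_n
    g = T.T @ r
    y = u - tau * g
    x_prev, x = x, (1 - alpha) * u + alpha * P_C(y)
    g_norm = np.linalg.norm(g)              # step-size update (3.3), p = q = 2
    s_n = 1.0 / n ** 2
    if g_norm > 0:
        tau = min(mu * np.linalg.norm(r) ** 2 / g_norm ** 2, tau + s_n)
    else:
        tau = tau + s_n

# x should now be close to a solution: x in C and T x in Q
assert np.linalg.norm(x - P_C(x)) < 1e-3
assert np.linalg.norm(T @ x - P_Q(T @ x)) < 1e-3
```

With these hypothetical data the split common null-point problem reduces to a split feasibility problem, and the sketch converges without ever evaluating \(\|T\|\).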
Remark 3.7
It should be noted that Algorithm 2 differs only slightly from Algorithm 1, but its \(\bar{\theta}_{n}\) is simpler to compute: it is chosen without evaluating the Bregman distance \(D_{f_{p}}\) at the two points \(x_{n}\) and \(x_{n-1}\), which makes the method flexible and easy to implement. This is why we call the technique proposed in this case a “relaxed inertial algorithm”.
Remark 3.8
From (3.22), we observe that \(0\leq \theta _{n}\leq \beta <1\) for all \(n\geq 1\). Also, we obtain \(\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|\leq \beta _{n}\) for all \(n\geq 1\). Since \(\sum_{n=1}^{\infty}\beta _{n}<\infty \), we have \(\sum_{n=1}^{\infty}\theta _{n}\|J_{p}^{E}(x_{n})-J_{p}^{E}(x_{n-1})\|<\infty \) and so
Theorem 3.9
Let \(\{x_{n}\}\) be a sequence generated by Algorithm 2. Suppose, in addition, that \(J_{p}^{E}\) is weakly sequentially continuous on E. Then, \(\{x_{n}\}\) converges weakly to a point in Ω.
Proof
Let \(v\in \Omega := A^{-1}0\cap T^{-1}(B^{-1}0)\). By using the same argument as in the proof of Theorem 3.6, we have
By the definition of \(u_{n}\) in (3.23), we see that
Thus, we have
From Theorem 3.6, we know that
Thus, we can deduce that
Hence, \(\{D_{f_{p}}(v,x_{n})\}\) is bounded. From (2.4), we also obtain that \(\{x_{n}\}\) is bounded. From (3.26), we have
which implies that
for all \(n\geq n_{0}\). Using (2.2), we see that
where \(M:=\sup_{n\geq n_{0}}\{\|x_{n-1}-v\|\}\). From Remark 3.8, we can deduce that
Thus, from (3.28) and Lemma 2.6, the limit of \(\{D_{f_{p}}(v,x_{n})\}\) exists. Note that (3.29) implies that
From (3.27), we have
This implies by (3.30) that
Thus,
By the definition of \(y_{n}\), we can show that \(\lim_{n\rightarrow \infty}\|J_{p}^{E}(y_{n})-J_{p}^{E}(u_{n})\|=0\). Also, by the definition of \(u_{n}\), we have
Consequently, \(\lim_{n\rightarrow \infty}\|u_{n}-x_{n}\|=0\). Since the rest of the proof is the same as that of Theorem 3.6, we omit the details here. □
Next, we apply our algorithms to solve the split feasibility problem in Banach spaces.
Let C and Q be nonempty, closed, and convex subsets of E and F, respectively. Let \(T:E\rightarrow F\) be a nonzero bounded linear operator with adjoint operator \(T^{*}\). We consider the split feasibility problem (SFP):
We denote the set of solutions of the SFP by \(\Gamma :=C\cap T^{-1}(Q)\). The SFP was first introduced in 1994 by Censor and Elfving [15] for modeling inverse problems; it has since been applied to intensity-modulated radiation therapy (IMRT) treatment planning in the field of medical care (see [12, 14]).
Setting \(A:=\partial \delta _{C}\) and \(B:=\partial \delta _{Q}\) in Theorems 3.6 and 3.9, we obtain the following results.
Corollary 3.10
Let E, F, C, Q, T, and \(T^{*}\) be the same as mentioned above. Let \(\tau _{1}\), β, μ, \(\{s_{n}\}\), \(\{\beta _{n}\}\), \(\{\theta _{n}\}\), and \(\{\bar{\theta}_{n}\}\) be the same as in Algorithm 1, where \(\{\bar{\theta}_{n}\}\) is defined the same as in (3.1). Suppose that \(\Gamma \neq \emptyset \). Let \(x_{0},x_{1}\in E\) and \(\{x_{n}\}\) be a sequence generated by
Suppose, in addition, that \(J_{p}^{E}\) is weakly sequentially continuous on E. Then, the sequence \(\{x_{n}\}\) converges weakly to a point in Γ.
Corollary 3.11
Let E, F, C, Q, T, and \(T^{*}\) be the same as mentioned above. Let \(\tau _{1}\), β, μ, \(\{s_{n}\}\), \(\{\beta _{n}\}\), \(\{\theta _{n}\}\), and \(\{\bar{\theta}_{n}\}\) be the same as in Algorithm 2, where \(\{\bar{\theta}_{n}\}\) is defined the same as in (3.22). Suppose that \(\Gamma \neq \emptyset \). Let \(x_{0},x_{1}\in E\) and \(\{x_{n}\}\) be a sequence generated by
Suppose, in addition, that \(J_{p}^{E}\) is weakly sequentially continuous on E. Then, the sequence \(\{x_{n}\}\) converges weakly to a point in Γ.
4 Numerical experiments and results
In this section, we apply our Algorithms 1 and 2 to numerically solve some problems in science and engineering, and we compare their numerical performance with the iterative scheme (1.8) proposed by Tang [43] (namely, Tang Algorithm) and the iterative scheme (1.5) proposed by Suantai et al. [39] (namely, Suantai et al. Algorithm).
Problem 4.1
Split feasibility problem in infinite-dimensional Banach spaces
Let \(E=F=\ell _{p}^{0}(\mathbb{R})\) (\(1< p<\infty ,p\neq 2\)), where \(\ell _{p}^{0}(\mathbb{R})\) is the subspace of \(\ell _{p}(\mathbb{R})\) consisting of all sequences with only finitely many nonzero entries, that is,
with norm \(\|x\|_{\ell _{p}}= (\sum_{i=1}^{\infty}|x_{i}|^{p} )^{1/p}\) and duality pairing \(\langle x,y\rangle =\sum_{i=1}^{\infty}x_{i}y_{i}\) for all \(x=(x_{1},x_{2},\dots , x_{i},\dots )\in \ell _{p}(\mathbb{R})\) and \(y=(y_{1},y_{2},\dots ,y_{i},\dots )\in \ell _{q}(\mathbb{R})\), where \(\frac{1}{p}+\frac{1}{q}=1\). The generalized duality mapping \(J_{p}^{\ell _{p}(\mathbb{R})}\) is computed by the following explicit formula (see [3]):
In this example, we take \(p=3\), so that \(q=\frac{3}{2}\) and the smoothness constant is \(\kappa _{q}\approx 1.3065\). Let \(C=\{x\in \ell _{3}^{0}(\mathbb{R}):\|x\|_{\ell _{3}}\leq 1\}\) and \(Q=\{x\in \ell _{3}^{0}(\mathbb{R}):\langle x,a\rangle \leq 1\}\), where \(a:=(1,1,\dots ,1, 0, 0, 0,\dots )\in \ell _{3/2}^{0}(\mathbb{R})\). Define an operator \(Tx=\frac{x}{2}\) with adjoint \(T^{*}=T\) and \(\|T\|=\frac{1}{2}\). In this experiment, we only perform the numerical tests of our Algorithms 1 and 2 and Suantai et al. Algorithm [39], since Tang Algorithm [43] cannot be implemented in \(\ell _{3}(\mathbb{R})\). For Algorithms 1 and 2, we set \(\tau _{1}=1.99\), \(\beta =0.75\), \(\mu =10^{-5}\), \(s_{n}=\frac{1}{(n+1)^{4}}\), \(\alpha _{n}=0.1\), and \(\beta _{n}=\frac{1}{(n+10)^{5}}\). For Suantai et al. Algorithm [39], we set \(\lambda _{n}=10^{-5} \), \(\alpha _{n}=\frac{1}{n+1}\), \(\beta _{n}=\frac{n}{n+1}\), and \(u_{n}= (\frac{1}{n^{2}}, \frac{1}{n^{2}}, \frac{1}{n^{2}}, 0,0,0, \dots )^{\mathsf{T}}\). The initial points \(x_{0}\) and \(x_{1}\) are generated randomly in \(\ell _{p}^{0}(\mathbb{R})\). We use \(E_{n}=\|x_{n+1}-x_{n}\|_{\ell _{3}}< 10^{-6}\) to terminate the iterations for all algorithms. To test the robustness of each algorithm, we run the experiment several times and choose the best four tests of sequences generated by each algorithm. The numerical results are presented in Figs. 1 and 2.
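For \(\ell _{p}\), the generalized duality mapping referenced above acts coordinatewise, \((J_{p}^{\ell _{p}}x)_{i}=|x_{i}|^{p-2}x_{i}\), a standard formula (see [3]). The following snippet numerically checks its defining identities on a finitely supported vector; the test vector is an arbitrary illustrative choice.

```python
import numpy as np

def J_p(x, p):
    """Generalized duality mapping on l_p for finitely supported x:
    (J_p x)_i = |x_i|^{p-2} x_i, which lies in l_q with 1/p + 1/q = 1."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.abs(x) ** (p - 1)

p, q = 3.0, 1.5
x = np.array([0.5, -1.0, 2.0])          # finitely supported element of l_3^0
jx = J_p(x, p)

# Defining identities of the duality mapping with gauge t^{p-1}:
norm_p = np.sum(np.abs(x) ** p) ** (1 / p)
assert np.isclose(np.dot(x, jx), norm_p ** p)                      # <x, J_p x> = ||x||^p
assert np.isclose(np.sum(np.abs(jx) ** q) ** (1 / q), norm_p ** (p - 1))  # ||J_p x||_q = ||x||^{p-1}
```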
Problem 4.2
[39] Split minimization problem in finite-dimensional spaces
Let \(E=F=\mathbb{R}^{3}\). For each \(x\in \mathbb{R}^{3}\), let \(f,g: \mathbb{R}^{3} \rightarrow (-\infty ,+\infty ]\) be defined by
and
where \(L=\begin{pmatrix}1& 0& 2\\ 1& 3& 0\\ 2& 2& 4\end{pmatrix}\) and \(y=\begin{pmatrix}3\\ 2\\ 0\end{pmatrix}\). Let \(T=\begin{pmatrix}1& 2& 3\\ 0& 5& 2\\ 2& 1& 0\end{pmatrix}\). In this case, the split common null-point problem becomes the split minimization problem, that is, find \(w\in (\partial f)^{-1}0\cap T^{-1}(\partial g)^{-1}0\). Note that \(T^{*}=\begin{pmatrix}1& 0& 2\\ 2& 5& 1\\ 3& 2& 0\end{pmatrix}\) and \(\|T\|^{2}\) is the largest eigenvalue of \(T^{*}T\). In this experiment, we compare the numerical performances of our Algorithms 1 and 2 with Tang Algorithm [43] and Suantai et al. Algorithm [39]. For Algorithms 1 and 2, we set \(\tau _{1}=1.99\), \(\beta =0.75\), \(\mu =0.1\), \(\alpha _{n}=0.1\), \(s_{n}= \frac{1}{(n+1)^{4}}\), \(\beta _{n}=\frac{1}{(n+1)^{1.1}}\), and \(\tau _{n}=\mu _{n}=0.01\). For Tang Algorithm [43], we set \(\alpha =0.75\), \(\rho _{n}=3-\frac{1}{n+1}\), \(\epsilon _{n}=\frac{1}{(n+1)^{1.1}}\), and \(r=\mu =10^{-5}\). For Suantai et al. Algorithm [39], we set \(\alpha _{n}=\frac{1}{n+1}\), \(\beta _{n}=0.5\), \(r_{n}=1\), \(\lambda _{n}=0.01\), and \(u_{n}=(\frac{1}{n^{2}}, \frac{1}{n^{2}}, \frac{1}{n^{2}})^{ \mathsf{T}}\). The initial points \(x_{0}\) and \(x_{1}\) are generated randomly in \(\mathbb{R}^{3}\). We use \(E_{n}=\|x_{n+1}-x_{n}\|< 10^{-5}\) to terminate the iterations for all algorithms. To test the robustness of each algorithm, we run the experiment several times and choose the best four tests of sequences generated by each algorithm. The numerical results are presented in Figs. 3 and 4.
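The adjoint and the operator norm used in this experiment can be checked numerically; the following sketch verifies that \(T^{*}\) is the transpose given above and that \(\|T\|\) equals the square root of the largest eigenvalue of \(T^{*}T\), i.e., the largest singular value of T.

```python
import numpy as np

# Operator data from Problem 4.2
T = np.array([[1.0, 2.0, 3.0],
              [0.0, 5.0, 2.0],
              [2.0, 1.0, 0.0]])

# ||T||^2 is the largest eigenvalue of T^T T (a symmetric PSD matrix)
lam_max = np.max(np.linalg.eigvalsh(T.T @ T))
op_norm = np.sqrt(lam_max)

# Cross-check: ||T|| is the largest singular value of T
assert np.isclose(op_norm, np.linalg.svd(T, compute_uv=False)[0])
# The adjoint stated in the text is exactly the transpose
assert np.allclose(T.T, np.array([[1, 0, 2], [2, 5, 1], [3, 2, 0]]))
```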
Problem 4.3
Signal-recovery problem
In signal processing, compressed sensing involves the recovery of a “sparse signal” from measured data, aiming to reconstruct the original signal using fewer measurements (see, e.g., [27, 47]). In this context, we can model the compressed sensing as the following uncertain linear system:
where \(x\in \mathbb{R}^{N}\) is a K-sparse signal (\(K\ll N\)) to be recovered, \(y\in \mathbb{R}^{M}\) is the observed (measured) data contaminated by noise b, and \(T : \mathbb{R}^{N}\rightarrow \mathbb{R}^{M}\) is a bounded linear operator. It is known that the above problem can be recast as the following LASSO problem [46]:
where \(t>0\) is a given constant and \(\|\cdot \|_{1}\) is the \(\ell _{1}\) norm. If \(C=\{x\in \mathbb{R}^{N}:\|x\|_{1}\leq t\}\) and \(Q=\{y\}\), then (4.2) is a particular case of the SFP (3.31) in finite-dimensional spaces.
We generated a sparse signal \(x\in \mathbb{R}^{N}\) of length \(N=2048\) with K nonzero entries and made \(M=1024\) observations. The values of the sparse signal are sampled uniformly from the interval \([-1,1]\). The observation y is contaminated by Gaussian noise of variance 10^{−4}. The matrix \(T\in \mathbb{R}^{M\times N}\) is generated from a normal distribution with mean zero and unit variance. Additionally, the initial signals are \(x_{0}=x_{1}=T^{*}(Tx-y)\). In this experiment, we set \(\tau _{1}=1.99\), \(\beta =0.75\), \(\mu =10^{-5}\), \(\alpha _{n}=0.1\), \(s_{n}= \frac{1}{(n+1)^{4}}\), \(\beta _{n}=\frac{1}{(n+1)^{1.1}}\), and \(\tau _{n}=\mu _{n}=0.01\) in Algorithms 1 and 2; we set \(\alpha =0.75\), \(\rho _{n}=0.5\), \(\epsilon _{n}=\frac{1}{(n+1)^{1.1}}\), and \(r=\mu =0.001\) in Tang Algorithm [43]; and we set \(\alpha _{n}=\beta _{n}=\frac{1}{n+1}\), \(u_{n}=Tx-y\), and \(\lambda _{n}=0.001\) in Suantai et al. Algorithm [39]. We consider five different tests for the numbers of spikes \(K\in \{10,20,30,40,50\}\). Our stopping criterion is \(E_{n}=\|x_{n+1}-x_{n}\|<10^{-7}\). The results of the numerical simulations are presented in Figs. 5–9.
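The data-generation step described above can be sketched as follows; the random seed and the choice \(K=30\) are illustrative assumptions, and the recovery itself would then be run with the algorithms under comparison.

```python
import numpy as np

rng = np.random.default_rng(0)          # illustrative seed
N, M, K = 2048, 1024, 30                # K = 30 is one of the tested spike counts

# K-sparse ground truth with nonzero entries sampled uniformly from [-1, 1]
x_true = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x_true[support] = rng.uniform(-1.0, 1.0, size=K)

# Sensing matrix: i.i.d. normal entries with mean zero and unit variance
T = rng.standard_normal((M, N))

# Observation contaminated by Gaussian noise of variance 1e-4
y = T @ x_true + np.sqrt(1e-4) * rng.standard_normal(M)

# Initial points as described in the text: x0 = x1 = T^*(Tx - y)
x0 = x1 = T.T @ (T @ x_true - y)
```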
Remark 4.4
We observe from the numerical simulations presented in Figs. 5–9 that our proposed Algorithms 1 and 2 outperform Tang Algorithm [43] and Suantai et al. Algorithm [39], in the sense that they satisfy the stopping criterion in fewer iterations and less computational time in the signal-recovery tests. Furthermore, while Tang Algorithm [43] requires fewer iterations to satisfy the stopping criterion than Suantai et al. Algorithm [39], the signal reconstructed by Tang Algorithm [43] is not as close to the original signal as that reconstructed via Suantai et al. Algorithm [39].
Problem 4.5
Data classifications
In this example, we apply our algorithms to data-classification problems based on a learning technique called the extreme learning machine (ELM). Let \(\mathcal{U}=\{(x_{n}, y_{n}): x_{n}\in \mathbb{R}^{N}, y_{n}\in \mathbb{R}^{M}, n=1,2,3, \ldots, K\}\) be a training set of K distinct samples, where \(x_{n}\) is an input training datum and \(y_{n}\) is a training target. The output of an ELM with a single hidden layer at the ith hidden node is \(h_{i}(x)=U(\langle a_{i},x\rangle +b_{i})\), where U is an activation function, \(a_{i}\) is the weight at the ith hidden node, and \(b_{i}\) is the bias at the ith hidden node. The output function with L hidden nodes of the single hidden-layer feedforward neural network (SLFN) is
where \(\omega _{i}\) is the optimal output weight at the ith hidden node. The hidden-layer output matrix T is defined by
The main aim of the ELM is to calculate an optimal weight \(\omega =(\omega _{1},\omega _{2},\ldots,\omega _{L})^{\mathsf{T}}\) such that \(T\omega =b\), where \(b=(t_{1},t_{2},\ldots,t_{K})^{\mathsf{T}}\) is the training target data. A successful model for finding the solution ω can be formulated as the following convex constrained minimization problem:
where \(\xi >0\) is a given constant. If \(C=\{\omega \in \mathbb{R}^{L}: \|\omega \|_{1}\leq \xi \}\) and \(Q=\{b\}\), then (4.3) is a particular case of the SFP (3.31) in finite-dimensional spaces.
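Assembling the hidden-layer output matrix (written H below to avoid a clash with the SFP operator T) can be sketched as follows; the dimensions and the random initialization of the hidden weights and biases are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_layer_matrix(X, a, b):
    """ELM hidden-layer output matrix: H[n, i] = U(<a_i, x_n> + b_i),
    with U the sigmoid activation. X is K x N, a is L x N, b has length L."""
    return sigmoid(X @ a.T + b)

rng = np.random.default_rng(1)
K, N, L = 100, 8, 80                     # samples, features, hidden nodes (illustrative)
X = rng.standard_normal((K, N))          # input training data
a = rng.standard_normal((L, N))          # random hidden weights
b = rng.standard_normal(L)               # random biases

H = hidden_layer_matrix(X, a, b)         # K x L; the SFP then seeks omega with
                                         # ||omega||_1 <= xi and H @ omega = targets
```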
The binary cross-entropy loss function, used together with the sigmoid activation function for binary classification, calculates the loss of an example by computing the following average:
where \(\hat{y}_{j}\) is the jth scalar value in the model output, \(y_{j}\) is the corresponding target value, and J is the number of scalar values in the model output.
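A direct implementation of this average is straightforward; the clipping constant below is an implementation detail added to avoid evaluating \(\log 0\).

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average binary cross-entropy over the J scalar outputs:
    -(1/J) * sum_j [ y_j log(yhat_j) + (1 - y_j) log(1 - yhat_j) ]."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # guard against log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.1]))  # ≈ 0.1054
```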
The performance of a classifier can be evaluated by precision and recall. The recall (true positive rate) is the percentage of positive observations that are predicted correctly, while the precision is the percentage of positive predictions that are correct. The precision, recall, accuracy, and F1-score are calculated using the following standard criteria [22]:

(1) Precision = \(\frac{{\mathrm{TP}}}{{\mathrm{TP}+\mathrm{FP}}}\times 100\%\);
(2) Recall = \(\frac{{\mathrm{TP}}}{{\mathrm{TP}+\mathrm{FN}}}\times 100\%\);
(3) Accuracy = \(\frac{{\mathrm{TP}+\mathrm{TN}}}{{\mathrm{TP}+\mathrm{FP}+\mathrm{TN}+\mathrm{FN}}}\times 100\%\);
(4) F1-score = \(\frac{2\times \text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}\),
where TP := True Positive, TN := True Negative, FP := False Positive, and FN := False Negative are the entries of the confusion matrix of original versus predicted classes.
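The four criteria translate directly into code; the sketch below returns fractions (multiply by 100 for percentages), and the confusion-matrix counts in the usage example are illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard criteria (1)-(4) computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

# Illustrative counts: precision = recall = 50/55, accuracy = 90/100, f1 = 50/55
m = classification_metrics(tp=50, tn=40, fp=5, fn=5)
```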
Next, we consider the following two datasets:
Dataset 1
UCI Machine Learning Heart Disease dataset [20]. This dataset contains 303 records with 14 attributes: Age, Gender, CP, Trestbps, Chol, Fbs, Restecg, Thalach, Exang, Oldpeak, Slope, Ca, Thal, and Num (the predicted attribute). The dataset consists of 138 normal instances versus 165 abnormal instances.
Dataset 2
PIMA Indians diabetes dataset [1]. The dataset contains 768 female patients, of whom 500 were nondiabetic and 268 were diabetic. This dataset contains 9 attributes: Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, Age, and Outcome (the predicted attribute).
In particular, we apply our algorithms to optimize the weight parameter in training data for machine learning by using 5-fold cross-validation [25] in the extreme learning machine (ELM).
For Dataset 1, we start the computation by setting the activation function as a sigmoid, hidden nodes \(L = 80\), regularization parameter \(\xi =10\), \(x_{0}={\mathbf{{1}}}:=(\underbrace{1,1,\ldots ,1}_{N})\in \mathbb{R}^{N}\), \(x_{1}={\mathbf{{0}}}:=(\underbrace{0,0,\ldots ,0}_{N})\in \mathbb{R}^{N}\), \(\tau _{1}=1\), \(s_{n}=0\), \(\beta _{n}=\frac{1}{(n+1)^{10}}\), and \(\alpha _{n}=\frac{n+2}{2n+1}\). The stopping criterion is 300 iterations. We compare the performance of Algorithms 1 and 2 with different parameters β for Dataset 1, as seen in Tables 1 and 2.
From Table 1, we see that as β increases from 0.1 to 0.9, the training loss and test loss decrease; thus, \(\beta =0.9\) performs best for Algorithm 1.
From Table 2, we see that the training loss and test loss increase as β increases; thus, \(\beta =0.1\) performs best for Algorithm 2. We also compare the performance of Algorithms 1 and 2 with different parameters μ for Dataset 1, as seen in Tables 3 and 4.
From Tables 3 and 4, we see that the training loss and test loss decrease as μ increases, which shows that \(\mu =0.9\) gives the best performance of Algorithms 1 and 2 for Dataset 1.
Next, we compare the performance of our algorithms with Tang Algorithm [43] and Suantai et al. Algorithm [39] for Dataset 1. For Algorithms 1 and 2, we set \(\tau _{1}=1\), \(\mu =0.9\), \(s_{n}=0\), and \(\beta _{n}=\frac{1}{(n+1)^{10}}\). Moreover, we set \(\beta =0.9\) and \(\beta =0.1\) for Algorithms 1 and 2, respectively. For Tang Algorithm [43], we set \(\alpha =0.6\), \(\rho _{n}=3.5\), and \(\epsilon _{n}=\frac{1}{n+1}\). For Suantai et al. Algorithm [39], we set \(\alpha _{n}=\frac{1}{n+1}\), \(\beta _{n}=\frac{n-1}{2n}\), \(u_{n}={\mathbf{{1}}}\in \mathbb{R}^{N}\), and \(\lambda _{n}=\frac{1}{\|T\|^{2}}\).
The comparison of all algorithms is presented in Table 5.
From Table 5, we observe that our Algorithms 1 and 2 require fewer iterations than Tang Algorithm [43] and Suantai et al. Algorithm [39] while achieving the same precision, recall, F1-score, and accuracy. This shows that our algorithms have the highest probability of correctly classifying heart disease compared with the other algorithms.
Next, we present graphs of the accuracy and loss on the training and testing data to check for overfitting, showing that Algorithms 1 and 2 do not overfit on the training set of Dataset 1.
From Figs. 10 and 11, we see that our Algorithms 1 and 2 have suitably learned the training dataset for Dataset 1.
For Dataset 2, we present the comparison of our algorithms with Tang Algorithm [43] and Suantai et al. Algorithm [39]. We start the computation by setting the activation function as a sigmoid and hidden nodes \(L = 160\). For Algorithms 1 and 2, we set \(\tau _{1}=1\), \(\mu =0.9\), \(s_{n}=0\), and \(\beta _{n}=\frac{1}{(n+1)^{10}}\). Moreover, we set \(\beta =0.9\) and \(\beta =0.1\) for Algorithms 1 and 2, respectively. For Tang Algorithm [43], we set \(\alpha =0.6\), \(\rho _{n}=3.5\), and \(\epsilon _{n}=\frac{1}{(n+1)^{10}}\). For Suantai et al. Algorithm [39], we set \(\alpha _{n}=\frac{1}{n+1}\), \(\beta _{n}=\frac{n-1}{2n}\), \(u_{n}={\mathbf{{1}}}\in \mathbb{R}^{N}\), and \(\lambda _{n}=\frac{1}{\|T\|^{2}}\).
The comparison of all algorithms is presented in Table 6.
From Table 6, we see that Algorithms 1 and 2 achieve the best precision, recall, F1-score, and accuracy for Dataset 2. This means that our algorithms have the highest probability of correctly classifying the PIMA Indians diabetes dataset (Dataset 2) compared with Tang Algorithm [43] and Suantai et al. Algorithm [39].
Next, we present graphs of the accuracy and loss on the training and testing data to check for overfitting, showing that Algorithms 1 and 2 do not overfit on the training set of Dataset 2.
From Figs. 12 and 13, we see that Algorithms 1 and 2 have suitably learned the training dataset for Dataset 2.
5 Conclusions
In this paper, we have proposed two inertial self-adaptive algorithms to solve the split common null-point problem for two set-valued mappings in Banach spaces. The step sizes used in our proposed algorithms are adaptively updated without prior knowledge of the operator norm of the bounded linear operator. We have proved weak-convergence theorems for the proposed algorithms under suitable conditions in p-uniformly convex, real Banach spaces that are also uniformly smooth. Finally, we have performed experiments to numerically solve some problems in science and engineering, such as the split feasibility problem, the split minimization problem, signal recovery, and data classification, and have compared our methods with some existing methods to demonstrate their implementability and efficiency.
Availability of data and materials
Not applicable.
References
https://www.kaggle.com/datasets/whenamancodes/predictdiabities
Agarwal, R.P., O’Regan, D., Sahu, D.R.: Fixed Point Theory for LipschitzianType Mappings with Applications. Springer, Berlin (2009)
Alber, Y., Ryazantseva, I.: Nonlinear IllPosed Problems of Monotone Type. Springer, Dordrecht (2006)
Alber, Y.I.: Metric and generalized projection operators in Banach spaces: properties and applications. In: Kartsatos, A.G. (ed.) Theory and Applications of Nonlinear Operator of Accretive and Monotone Type, pp. 15–50. Dekker, New York (1996)
Alofi, A.S., Alsulami, S.M., Takahashi, W.: Strongly convergent iterative method for the split common null point problem in Banach spaces. J. Nonlinear Convex Anal. 17, 311–324 (2016)
Alvarez, F.: Weak convergence of a relaxed and inertial hybrid projectionproximal point algorithm for maximal monotone operators in Hilbert spaces. SIAM J. Optim. 14, 773–782 (2004)
Alvarez, F., Attouch, H.: An inertial proximal method for monotone operators via discretization of a nonlinear oscillator with damping. SetValued Var. Anal. 9, 3–11 (2001)
Anh, P.K., Thong, D.V., Dung, V.T.: A strongly convergent Manntype inertial algorithm for solving split variational inclusion problems. Optim. Eng. 22, 159–185 (2021)
Aoyama, K., Kohsaka, F., Takahashi, W.: Three generalizations of firmly nonexpansive mappings: their relations and continuity properties. J. Nonlinear Convex Anal. 10, 131–147 (2009)
Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)
Butnariu, D., Resmerita, E.: Bregman distances, totally convex functions and a method for solving operator equations in Banach spaces. Abstr. Appl. Anal. 2006, Article ID 084919 (2006)
Byrne, C.: Iterative oblique projection onto convex sets and the split feasibility problem. Inverse Probl. 18, 441–453 (2002)
Byrne, C., Censor, Y., Gibali, A., Reich, S.: The split common null point problem. J. Nonlinear Convex Anal. 13, 759–775 (2012)
Censor, Y., Bortfeld, T., Martin, B., Trofimov, A.: A unified approach for inversion problems in intensity modulated radiation therapy. Phys. Med. Biol. 51, 2353–2365 (2006)
Censor, Y., Elfving, T.: A multiprojection algorithm using Bregman projections in product space. Numer. Algorithms 8, 221–239 (1994)
Chen, H.Y.: Weak and strong convergence of inertial algorithms for solving split common fixed point problems. J. Inequal. Appl. 2021, 26 (2021)
Chuang, C.S.: Hybrid inertial proximal algorithm for the split variational inclusion problem in Hilbert spaces with applications. Optimization 66, 777–792 (2017)
Cioranescu, I.: Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems. Kluwer Academic, Dordrecht (1990)
Combettes, P.L.: The convex feasibility problem in image recovery. Adv. Imaging Electron Phys. 95, 155–270 (1996)
Dua, D., Graff, C.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine (2019). http://archive.ics.uci.edu/ml
Duan, P., Zhang, Y., Bu, Q.: New inertial proximal gradient methods for unconstrained convex optimization problems. J. Inequal. Appl. 2020, 255 (2020)
Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, vol. 10, 9781. Kaufman, Waltham (2012)
Hanner, O.: On the uniform convexity of \(L_{p}\) and \(l_{p}\). Ark. Mat. 3, 239–244 (1956)
Kesornprom, S., Cholamjiak, K.: Proximal type algorithms involving linesearch and inertial technique for split variational inclusion problem in Hilbert spaces with applications. Optimization 68, 2369–2395 (2019)
Kumari, V.A., Chitra, R.: Classification of diabetes disease using support vector machine. Int. J. Eng. Res. Appl. 3, 1797–1801 (2013)
Kuo, L.W., Sahu, D.R.: Bregman distance and strong convergence of proximaltype algorithms. Abstr. Appl. Anal. 2013, Article ID 590519 (2013)
Kutyniok, G.: Theory and applications of compressed sensing. GAMMMitt. 36, 79–101 (2013)
López, G., MartinMarquez, V., Wang, F., Xu, H.K.: Solving the split feasibility problem without prior knowledge of matrix norms. Inverse Probl. 28, 085004 (2012)
Moudafi, M.: Split monotone variational inclusions. J. Optim. Theory Appl. 150, 275–283 (2011)
Ogbuisi, F.U., Shehu, Y., Yao, J.C.: Convergence analysis of new inertial method for the split common null point problem. Optimization 71, 3767–3795 (2022)
Reich, S.: A weak convergence theorem for the alternating method with Bregman distances. In: Theory and Applications of Nonlinear Operators of Accretive and Monotone Type, pp. 313–318. Dekker, New York (1996)
Reich, S., Tuyen, T.M., Sunthrayuth, P., Cholamjiak, P.: Two new inertial algorithms for solving variational inequalities in reflexive Banach spaces. Numer. Funct. Anal. Optim. 42, 1954–1984 (2021)
Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pac. J. Math. 33, 209–216 (1970)
Schöpfer, F.: Iterative regularization method for the solution of the split feasibility problem in Banach spaces. Ph.D. thesis, Saarbrücken (2007)
Schöpfer, F., Louis, A.K., Schuster, T.: Nonlinear iterative methods for linear illposed problems in Banach spaces. Inverse Probl. 22, 311–329 (2006)
Schöpfer, F., Schuster, T., Louis, A.K.: An iterative regularization method for the solution of the split feasibility problem in Banach spaces. Inverse Probl. 24, 055008 (2008)
Shehu, Y., Iyiola, O.S., Ogbuisi, F.U.: Iterative method with inertial terms for nonexpansive mappings: applications to compressed sensing. Numer. Algorithms 83, 1321–1347 (2020)
Suantai, S., Pholasa, N., Cholamjiak, P.: Relaxed CQ algorithms involving the inertial technique for multiplesets split feasibility problems. Rev. R. Acad. Cienc. Exactas Fís. Nat., Ser. A Mat. 113, 1081–1099 (2019)
Suantai, S., Shehu, Y., Cholamjiak, P.: Nonlinear iterative methods for solving the split common null point problem in Banach spaces. Optim. Methods Softw. 34, 853–874 (2019)
Takahashi, W.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)
Tan, B., Sunthrayuth, P., Cholamjiak, P., Cho, Y.J.: Modified inertial extragradient methods for finding minimumnorm solution of the variational inequality problem with applications to optimal control problem. Int. J. Comput. Math. 100, 525–545 (2023)
Tan, K.K., Xu, H.K.: Approximating fixed points of nonexpansive mappings by the Ishikawa iteration process. J. Math. Anal. Appl. 178, 301–308 (1993)
Tang, Y.: New inertial algorithm for solving split common null point problem in Banach spaces. J. Inequal. Appl. 2019, 17 (2019)
Tang, Y., Sunthrayuth, P.: An iterative algorithm with inertial technique for solving the split common null point problem in Banach spaces. AsianEur. J. Math. 15, 2250120 (2022)
Tian, M., Jiang, B.N.: Inertial hybrid algorithm for variational inequality problems in Hilbert spaces. J. Inequal. Appl. 2020, 12 (2020)
Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
Tropp, J.A.: A mathematical introduction to compressive sensing. Bull. Am. Math. Soc. 54, 151–165 (2017)
Tuyen, T.M., Cholamjiak, P., Sunthrayuth, P.: A new selfadaptive method for the multiplesets split common null point problem in Banach spaces. Vietnam J. Math. (2022). https://doi.org/10.1007/s10013-022-00574-3
Xu, H.K.: Inequalities in Banach spaces with applications. Nonlinear Anal., Theory Methods Appl. 16, 1127–1138 (1991)
Xu, Z.B., Roach, G.F.: Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces. J. Math. Anal. Appl. 157, 189–210 (1991)
Zhou, Z., Tan, B., Li, S.: Inertial algorithms with adaptive stepsizes for split variational inclusion problems and their applications to signal recovery problem. Math. Methods Appl. Sci. (2023). https://doi.org/10.1002/mma.9436
Acknowledgements
The authors wish to thank the anonymous referees for their valuable comments and valuable suggestions that have led to considerable improvement of this paper.
Funding
This research was supported by The Science, Research and Innovation Promotion Funding (TSRI) (Grant no. FRB660012/0168). This research block grant was managed under Rajamangala University of Technology Thanyaburi (FRB66E0616I.5).
Author information
Authors and Affiliations
Contributions
Conceptualization: P.S.; Writing - original draft: R.P., P.S.; Formal analysis: E.K., S.K.; Investigation: R.P., P.S.; Software: S.K., R.P.; Review and editing: P.S., R.P.; Project administration: P.S.; Funding: R.P., E.T. All authors have read and approved the final version of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Promkam, R., Sunthrayuth, P., Kesornprom, S. et al. New inertial self-adaptive algorithms for the split common null-point problem: application to data classifications. J Inequal Appl 2023, 136 (2023). https://doi.org/10.1186/s13660-023-03049-2
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13660-023-03049-2