Simultaneous and semi-alternating projection algorithms for solving split equality problems

In this article, we first introduce two simultaneous projection algorithms for solving the split equality problem by using a new choice of the stepsize, and then propose two semi-alternating projection algorithms. The weak convergence of the proposed algorithms is analyzed under standard conditions. As applications, we extend the results to solve the split feasibility problem. Finally, a numerical example is presented to illustrate the efficiency and advantage of the proposed algorithms.


Introduction
Let H 1 , H 2 and H 3 be real Hilbert spaces, let C ⊆ H 1 and Q ⊆ H 2 be two nonempty closed convex sets, and let A : H 1 → H 3 and B : H 2 → H 3 be two bounded linear operators.
In this article, we consider the classical split equality problem (SEP), which was first introduced by Moudafi [1]. The SEP can mathematically be formulated as follows: Find x ∈ C, y ∈ Q such that Ax = By. (1)
Throughout this paper, we assume that SEP (1) is consistent and denote by Γ = {(x, y) ∈ C × Q : Ax = By} its solution set; Γ is then closed, convex and nonempty. The split equality problem (1) is actually an optimization problem with weak coupling in the constraint (see [1] for details), and it covers many situations of interest, for instance, in domain decomposition for PDEs, game theory and intensity-modulated radiation therapy (IMRT). In domain decomposition for PDEs, it corresponds to the variational form of a PDE on a domain decomposed into two non-overlapping subdomains with a common interface (see, e.g., [2]). In decision sciences, it allows one to consider agents who interact only via some components of their decision variables (see, e.g., [3]). In IMRT, it amounts to envisaging a weak coupling between the vector of doses absorbed in all voxels and that of the radiation intensity (see [4] for further details). Attouch [5] pointed out further applications of the SEP in optimal control theory, surface energy and potential games, whose variational forms can be seen as SEPs.
Next we present an example in which a separable optimization problem can be rewritten as a split equality problem: minimize f(x) + g(y) subject to Ax = By, (2) with x ∈ R N and y ∈ R M , where A ∈ R J×N and B ∈ R J×M . Assume that f and g are convex and that the solution set of problem (2) is nonempty.
Set C = argmin{f (x) | x ∈ R N } and Q = argmin{g(y) | y ∈ R M }. Then the optimization problem (2) is equivalent to the following split equality problem: Find x ∈ C, y ∈ Q such that Ax = By. (3)
A great deal of literature on algorithms for solving the SEP has been published, most of which concerns projection methods [1][2][3][6][7][8][9][10][11]. Based on the classical projection gradient algorithm, Byrne and Moudafi [12] introduced the following algorithm, which is also called the simultaneous iterative method [13]:
x k+1 = P C (x k − β k A * (Ax k − By k )),
y k+1 = P Q (y k + β k B * (Ax k − By k )), (4)
where β k ∈ (ε, 2/(λ A + λ B ) − ε), and λ A and λ B are the largest eigenvalues of A * A and B * B (equivalently, the squared operator norms of A and B), respectively. To determine the stepsize β k , one first needs to calculate (or estimate) the operator norms of A and B, which in general is difficult or even impossible. On the other hand, even if the norms of A and B are known, algorithm (4) with a fixed stepsize may be slow.
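The fixed-stepsize iteration (4) can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation; the choices of C (nonnegative orthant) and Q (a box) and their closed-form projectors are illustrative assumptions.

```python
import numpy as np

def simultaneous_projection(A, B, proj_C, proj_Q, x, y, n_iter=500):
    """Simultaneous iterative method (4) with a fixed stepsize.

    The stepsize must lie in (0, 2/(lam_A + lam_B)), where lam_A and
    lam_B are the largest eigenvalues of A^T A and B^T B.
    """
    lam_A = np.linalg.norm(A, 2) ** 2   # largest eigenvalue of A^T A
    lam_B = np.linalg.norm(B, 2) ** 2   # largest eigenvalue of B^T B
    beta = 1.0 / (lam_A + lam_B)        # safely inside (0, 2/(lam_A + lam_B))
    for _ in range(n_iter):
        r = A @ x - B @ y               # residual Ax^k - By^k
        x = proj_C(x - beta * A.T @ r)
        y = proj_Q(y + beta * B.T @ r)
    return x, y

# Illustrative instance: C = nonnegative orthant, Q = box [-1, 1]^M.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
B = rng.standard_normal((5, 3))
x, y = simultaneous_projection(A, B,
                               lambda u: np.maximum(u, 0.0),
                               lambda v: np.clip(v, -1.0, 1.0),
                               rng.standard_normal(4), rng.standard_normal(3),
                               n_iter=2000)
print(np.linalg.norm(A @ x - B @ y))
```

Since (0, 0) is feasible for this instance, the residual ‖Ax − By‖ decreases toward zero; the slow decay with a fixed stepsize is exactly the behavior the adaptive stepsizes below are designed to improve.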
In order to deal with this, the authors [9] introduced a self-adaptive projection algorithm, in which the stepsize is computed by using an Armijo search.
Define the function F : H 1 × H 2 → H 1 as in [9]; the self-adaptive projection algorithm of [9] then reads as follows.
In fact, Algorithm 1.1 can be seen as an extension of the classical extragradient method first proposed by Korpelevich [14]. Notice that, in Algorithm 1.1, the stepsize of the prediction step (5) and that of the correction step (7) are equal, so these two steps appear to be 'symmetric'.
Recently, Chuang and Du [15] proposed the following projection algorithm (called the hybrid projected Landweber algorithm). Algorithm 1.2 Given constants σ > 0, α ∈ (0, 1) and θ ∈ (0, 1), let x 0 ∈ H 1 and y 0 ∈ H 2 be taken arbitrarily. For k = 0, 1, 2, . . . , compute the prediction step, where β k is chosen via (6) and (8), and then compute the next iterates x k+1 and y k+1 . Note that Algorithm 1.2 with ρ k ≡ 1 in (9) can be seen as a special case of Tseng's method [8,16]. The projections in the second step of Tseng's method are made onto two nonempty closed convex sets X ⊆ H 1 and Y ⊆ H 2 rather than onto C and Q. X and Y can be any sets whose intersections with C and Q, respectively, are nonempty, and they may be taken to have simple structures so that the projections onto them are easy to calculate.
Chuang and Du [15] proved the convergence of Algorithm 1.2 and also presented a convergence rate property of Algorithm 1.2, where (x * , y * ) ∈ Γ. The stepsize β k in Algorithms 1.1 and 1.2 is obtained through the Armijo search (6). In general, the computational cost of such a self-adaptive algorithm is large, since one may need to evaluate (5) several times to obtain the stepsize β k .
To overcome this difficulty, the authors of [17] introduced a projection algorithm whose stepsize does not depend on the operator norms of A and B and can be computed directly instead of via an Armijo search. Algorithm 1.3 Choose initial guesses x 0 ∈ H 1 , y 0 ∈ H 2 arbitrarily. Assume that the kth iterate x k ∈ C, y k ∈ Q has been constructed and Ax k − By k ≠ 0; then we calculate the (k + 1)th iterate (x k+1 , y k+1 ) via the formula (12), where the stepsize β k is chosen in such a way that (13) holds with 0 < σ k < 1. If Ax k − By k = 0, then (x k+1 , y k+1 ) = (x k , y k ) is a solution of SEP (1) and the iterative process stops; otherwise, we set k := k + 1 and return to (12) to evaluate the next iterate.
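Stepsizes of this kind use only quantities already computed from the current residual. The exact formula below is an assumption for illustration (one common norm-free choice of this type, not necessarily the precise rule (13)):

```python
import numpy as np

def norm_free_stepsize(A, B, x, y, sigma=0.65):
    """A norm-free stepsize in the spirit of Algorithm 1.3.

    Illustrative assumption: beta_k is built from the current residual
    r = Ax - By only, avoiding the operator norms of A and B entirely.
    """
    r = A @ x - B @ y
    denom = np.linalg.norm(A.T @ r) ** 2 + np.linalg.norm(B.T @ r) ** 2
    if denom == 0.0:            # Ax^k = By^k: the current point solves the SEP
        return 0.0
    return sigma * np.linalg.norm(r) ** 2 / denom
```

Because the denominator is recomputed at every iterate, no spectral information about A or B is needed, in contrast to the fixed-stepsize rule in (4).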
Note that the choice (13) of the stepsize β k is independent of the norms of A and B. Polyak [18,19] first introduced inertial extrapolation algorithms, which have been widely studied as an acceleration technique. The authors of [20] made an inertial modification of Algorithm 1.3 and introduced the following inertial projection method for the SEP. Algorithm 1.4 Choose initial guesses x 0 , x 1 ∈ H 1 , y 0 , y 1 ∈ H 2 arbitrarily. Compute the iterates via the inertial update, where α k ∈ (0, 1) and the stepsize γ k is chosen in the same way as in (13).
They showed the weak convergence of Algorithm 1.4 under some conditions on the inertial parameter α k .
In fact, Algorithm 1.4 can be seen as a FISTA (fast iterative shrinkage-thresholding algorithm), introduced by Beck and Teboulle [21] to solve linear inverse problems, if we take the inertial parameter α k = (t k − 1)/t k+1 , where t 1 = 1 and t k+1 = (1 + √(1 + 4t k 2 ))/2, k ≥ 1, and choose a constant stepsize β k or choose β k via a backtracking stepsize rule. A shortcoming of the method of Beck and Teboulle is that the convergence of the iterative sequence (x k , y k ) itself could not be proved. Chambolle and Dossal [22] improved the choice of the inertial parameter, taking α k = (k − 1)/(k + a) with a > 2, and proved the convergence of the iterative sequence (x k , y k ).
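The two inertial-parameter rules just described can be generated as follows; a minimal sketch of both sequences:

```python
import math

def fista_inertial_params(n, a=None):
    """Inertial parameters alpha_k for FISTA-type acceleration.

    a is None : Beck-Teboulle rule t_1 = 1,
                t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2,
                alpha_k = (t_k - 1) / t_{k+1}.
    a > 2     : Chambolle-Dossal rule alpha_k = (k - 1) / (k + a),
                for which convergence of the iterates themselves holds.
    """
    if a is not None:
        return [(k - 1) / (k + a) for k in range(1, n + 1)]
    alphas, t = [], 1.0
    for _ in range(n):
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        alphas.append((t - 1.0) / t_next)
        t = t_next
    return alphas
```

Both rules start from alpha_1 = 0 and increase toward 1; the Chambolle-Dossal variant grows more slowly, which is what makes the convergence proof for the iterates possible.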
In this paper, inspired by the work in [17,23,24], we introduce two simultaneous projection algorithms by improving the stepsizes β k and ρ k of the second step (7) and (9) in Algorithms 1.1 and 1.2, respectively. We also present two alternating projection algorithms, in which we take an alternating technique in the first step.
The structure of the paper is as follows. In the next section, we present some concepts and lemmas which will be used in the main results. In Section 3, two classes of projection algorithms are provided and their weak convergence is analyzed. In Section 4, we extend the results to the split feasibility problem. In the final section, some numerical results are provided, which show the advantages of the proposed algorithms.

Preliminaries
Let H be a real Hilbert space with the inner product ⟨·, ·⟩ and the induced norm ‖·‖, and let D be a nonempty, closed and convex subset of H. We write x k ⇀ x to indicate that the sequence {x k } ∞ k=0 converges weakly to x and x k → x to indicate that it converges strongly to x. Given a sequence {x k } ∞ k=0 , denote by ω w (x k ) its weak ω-limit set, that is, the set of all x such that some subsequence of {x k } ∞ k=0 converges weakly to x. An important tool of our work is the projection. Recall that the projection from H onto a closed convex subset C of H, denoted by P C , is defined in such a way that, for each x ∈ H, P C (x) is the unique point in C closest to x. The following two lemmas are useful characterizations of projections. Lemma 2.1 ([25]) Given x ∈ H and z ∈ C, z = P C (x) if and only if ⟨x − z, y − z⟩ ≤ 0, ∀y ∈ C. Lemma 2.2 ([25, 26]) For any x, y ∈ H and z ∈ C, it holds that (i) ‖P C (x) − P C (y)‖ 2 ≤ ⟨P C (x) − P C (y), x − y⟩; (ii) ‖P C (x) − z‖ 2 ≤ ‖x − z‖ 2 − ‖P C (x) − x‖ 2 .
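For sets with simple structure the projection P C has a closed form, and the variational characterization in Lemma 2.1 can be verified numerically. A small sketch, where the choice of C as a box is illustrative:

```python
import numpy as np

def project_box(x, lo, hi):
    """Projection onto the box C = [lo, hi]^n (componentwise clipping)."""
    return np.clip(x, lo, hi)

# Check the characterization of Lemma 2.1:
# z = P_C(x) iff <x - z, y - z> <= 0 for every y in C.
rng = np.random.default_rng(1)
x = rng.standard_normal(4) * 3.0
z = project_box(x, -1.0, 1.0)
for _ in range(100):
    y = rng.uniform(-1.0, 1.0, size=4)   # arbitrary point of C
    assert (x - z) @ (y - z) <= 1e-12
```

Coordinate by coordinate, either x lies inside the box (the factor x − z vanishes) or x is clipped to a face (and y − z points back into the box), so every inner product is nonpositive, exactly as Lemma 2.1 requires.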

Definition 2.1
The normal cone of C at v ∈ C, denoted by N C (v), is defined as N C (v) = {w ∈ H : ⟨w, u − v⟩ ≤ 0, ∀u ∈ C}.

Definition 2.2 Let T : H → 2 H be a set-valued operator. T is called monotone if ⟨u − v, x − y⟩ ≥ 0 for all x, y ∈ H, u ∈ T(x) and v ∈ T(y). A monotone operator T is maximal if its graph is not properly contained in the graph of any other monotone operator.

Lemma 2.3 ([26]) Let D be a nonempty, closed and convex subset of a Hilbert space H, and let (x k ) be a bounded sequence which satisfies the following properties: (i) every weak limit point of (x k ) lies in D; (ii) lim k→∞ ‖x k − x‖ exists for every x ∈ D. Then (x k ) converges weakly to a point in D.

Main results
In this section, we present two classes of projection algorithms and establish their weak convergence under standard conditions.

Simultaneous projection algorithms
Set S = C × Q and, for z = (x, y) ∈ H 1 × H 2 , define K(z) = Ax − By, and let K * be the adjoint operator of K. Then the original problem (1) can be reformulated as: find z ∈ S such that K(z) = 0. (15) Note that if the solution set of (15) is nonempty, (15) is equivalent to the following minimization problem: min z∈S (1/2)‖K(z)‖ 2 , (16) which is a standard (and simple) problem from the convex optimization point of view. There are many methods for solving the minimization problem (16), such as the classical projection gradient method. Algorithm (4) (and also Algorithm 1.3) is the exact projection gradient method applied to (16).
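In this product-space formulation the operator K and its adjoint act blockwise, and the gradient of the objective in (16) is K*(K(z)). A small NumPy sketch of these blocks, with a numerical check of the adjoint identity ⟨Kz, r⟩ = ⟨z, K*r⟩ (the matrices and vectors are illustrative):

```python
import numpy as np

def K_apply(A, B, x, y):
    """K(z) = Ax - By for z = (x, y) in the product space H1 x H2."""
    return A @ x - B @ y

def K_adjoint(A, B, r):
    """K*(r) = (A* r, -B* r); the adjoint acts blockwise."""
    return A.T @ r, -B.T @ r

# Verify the adjoint identity <Kz, r> = <z, K*r> on random data.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4))
B = rng.standard_normal((5, 3))
x = rng.standard_normal(4)
y = rng.standard_normal(3)
r = rng.standard_normal(5)
lhs = K_apply(A, B, x, y) @ r
gx, gy = K_adjoint(A, B, r)
rhs = x @ gx + y @ gy
assert abs(lhs - rhs) < 1e-10
```

With these blocks, K*(K(z)) = (A*(Ax − By), −B*(Ax − By)), which is precisely the pair of update directions appearing in algorithm (4).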
For convenience, we call the projection algorithms which use the update forms (19) (or (28)) and (20) (or (29)) Algorithm 3.1(I) and Algorithm 3.1(II), respectively. Remark 3.2 For Algorithm 3.1, we can draw the following conclusions: (i) The only difference between Algorithm 3.1(II) and Algorithm 1.1 is that they use different stepsizes in the definitions of x k+1 and y k+1 . (ii) There are two differences between Algorithm 3.1(I) and Algorithm 1.2. Firstly, the stepsize ρ k in (36) of Algorithm 3.1(I) is larger than that in (10) of Algorithm 1.2. Secondly, there are no projections in the second step (19).

Remark 3.3 By the definition of d k in (21), the projection equation (17) can be rewritten accordingly.
Lemma 3.1 The search rule (18) is well defined; moreover, the bound (34) holds. Proof Obviously, β k ≤ σ k ≤ σ 0 . In the latter case, we know that β k /α must violate inequality (18). On the other hand, combining this with the defining estimate, we get (34).
Lemma 3.2 Let (z k ) and (w k ) be generated by Algorithm 3.1, and let d k and ρ k be given by (21) and (36), respectively. Then the estimate (37) holds. Proof By the Cauchy-Schwarz inequality, we obtain the required bound, and so we get (37).

Lemma 3.3
Let (z k ) and (w k ) be generated by Algorithm 3.1, and let d k be given by (21). Then, for all z * ∈ Γ, the inequality (39) holds.
Proof Take z * ∈ Γ arbitrarily, that is, z * ∈ S and K(z * ) = 0. By setting z = z * in (33) and simplifying, we obtain the desired bound, which implies (39).
Theorem 3.1 Let (z k ) be generated by Algorithm 3.1(I). If Γ is nonempty, then the contraction inequality (41) holds and (z k ) converges weakly to a solution of SEP (1).
By the boundedness of K , we get (44). Let ẑ ∈ ω w (z k ); then there exists a subsequence (z k i ) of (z k ) which converges weakly to ẑ. From (44), it follows that the subsequence (w k i ) also converges weakly to ẑ. We will show that ẑ is a solution of SEP (1). The weak convergence of (K(z k i )) to K(ẑ) and the lower semicontinuity of the squared norm imply that ‖K(ẑ)‖ ≤ lim inf i→∞ ‖K(z k i )‖ = 0, that is, K(ẑ) = 0. By noting that the equality in (17) can be rewritten as an inclusion involving N S , and that the graph of the maximal monotone operator N S is weakly-strongly closed, and by passing to the limit in the last inclusion, we obtain, from (44) and (45), that ẑ ∈ S.
Hence ẑ ∈ Γ. Now we can apply Lemma 2.3 with D := Γ to conclude that the full sequence (z k ) converges weakly to a point in Γ. This completes the proof.
Remark 3.4 By using Remark 3.1, the contraction inequality (41) can be rewritten in terms of x k+1 and y k+1 as (46). It is obvious that the ρ k in (31) is larger than that in (10). Let γ = 1. Comparing (46) and (11), we conclude that Algorithm 3.1(I) has a better contraction property than Algorithm 1.2.
Theorem 3.2 Let (z k ) be generated by Algorithm 3.1(II). Assume that Γ is nonempty. Then (47) holds and, furthermore, (z k ) converges weakly to a solution of SEP (1).
Proof Let z * ∈ Γ, that is, z * ∈ S and K(z * ) = 0. Using Lemma 2.2(ii), we have (48). By setting z = z k+1 II in (33), we get (49). It holds that (50). Substituting (50) into the right-hand side of (49) and using z k − γρ k d(z k , w k ) = z k+1 I , we obtain (51). From (40), we get (52). So, adding (51) and (52) and using the definition of ρ k , we obtain (53). Adding (48) and (53), we obtain (47). Employing arguments similar to those used in the proof of Theorem 3.1, we obtain that the whole sequence (z k ) converges weakly to a solution of SEP (1), which completes the proof.

Remark 3.5 Comparing (41) and (47), we conclude that Algorithm 3.1(II) seems to have a better contraction property than Algorithm 3.1(I), since z k+1 II is closer to z * than z k+1 I when z k is the same.
For convenience, we call the projection algorithms which use update forms (56) and (57) Algorithm 3.2(I) and Algorithm 3.2(II), respectively.
So, from Lemma 2.1 we have

Lemma 3.4 The search rule (55) is well defined. Besides, the bound (62) holds.
Proof Obviously, β k ≤ σ k ≤ σ 0 . In the latter case, we know that β k /α must violate inequality (55), that is, On the other hand, we have So, we get (62).

Lemma 3.5
Let (x k , y k ) and (u k , v k ) be generated by Algorithm 3.2, and let c k , d k and ρ k be given by (58) and (59), respectively. Then we have the following estimate. Proof By the Cauchy-Schwarz inequality, we have (65). It holds that, after rearranging, we obtain (66). From the definitions of F and G and by (66), we get (67). Combining (65) and (67), we complete the proof.

Lemma 3.6
Let (x k , y k ) and (u k , v k ) be generated by Algorithm 3.2, and let c k and d k be given by (58). Then, for all (x * , y * ) ∈ Γ, we have (68). Proof Take (x * , y * ) ∈ Γ arbitrarily, that is, x * ∈ C, y * ∈ Q and Ax * = By * . By setting (x, y) = (x * , y * ) in (61) and using the definitions of F and G, we complete the proof.

Theorem 3.3
Let (x k , y k ) be generated by Algorithm 3.2(I). If Γ is nonempty, then (69) holds and (x k , y k ) converges weakly to a solution of SEP (1).
Proof Adding the above inequalities and using Lemma 3.6, we have (70), which yields (69). Since γ ∈ (0, 2), (70) implies that the sequence ‖x k − x * ‖ 2 + ‖y k − y * ‖ 2 is decreasing and thus converges; moreover, (x k ) and (y k ) are bounded. This implies (71). From the definition of ρ k and Lemmas 3.4 and 3.5, we obtain a lower bound on ρ k , which, together with (71), gives (72) and lim k→∞ ‖Au k − Bv k ‖ = 0.
Let (x̂, ŷ) ∈ ω w (x k , y k ); then there exist subsequences (x k i ) and (y k i ) of (x k ) and (y k ) which converge weakly to x̂ and ŷ, respectively. From (72), it follows that (u k i ) and (v k i ) also converge weakly to x̂ and ŷ, respectively. We will show that (x̂, ŷ) is a solution of SEP (1). The weak convergence of (Ax k i − By k i ) to Ax̂ − Bŷ and the lower semicontinuity of the squared norm imply that ‖Ax̂ − Bŷ‖ ≤ lim inf i→∞ ‖Ax k i − By k i ‖ = 0, that is, Ax̂ = Bŷ. By noting that the two equalities in (54) can be rewritten as inclusions involving N C and N Q , and that the graphs of the maximal monotone operators N C and N Q are weakly-strongly closed, and by passing to the limit in the last inclusions, we obtain, from (72) and (73), that x̂ ∈ C and ŷ ∈ Q.
Hence (x̂, ŷ) ∈ Γ. Now we can apply Lemma 2.3 with D := Γ to conclude that the full sequence (x k , y k ) converges weakly to a point in Γ. This completes the proof.

Remark 3.7
Employing arguments similar to those used in Remark 3.4 and comparing (69) with (2.48) in [15], we conclude that Algorithm 3.2(I) has a better contraction property than the hybrid alternating CQ-algorithm in [15].

Theorem 3.4
Let (x k , y k ) be generated by Algorithm 3.2(II). If Γ is nonempty, then (74) holds and (x k , y k ) converges weakly to a solution of SEP (1).
Proof Let (x * , y * ) ∈ Γ, that is, x * ∈ C, y * ∈ Q and Ax * = By * . Using Lemma 2.2(ii) twice and adding the resulting inequalities, we obtain (75). By setting (x, y) = (x k+1 II , y k+1 II ) in (61), we get (76). It holds that (77), and similarly we get (78). Substituting (77) and (78) into the right-hand side of (76) and using x k − γρ k c k = x k+1 I and y k − γρ k d k = y k+1 I , we obtain (79). From (68), we have (80). So, adding (79) and (80) and using the definition of ρ k , we obtain (81). Adding (75) and (81), we obtain (74). Employing arguments similar to those used in the proof of Theorem 3.3, we obtain that the whole sequence (x k , y k ) converges weakly to a solution of SEP (1), which completes the proof.

Applications
The split feasibility problem (SFP), formulated as follows: find x ∈ C such that Ax ∈ Q, was originally introduced by Censor and Elfving [29]. The SFP can serve as a model for many inverse problems where constraints are imposed on the solutions in the domain of a linear operator as well as in the operator's range. It has a variety of specific applications in the real world, such as medical care, image reconstruction and signal processing (see [30][31][32][33] for details).
In fact, the SEP is equivalent to the SFP. Firstly, observe that, when B * B is invertible, the equality Ax = By in (1) is equivalent to y = (B * B) −1 B * Ax. So, if we define the bounded linear operator L = (B * B) −1 B * A : H 1 → H 2 , then the SEP becomes a special case of the SFP with the operator L (e.g., see [34,35]).
On the other hand, if H 2 = H 3 and B = I, then the split equality problem (1) reduces to the split feasibility problem.
Based on this equivalence, we can construct iterative algorithms for the SEP by using algorithms for the SFP, provided the operator (B * B) −1 is easily computed. We can also extend algorithms for the SEP to the SFP.
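In finite dimensions the reduction can be carried out numerically; a minimal sketch, with the invertibility of B^T B an explicit assumption (and the test matrices purely illustrative):

```python
import numpy as np

def sep_to_sfp_operator(A, B):
    """Return L = (B^T B)^{-1} B^T A, so that y = Lx recovers Ax = By.

    Assumes B has full column rank, i.e. B^T B is invertible. If Ax lies
    in the range of B, then Ax = By holds exactly for y = Lx; when B is
    square and invertible this is always the case.
    """
    return np.linalg.solve(B.T @ B, B.T @ A)

# Square invertible B: Ax = By is recovered exactly with y = Lx.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 2))
B = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)   # well-conditioned, invertible
L = sep_to_sfp_operator(A, B)
x = rng.standard_normal(2)
y = L @ x
print(np.allclose(A @ x, B @ y))   # True
```

Using np.linalg.solve on B^T B rather than forming the explicit inverse keeps the computation stable; in practice this reduction is only attractive when B^T B (or its inverse) has simple structure, as noted above.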
Next, we present an algorithm for the SFP based on Algorithm 3.2.
Using Theorem 3.3, we get the convergence of Algorithm 4.1.

Numerical examples
In this section, we use the numerical example in [8] to demonstrate the efficiency and advantage of Algorithms 3.1 and 3.2 by comparing them with Algorithms 1.1, 1.2 and 1.3. In what follows, we denote the vector with all elements 0 by e 0 and the vector with all elements 1 by e 1 . In the numerical results listed in the tables below, 'Iter.' and 'Sec.' denote the number of iterations and the CPU time in seconds, respectively. For Algorithms 1.1, 1.2, 3.1 and 3.2, 'InIt.' denotes the total number of inner iterations used to find a suitable β k .

Example 5.1 The SEP with
and u ∈ [e 1 , 2e 1 ] are all generated uniformly at random.
We compare Algorithms 1.1, 1.2, 1.3, 3.1, 3.2 and FISTA for different J, N and M, and report the results in Tables 1, 2, 3 and Figure 1. We take the stepsize β k via a backtracking stepsize rule. For a fair comparison, we tried to choose the best parameters through numerical experiments. We take γ = 0.8, θ = 0.99, σ = 50, ρ = 0.1 and α = 0.1 in Algorithms 1.1, 1.2, 3.1 and 3.2, and σ k = 0.65 in Algorithm 1.3, with the stepsize β k chosen accordingly. We take L 0 = 13, η = 2 and a = 7 for FISTA with backtracking (see [21]). For comparison, the same random values are used in each test for all algorithms.
The numerical results are listed in Tables 1, 2, 3 and Figures 1-6, from which we can draw some conclusions: (1) Algorithm 1.2 behaves worst; Algorithm 1.1 is superior to it, while inferior to Algorithms 1.3, 3.1 and 3.2.

Conclusion
In this article we introduce two simultaneous projection algorithms and two semi-alternating projection algorithms to solve the SEP. We present larger stepsizes in (31) and (59) than those in Algorithms 2.1 and 2.2 of [15], which leads to a better contraction property and a faster convergence speed for Algorithms 3.1 and 3.2. The weak convergence of the proposed methods is proved under standard conditions. A numerical experiment illustrates that, except for FISTA, Algorithms 1.3, 3.1 and 3.2 have peak values. It is thus natural to combine our methods with inertial effects, which is one of our future research topics.