# Some algorithms for classes of split feasibility problems involving paramonotone equilibria and convex optimization

## Abstract

In this paper, we first introduce a new algorithm, which involves only one projection per iteration, for solving a split feasibility problem with paramonotone equilibria and unconstrained convex optimization. The strong convergence of the proposed algorithm is established. Second, we revisit this split feasibility problem and replace the unconstrained convex optimization by a constrained convex optimization. We introduce algorithms for two different types of objective function of the constrained convex optimization and prove strong convergence results for the proposed algorithms. Third, we apply our algorithms to finding an equilibrium point with minimal environmental cost for a model in electricity production. Finally, we give some numerical results to illustrate the effectiveness and advantages of the proposed algorithms.

## Introduction and the problem statement

Let $$H_{1}$$ and $$H_{2}$$ be two real Hilbert spaces with inner product $$\langle\cdot,\cdot\rangle$$ and induced norm $$\|\cdot\|$$, and let C and Q be nonempty closed convex subsets of $$H_{1}$$ and $$H_{2}$$, respectively.

Censor and Elfving first introduced the split feasibility problem (SFP for short) in Euclidean space, which is formulated as follows:

$$\text{Find } x^{*}\in C \text{ such that } Ax^{*}\in Q,$$

where $$A:H_{1}\rightarrow H_{2}$$ is a bounded linear operator. The SFP can serve as a model for many inverse problems in which constraints are imposed on the solutions in the domain of the linear operator as well as in its range. It has a variety of specific applications in the real world, such as medical care, image reconstruction and signal processing (see [5, 14–17, 21, 38, 39] for more details).

Let $$f:C\times C \rightarrow\mathbb{R}$$ be a bifunction such that $$f(x,x) = 0$$ for all $$x\in C$$. The equilibrium problem (EP for short)

$$\text{Find } x^{*}\in C \text{ such that } f\bigl(x^{*},y\bigr)\geq0 \text{ for all } y\in C$$

was first introduced by Fan and further studied by Blum and Oettli. The solution set of the EP is denoted by $$\operatorname{Sol}(EP)$$. The EP generalizes many mathematical models, including variational inequality, fixed point, optimization and complementarity problems (see, for instance, [2, 4, 13, 18, 20, 22, 23, 25, 31, 33–36]).
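As a small illustration of how an optimization problem fits the EP format, the sketch below uses the bifunction $$f(x,y)=\varphi(y)-\varphi(x)$$ on $$C=[0,1]$$. The function `phi`, the grid and the helper `solves_ep` are hypothetical choices made only for this example; the check is a brute-force sampling of C, not a proof.

```python
# Hypothetical 1-D illustration: C = [0, 1] and phi(x) = (x - 0.3)^2.
def phi(x):
    return (x - 0.3) ** 2

def f(x, y):
    """Bifunction induced by minimizing phi over C; note f(x, x) = 0."""
    return phi(y) - phi(x)

def solves_ep(x, grid, tol=1e-12):
    """True if f(x, y) >= 0 (up to tol) for every sampled point y of C."""
    return all(f(x, y) >= -tol for y in grid)

grid = [i / 100 for i in range(101)]  # sample points of C = [0, 1]

print(solves_ep(0.3, grid))  # x = 0.3 minimizes phi over C
print(solves_ep(0.5, grid))  # x = 0.5 does not
```

With this choice, a point solves the EP exactly when it minimizes `phi` over C, which is why only `x = 0.3` passes the sampled test.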

Recently, Yen et al. investigated the following split feasibility problem involving paramonotone equilibria and convex optimization (SEO for short):

### Problem 1.1

Find $$x^{*}\in C$$ such that $$f(x^{*},y)\geq0$$ for all $$y\in C$$ and $$g(Ax^{*})\leq g(z)$$ for all $$z\in H_{2}$$,

where g is a proper lower semi-continuous convex function on $$H_{2}$$. Also, they introduced the following algorithm to solve Problem 1.1:

### Algorithm 1.1

For any $$x^{k}\in C$$, take $$\eta^{k}\in\partial_{2}^{\epsilon _{k}}f(x^{k},x^{k})$$ and define

$$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}},$$

where $$\gamma_{k}=\max\{\delta_{k},\|\eta^{k}\|\}$$. Compute

$$y^{k}=P_{C}\bigl(x^{k}-\alpha_{k} \eta^{k}\bigr).$$

Take

$$\mu_{k}:= \textstyle\begin{cases} 0, &\text{if } \nabla h(y^{k})=0,\\ \rho_{k}\frac{h(y^{k})}{ \Vert \nabla h(y^{k}) \Vert ^{2}}, &\text{if } \nabla h(y^{k})\neq0, \end{cases}$$

and compute

$$z^{k}=P_{C}\bigl(y^{k}-\mu_{k}A^{*}(I- \operatorname{prox}_{\lambda g}) \bigl(Ay^{k}\bigr)\bigr).$$

Let

$$x^{k+1}=a_{k}x^{k}+(1-a_{k})z^{k}.$$

In Algorithm 1.1, $$\operatorname{prox}_{\lambda g}$$ denotes the proximal mapping of the convex function g with $$\lambda> 0$$, the function h is defined in Sect. 3, and the parameters $$\{a_{k}\}$$, $$\{\delta_{k}\}$$, $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$ and $$\{\rho_{k}\}$$ are taken as in Algorithm 3.1 (see Sect. 3 below).

Note that Algorithm 1.1 involves two exact projections onto the feasible set C, which limits the applicability of the method, especially when such projections are hard to compute. It is well known that only in a few specific instances the projection onto a convex set has an explicit formula. When the feasible set C is a general closed convex set, we must solve a nontrivial quadratic problem in order to compute the projection onto C.
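For reference, three of the standard sets for which the projection does have a closed form (a box, a Euclidean ball and a half-space) can be sketched as follows. The function names are hypothetical and chosen only for this illustration; for a general closed convex set no such formula is available.

```python
import numpy as np

def project_box(x, lo, hi):
    """Projection onto the box {y : lo <= y <= hi} (componentwise clipping)."""
    return np.clip(x, lo, hi)

def project_ball(x, center, radius):
    """Projection onto the closed ball {y : ||y - center|| <= radius}."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + radius * d / n

def project_halfspace(x, a, b):
    """Projection onto the half-space {y : <a, y> <= b}, assuming a != 0."""
    violation = a @ x - b
    return x if violation <= 0 else x - (violation / (a @ a)) * a

print(project_box(np.array([-1.0, 0.5, 2.0]), 0.0, 1.0))
print(project_ball(np.array([3.0, 4.0]), np.zeros(2), 1.0))
print(project_halfspace(np.array([2.0, 2.0]), np.array([1.0, 1.0]), 2.0))
```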

In this paper, by expanding the domain of the function f, we introduce a new algorithm which involves only one projection onto C per iteration. We also revisit Problem 1.1 and replace the unconstrained convex optimization by a constrained convex optimization. Further, we introduce two iterative algorithms to solve the new model and prove strong convergence results for the proposed algorithms.

The paper is organized as follows: Sect. 2 collects some definitions and lemmas needed for the main results. In Sect. 3, we introduce a new algorithm, which involves only one projection in each iteration. In Sect. 4, we introduce two algorithms for the constrained model and study their convergence. In Sect. 5, we provide a practical model for an electricity market and some computational results for the model.

## Preliminaries

The following definitions and lemmas are useful for the validity and convergence of the algorithms.

### Definition 2.1

Let H be a Hilbert space, $$T:H\rightarrow H$$ be a mapping and let $$K\subseteq H$$.

1. (i)

T is said to be nonexpansive if

$$\Vert Tx-Ty \Vert \leq \Vert x-y \Vert$$

for all $$x,y\in H$$.

2. (ii)

T is said to be firmly nonexpansive if

$$\Vert Tx-Ty \Vert ^{2}\leq\langle x-y,Tx-Ty\rangle$$

for all $$x,y\in H$$, or

$$0\leq\bigl\langle Tx-Ty,(I-T)x-(I-T)y\bigr\rangle$$

for all $$x,y\in H$$.

3. (iii)

T is said to be Lipschitz continuous with Lipschitz constant L if

$$\Vert Tx-Ty \Vert \leq L \Vert x-y \Vert$$

for all $$x,y\in H$$.

4. (iv)

T is said to be α-averaged if

$$T=(1-\alpha)I+\alpha S,$$

where $$\alpha\in(0,1)$$ and $$S:H\rightarrow H$$ is a nonexpansive mapping.

### Lemma 2.1

([1, Proposition 4.4])

Let H be a Hilbert space and $$T: H\rightarrow H$$ be a mapping. Then the following are equivalent:

1. (i)

T is firmly nonexpansive;

2. (ii)

$$I-T$$ is firmly nonexpansive.

### Lemma 2.2

([3, 9])

The composition of finitely many averaged mappings is averaged. In particular, if $$T_{1}$$ is $$\alpha_{1}$$-averaged and $$T_{2}$$ is $$\alpha_{2}$$-averaged, where $$\alpha_{1},\alpha_{2}\in(0,1)$$, then the composition $$T_{1}T_{2}$$ is α-averaged, where $$\alpha=\alpha _{1}+\alpha_{2}-\alpha_{1}\alpha_{2}$$.

It is easy to show that firmly nonexpansive mappings are $$\frac{1}{2}$$-averaged, and averaged mappings are nonexpansive.

For a mapping $$T:H\rightarrow H$$, $$\operatorname{Fix}(T)$$ denotes the set of fixed points of T, i.e.,

$$\operatorname{Fix}(T):=\{x\in H: Tx=x\}.$$

It is well known that every nonexpansive operator $$T:H\rightarrow H$$ satisfies the following inequality:

$$\bigl\langle (x-Tx)-(y-Ty),Ty-Tx\bigr\rangle \leq\frac{1}{2} \bigl\Vert (x-Tx)-(y-Ty) \bigr\Vert ^{2}$$

for all $$x,y\in H$$ and so

$$\langle x-Tx,y-Tx\rangle\leq\frac{1}{2} \Vert x-Tx \Vert ^{2}$$

for all $$x\in H$$ and $$y\in\operatorname{Fix}(T)$$ (see, for example, [11, Theorem 3], [12, Theorem 1]).

Let H be a real Hilbert space and K be a nonempty closed convex subset of H. For each point $$x\in{H}$$, there exists a unique nearest point in K, denoted by $$P_{K}(x)$$, such that

$$\bigl\Vert x-P_{K} ( x ) \bigr\Vert \leq \Vert x-y \Vert$$

for all $$y\in K$$. The mapping $$P_{K}:H\rightarrow K$$ is called the metric projection of H onto K. It is well known that $$P_{K}$$ is a nonexpansive mapping of H onto K and even a firmly nonexpansive mapping. So, $$P_{K}$$ is also $$\frac{1}{2}$$-averaged, which is captured in the following lemma:

### Lemma 2.3

For any $$x,y\in H$$ and $$z\in K$$, the following hold:

1. (i)

$$\Vert P_{K}(x)-P_{K}(y)\Vert^{2}\leq\Vert x-y \Vert^{2}$$;

2. (ii)

$$\Vert P_{K}(x)-z\Vert^{2}\leq\Vert x-z\Vert ^{2}-\Vert P_{K}(x)-x\Vert^{2}$$.
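The projection inequalities can be checked numerically. The sketch below samples random points and tests the firm nonexpansiveness of $$P_{K}$$, i.e. $$\Vert P_{K}(x)-P_{K}(y)\Vert^{2}\leq\langle x-y,P_{K}(x)-P_{K}(y)\rangle$$, together with inequality (ii) of Lemma 2.3. The set $$K=[-1,1]^{3}$$, the sampling scheme and the variable names are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def P(v):
    """Projection onto the hypothetical set K = [-1, 1]^3."""
    return np.clip(v, -1.0, 1.0)

worst_firm = -np.inf   # largest value of ||Px - Py||^2 - <x - y, Px - Py>
worst_ii = -np.inf     # largest value of ||Px - z||^2 + ||Px - x||^2 - ||x - z||^2
for _ in range(1000):
    x = rng.normal(scale=3.0, size=3)
    y = rng.normal(scale=3.0, size=3)
    z = P(rng.normal(scale=3.0, size=3))   # an arbitrary point of K
    d = P(x) - P(y)
    worst_firm = max(worst_firm, d @ d - (x - y) @ d)
    gap = np.sum((P(x) - z) ** 2) + np.sum((P(x) - x) ** 2) - np.sum((x - z) ** 2)
    worst_ii = max(worst_ii, gap)

print(worst_firm, worst_ii)  # both should be nonpositive up to rounding
```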

Some characterizations of the metric projection $$P_{K}$$ are given by the two properties in the following lemma:

### Lemma 2.4

Let $$x\in H$$ and $$z\in K$$. Then $$z=P_{K}(x)$$ if and only if

$$\langle x-z,z-y\rangle\geq0$$

for all $$y\in K$$.

### Lemma 2.5

Let C be a nonempty closed convex subset in a Hilbert space H and $$P_{C}(x)$$ be the metric projection of x onto C. Then we have

1. (i)

$$\langle x-y,P_{C}(x)-P_{C}(y)\rangle\geq\|P _{C}(x)-P_{C}(y)\|^{2}$$ for all $$x,y\in H$$;

2. (ii)

$$\|z^{k}-y^{k}\|\leq\beta_{k}$$.

### Lemma 2.6

Let $$\{v^{k}\}$$ and $$\{\delta_{k}\}$$ be nonnegative sequences of real numbers satisfying $$v^{k+1}\leq v^{k}+\delta_{k}$$ with $$\sum_{k=1}^{\infty}\delta_{k}<+\infty$$. Then the sequence $$\{v^{k}\}$$ is convergent.

### Lemma 2.7

Let H be a real Hilbert space, $$\{a_{k}\}$$ be a sequence of real numbers such that $$0< a< a_{k}< b<1$$ for all $$k\geq1$$ and $$\{v^{k}\}$$, $$\{w^{k}\}$$ be sequences in H such that, for some $$c>0$$,

$$\limsup_{k\rightarrow+\infty} \bigl\Vert v^{k} \bigr\Vert \leq c,\qquad \limsup_{k\rightarrow+\infty} \bigl\Vert w^{k} \bigr\Vert \leq c$$

and

$$\lim_{k\rightarrow+\infty} \bigl\Vert a_{k}v^{k}+ (1-a_{k})w^{k} \bigr\Vert =c.$$

Then $$\lim_{k\rightarrow+\infty}\|v^{k}-w^{k}\|=0$$.

### Definition 2.2

()

The normal cone of K at $$v\in K$$, denoted by $$N_{K}(v)$$, is defined as follows:

$$N_{K}(v):=\bigl\{ d\in H: \langle d,y-v\rangle\leq0, \forall y\in K \bigr\} .$$

### Definition 2.3

([1, Definition 16.1])

The subdifferential set of a convex function c at a point x is defined as follows:

$$\partial c(x):=\bigl\{ \xi\in H: c(y)\geq c(x)+\langle\xi,y-x\rangle, \forall y \in H\bigr\} .$$

Denote by $$\iota_{K}$$ the indicator function of the set K, i.e.,

$$\iota_{K}(x)= \textstyle\begin{cases} 0 &x\in K, \\ +\infty &\text{otherwise}. \end{cases}$$

It is well known that $$\partial\iota_{K}(x)=N_{K}(x)$$ and $$(I+\lambda N_{K})^{-1}=P_{K}$$ for any $$\lambda>0$$.

Let $$f:H\times H\rightarrow\mathbb{R}$$ be a bifunction. We need the following assumptions on $$f(x,y)$$ for our algorithms and convergence:

(A1) For each $$x\in C$$, $$f(x,x)=0$$ and $$f(x,\cdot)$$ is lower semi-continuous and convex on C;

(A2) $$\partial_{2}^{\epsilon}f(x,x)$$ is nonempty for any $$\epsilon> 0$$ and $$x\in C$$ and is bounded on any bounded subset of C, where $$\partial_{2}^{\epsilon}f(x,x)$$ denotes the ϵ-subdifferential of the convex function $$f(x,\cdot)$$ at x, that is,

$$\partial_{2}^{\epsilon}f(x,x):=\bigl\{ \eta\in H_{1}: \langle\eta,y-x \rangle+f(x,x)\leq f(x,y)+\epsilon, \forall y\in C\bigr\} ;$$
(1)

(A3) f is pseudo-monotone on C with respect to every solution of the EP, that is, $$f(x,x^{*})\leq0$$ for any $$x\in C$$, $$x^{*}\in\operatorname{Sol}(EP)$$ and f satisfies the following condition, which is called the para-monotonicity property:

$$x^{*}\in\operatorname{Sol}(EP),\qquad y\in C, \qquad f \bigl(x^{*},y\bigr)=f\bigl(y,x^{*}\bigr)=0 \quad\Longrightarrow\quad y\in\operatorname{Sol}(EP);$$

(A4) For all $$x \in C$$, $$f(\cdot,x)$$ is weakly upper semi-continuous on C.

## A new algorithm for Problem 1.1 and its convergence analysis

In this section we give a new algorithm for Problem 1.1 and analyze its convergence.

Recall that the proximal mapping of the convex function g with $$\lambda> 0$$, denoted by $$\operatorname{prox}_{\lambda g}$$, is defined as the unique solution of the strongly convex programming problem:

$$\operatorname{prox}_{\lambda g}(u)=\mathop {\operatorname {argmin}}_{v\in H_{2}} \biggl\{ g(v)+ \frac{1}{2\lambda } \Vert v-u \Vert ^{2} \biggr\} .$$
(P(u))

The proximal mapping has some good properties, namely, it is firmly nonexpansive and $$\operatorname{prox}_{\lambda g}=P_{Q}$$ when $$g=\iota_{Q}$$ is the indicator function of Q (see, e.g., ).

For any $$\lambda> 0$$, we set

$$h(x):= \frac{1}{2} \bigl\Vert (I - \operatorname{prox}_{\lambda g})Ax \bigr\Vert ^{2}.$$

By using the necessary and sufficient optimality condition for convex programming, we can see that $$h(x) = 0$$ if and only if Ax solves $$P(u)$$ with $$u = Ax$$. Note that, even though g may not be differentiable, h is always differentiable and $$\nabla h(x) = A^{*}(I-\operatorname{prox} _{\lambda g})Ax$$ (see, for example, ).
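A minimal numerical sketch of h and of the direction $$A^{*}(I-\operatorname{prox}_{\lambda g})Ax$$, restricted to the special case in which g is the indicator function of a box Q, so that $$\operatorname{prox}_{\lambda g}=P_{Q}$$. The matrix `A`, the box and the test point are hypothetical; the central finite difference is only a sanity check of the gradient formula for this particular choice of g.

```python
import numpy as np

LO, HI = -1.0, 1.0   # hypothetical box Q = [-1, 1]^3

def prox(u):
    """prox of the indicator of Q, i.e. the projection P_Q."""
    return np.clip(u, LO, HI)

def h(x, A):
    """h(x) = 0.5 * ||(I - prox)(A x)||^2."""
    r = A @ x - prox(A @ x)
    return 0.5 * (r @ r)

def grad_h(x, A):
    """A^T (I - prox)(A x), the expression used for the gradient of h."""
    return A.T @ (A @ x - prox(A @ x))

A = np.array([[2.0, 0.0], [1.0, 3.0], [0.0, 1.0]])
x = np.array([1.5, -0.7])

# Central finite differences along the coordinate directions.
eps = 1e-6
fd = np.array([(h(x + eps * e, A) - h(x - eps * e, A)) / (2 * eps)
               for e in np.eye(2)])

print(h(x, A), grad_h(x, A), fd)
```

Here only the first component of `A @ x` leaves the box, so `h` measures half the squared distance of that component to Q.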

### Algorithm 3.1

Take positive parameters δ, ξ and the real sequences $$\{a_{k}\}$$, $$\{\delta_{k}\}$$, $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$, $$\{\rho_{k}\}$$ satisfying the following conditions: for each $$k\in\mathbb{N}$$,

\begin{aligned} &0< a\leq a_{k}\leq b< 1, \qquad 0< \xi\leq\rho_{k}\leq4-\xi, \\ &\delta_{k}>\delta>0, \qquad \beta_{k}>0,\qquad \epsilon_{k} \geq0, \\ &\lim_{k\rightarrow+\infty}a_{k}=\frac{1}{2}, \\ &\sum_{k=1}^{\infty}\frac{\beta_{k}}{\delta_{k}}=+\infty,\qquad \sum_{k=1}^{\infty}\beta_{k}^{2}< +\infty,\qquad \sum_{k=1}^{\infty}\frac{\beta_{k}\epsilon_{k}}{\delta_{k}}< + \infty . \end{aligned}

Step 1. Choose $$x^{1}\in C$$ and let $$k:=1$$.

Step k. Have $$x^{k}\in C$$ and take

$$\mu_{k}:= \textstyle\begin{cases} 0 &\text{if } \nabla h(x^{k})=0,\\ \rho_{k} \frac{h(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}} &\text{if } \nabla h(x^{k})\neq0, \end{cases}$$

then compute

$$y^{k}=x^{k}-\mu_{k}A^{*}(I-\operatorname{prox}_{\lambda g})Ax^{k}.$$

Take $${{\eta_{k}}}\in\partial_{2}^{\epsilon_{k}}f(y^{k},y^{k})$$ and define

$$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}},$$

where $$\gamma_{k}=\max\{\delta_{k},\|\eta_{k}\|\}$$. Compute

$$z^{k}=P_{C}\bigl(y^{k}- \alpha_{k}{{\eta_{k}}}\bigr).$$
(2)

Let

$$x^{k+1}=a_{k}x^{k}+(1-a_{k})z^{k}.$$
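The steps of Algorithm 3.1 can be sketched in Python as follows. This is one possible reading under explicitly hypothetical choices, not the authors' implementation: the parameters are fixed to $$a_{k}=\frac{1}{2}$$, $$\rho_{k}=1$$, $$\delta_{k}=1$$, $$\beta_{k}=\frac{1}{k}$$, $$\epsilon_{k}=0$$, and the toy data use the bifunction $$f(x,y)=\langle x-p,y-x\rangle$$ on a box C (so that $$\eta_{k}=y^{k}-p$$ and the EP solution is $$P_{C}(p)$$) and $$g(v)=\frac{1}{2}\Vert v-b\Vert^{2}$$ with $$b=AP_{C}(p)$$, which makes Problem 1.1 solvable.

```python
import numpy as np

def algorithm_3_1(x, A, prox, eta, project_C, iters=300):
    """Sketch of Algorithm 3.1 with the hypothetical parameter choices
    a_k = 1/2, rho_k = 1, delta_k = 1, beta_k = 1/k, epsilon_k = 0."""
    for k in range(1, iters + 1):
        a_k, rho_k, delta_k, beta_k = 0.5, 1.0, 1.0, 1.0 / k
        r = A @ x - prox(A @ x)              # (I - prox_{lambda g}) A x^k
        grad = A.T @ r                       # A^*(I - prox_{lambda g}) A x^k
        g2 = grad @ grad
        mu = 0.0 if g2 == 0.0 else rho_k * (0.5 * (r @ r)) / g2
        y = x - mu * grad                    # step towards argmin g (no projection)
        e = eta(y)                           # element of the subdifferential of f(y, .) at y
        alpha = beta_k / max(delta_k, np.linalg.norm(e))
        z = project_C(y - alpha * e)         # the only projection onto C
        x = a_k * x + (1.0 - a_k) * z
    return x

# Hypothetical toy data (all names below are illustrative assumptions).
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
p = np.array([1.5, 0.3])
project_C = lambda v: np.clip(v, 0.0, 1.0)       # C = [0, 1]^2
x_star = project_C(p)                            # solution of the EP for this f
b = A @ x_star                                   # argmin g = {b}
lam = 1.0
prox = lambda u: (u + lam * b) / (1.0 + lam)     # prox of lam * 0.5 * ||v - b||^2
eta = lambda y: y - p                            # gradient of f(y, .) at y

x = algorithm_3_1(np.zeros(2), A, prox, eta, project_C)
print(np.linalg.norm(x - x_star))
```

Note that `y` may leave C, which is why the bifunction in this sketch is defined on the whole space.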

### Remark 3.1

It is obvious that Algorithm 3.1 involves only one projection onto C per iteration. Note that, since $$y^{k}$$ need not belong to C, the function f must be defined on $$H_{1}\times H_{1}$$.

### Lemma 3.1

()

Let S be the set of solutions of Problem 1.1 and $$y \in S$$. If $$\nabla h(x^{k})\neq0$$, then

$$\bigl\Vert y^{k}-y \bigr\Vert ^{2} \leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}-\rho_{k}(4- \rho_{k})\frac{h^{2}(x ^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}}.$$

### Lemma 3.2

()

For each $$k\geq1$$, the following inequalities hold:

1. (i)

$$\alpha_{k}\|\eta_{k}\|\leq\beta_{k}$$;

2. (ii)

$$\|z^{k}-y^{k}\|\leq\beta_{k}$$.

### Lemma 3.3

Let $$y\in S$$. Then, for each $$k\geq1$$ such that $$\nabla h(x^{k}) \neq0$$, we have

\begin{aligned} \bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq{}& \bigl\Vert x^{k}-y \bigr\Vert ^{2}-(1-a_{k}) \rho_{k}(4-\rho_{k})\frac{h ^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}} \\ &{} +2(1-a_{k})\alpha_{k}f\bigl(y ^{k},y \bigr)+A_{k}, \end{aligned}

and, for each $$k\geq1$$ such that $$\nabla h(x^{k})=0$$, we have

$$\bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}+2(1-a_{k}) \alpha_{k}f\bigl(y^{k},y\bigr)+A _{k},$$

where $$A_{k}=2(1-a_{k})(\alpha_{k}\epsilon_{k}+\beta_{k}^{2})$$.

### Proof

By the definition of $$x^{k+1}$$, we have

\begin{aligned} \bigl\Vert x^{k+1}-y \bigr\Vert ^{2} &= \bigl\Vert a_{k}x^{k}+(1-a_{k})z^{k}-y \bigr\Vert ^{2} \\ &\leq a_{k} \bigl\Vert x^{k}-y \bigr\Vert ^{2}+(1-a_{k}) \bigl\Vert z^{k}-y \bigr\Vert ^{2}. \end{aligned}
(3)

Moreover, we have

\begin{aligned} \bigl\Vert z^{k}-y \bigr\Vert ^{2} &= \bigl\Vert y-y^{k}+y^{k}-z^{k} \bigr\Vert ^{2} \\ &= \bigl\Vert y^{k}-y \bigr\Vert ^{2}- \bigl\Vert y ^{k}-z^{k} \bigr\Vert ^{2}+2\bigl\langle y^{k}-z^{k},y-z^{k}\bigr\rangle \\ &\leq \bigl\Vert y^{k}-y \bigr\Vert ^{2}+2\bigl\langle y^{k}-z^{k},y-z^{k}\bigr\rangle . \end{aligned}

Since it follows from Lemma 2.4 and (2) that

$$\bigl\langle z^{k}-y^{k}+\alpha_{k} \eta_{k},x-z^{k}\bigr\rangle \geq0$$

for all $$x\in C$$, by taking $$x=y$$, we obtain

$$\bigl\langle z^{k}-y^{k}+\alpha_{k} \eta_{k},y-z^{k}\bigr\rangle \geq0 \quad\Longleftrightarrow \quad \bigl\langle \alpha_{k} \eta_{k},y-z^{k}\bigr\rangle \geq\bigl\langle y^{k}-z^{k},y-z ^{k}\bigr\rangle$$

and hence

\begin{aligned} \bigl\Vert z^{k}-y \bigr\Vert ^{2} &\leq \bigl\Vert y^{k}-y \bigr\Vert ^{2}+2 \bigl\langle \alpha_{k}\eta_{k},y-z ^{k}\bigr\rangle \\ &= \bigl\Vert y^{k}-y \bigr\Vert ^{2}+2\bigl\langle \alpha_{k}\eta_{k},y-y^{k} \bigr\rangle +2\bigl\langle \alpha_{k}\eta_{k},y^{k}-z^{k}\bigr\rangle . \end{aligned}
(4)

It follows from $$\eta_{k}\in\partial_{2}^{\epsilon_{k}}f(y^{k},y^{k})$$ that

\begin{aligned} &f\bigl(y^{k},y\bigr)-f\bigl(y^{k},y^{k} \bigr) \\ &\quad\geq\bigl\langle \eta_{k},y-y^{k}\bigr\rangle - \epsilon_{k} \quad \Longleftrightarrow \quad f\bigl(y^{k},y\bigr)+ \epsilon_{k}\geq\bigl\langle \eta_{k},y-y^{k}\bigr\rangle . \end{aligned}
(5)

On the other hand, from Lemma 3.2(i) and (ii), it follows that

$$\bigl\langle \alpha_{k}\eta_{k},y^{k}-z^{k} \bigr\rangle \leq\alpha_{k} \Vert \eta _{k} \Vert \bigl\Vert y^{k}-z^{k} \bigr\Vert \leq\beta_{k}^{2}.$$

From (4), (5) and $$\alpha_{k}>0$$, it follows that

$$\bigl\Vert z^{k}-y \bigr\Vert ^{2}\leq \bigl\Vert y^{k}-y \bigr\Vert ^{2}+2 \alpha_{k}f\bigl(y^{k},y\bigr)+2\alpha_{k} \epsilon_{k}+2\beta_{k}^{2}.$$
(6)

Now, we consider two cases:

Case 1. If $$\nabla h(x^{k})\neq0$$, then, thanks to Lemma 3.1, we have

$$\bigl\Vert y^{k}-y \bigr\Vert ^{2} \leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}-\rho_{k}(4- \rho_{k})\frac{h^{2}(x ^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}}.$$

Combining this inequality with (3) and (6), we obtain

\begin{aligned} \bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq{}& \bigl\Vert x^{k}-y \bigr\Vert ^{2}+2(1-a_{k}) \alpha_{k}f\bigl(y^{k},y\bigr) \\ &{} -(1-a_{k})\rho_{k}(4-\rho_{k}) \frac{h^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}}+A_{k}, \end{aligned}

where $$A_{k}=2(1-a_{k})(\alpha_{k}\epsilon_{k}+\beta_{k}^{2})$$.

Case 2. If $$\nabla h(x^{k})=0$$, then $$\mu_{k}=0$$ and, by the definition of $$y^{k}$$, we have $$y^{k}=x^{k}$$. By (6), we have

$$\bigl\Vert z^{k}-y \bigr\Vert ^{2}\leq \bigl\Vert y^{k}-y \bigr\Vert ^{2}+2\alpha_{k}f \bigl(y^{k},y\bigr)+2\alpha_{k} \epsilon_{k}+2 \beta_{k}^{2}.$$

Then we have

$$\bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}+2(1-a_{k}) \alpha_{k}f\bigl(y^{k},y\bigr)+A _{k},$$

where $$A_{k}=2(1-a_{k})(\alpha_{k}\epsilon_{k}+\beta_{k}^{2})$$. This completes the proof. □

### Theorem 3.1

Suppose that Problem 1.1 admits a solution. Then, under Assumptions (A1)–(A4), the sequence $$\{x^{k}\}$$ generated by Algorithm 3.1 strongly converges to a solution of Problem 1.1.

### Proof

Claim 1. The sequence $$\{\|x^{k}-y\|^{2}\}$$ is convergent for all $$y\in S$$. Indeed, let $$y\in S$$. Since $$y\in \operatorname{Sol}(E P)$$ and f is pseudomonotone on C with respect to every solution of $$(E P)$$, we have

$$f\bigl(y^{k},y\bigr)\leq0.$$

If $$\nabla h(x^{k})\neq0$$, then, since

$$\rho_{k}(4-\rho_{k})\frac{h^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}} \geq0,$$

it follows from Lemma 3.3 that

$$\bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}+A_{k},$$

where $$A_{k}=2(1-a_{k})(\alpha_{k}\epsilon_{k}+\beta_{k}^{2})$$. Since $$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}}$$ with $$\gamma_{k}=\max\{ \delta_{k},\|\eta_{k}\|\}$$,

$$\sum^{+\infty}_{k=1}\alpha_{k} \epsilon_{k}=\sum^{+\infty }_{k=1} \frac{ \beta_{k}}{\gamma_{k}}\epsilon_{k}\leq\sum^{+\infty}_{k=1} \frac{ \beta_{k}}{\delta_{k}}\epsilon_{k}< +\infty.$$

Note that $$\sum_{k=1}^{+\infty}\beta_{k}^{2}<+\infty$$ and $$0< a< a_{k}< b<1$$ and so we have

$$\sum^{+\infty}_{k=1}A_{k}< 2(1-a)\sum ^{+\infty}_{k=1}\bigl(\alpha_{k} \epsilon_{k}+\beta_{k}^{2}\bigr)< +\infty.$$

Now, using Lemma 2.6, we see that $$\{\|x^{k}-y\|^{2}\}$$ is convergent for all $$y\in S$$. Hence the sequence $$\{x^{k}\}$$ is bounded. Then, by Lemma 3.2, we can see that $$\{y^{k}\}$$ is bounded too.

Claim 2. $$\limsup_{k\rightarrow\infty}f(y^{k},y)=0$$ for all $$y\in S$$. By Lemma 3.3, for each $$k\geq1$$, we have

$$-2(1-a_{k})\alpha_{k}f\bigl(y^{k},y \bigr)\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}- \bigl\Vert x^{k+1}-y \bigr\Vert ^{2}+A _{k}.$$

Summing up both sides in the above inequality, we obtain

$$\sum_{k=1}^{\infty}-2(1-a_{k}) \alpha_{k}f\bigl(y^{k},y\bigr)< +\infty.$$

On the other hand, using Assumption (A2) and the fact that $$\{x^{k}\}$$ is bounded, we see that $$\{\|\eta_{k}\|\}$$ is bounded. Then there exists $$L>\delta$$ such that $$\|\eta_{k}\|\leq L$$ for each $$k\geq1$$. Therefore, we have

$$\frac{\gamma_{k}}{\delta_{k}}=\max \biggl\{ 1,\frac{ \Vert \eta_{k} \Vert }{ \delta_{k}} \biggr\} \leq \frac{L}{\delta}$$

and hence

$$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}}\geq\frac{\delta}{L}\frac{ \beta_{k}}{\delta_{k}}.$$

Since y is a solution, by pseudomonotonicity of f, we have $$-f(y^{k},y)\geq0$$, which together with $$0< a< a_{k}< b<1$$ implies

$$\sum_{k=1}^{\infty}(1-b)\frac{\beta_{k}}{\delta_{k}} \bigl[-f\bigl(y^{k},y\bigr)\bigr]< + \infty.$$

But, from $$\sum^{\infty}_{k=1}\frac{\beta_{k}}{\delta_{k}}=+\infty$$, it follows that

$$\limsup_{k\rightarrow+\infty}f\bigl(y^{k},y\bigr)=0$$

for all $$y\in S$$.

Claim 3. For any $$y\in S$$, suppose that $$\{y^{k_{j}} \}$$ is the subsequence of $$\{y^{k}\}$$ such that

$$\limsup_{k\rightarrow+\infty}f\bigl(y^{k},y\bigr)= \lim_{j\rightarrow+\infty }f\bigl(y^{k_{j}},y\bigr)$$
(7)

and $$y^{*}$$ is a weak cluster point of $$\{y^{k_{j}} \}$$. Then $$y^{*}$$ belongs to $$\operatorname{Sol}(EP)$$.

Without loss of generality, we can assume that $$\{y^{k_{j}}\}$$ weakly converges to $$y^{*}$$ as $$j\rightarrow\infty$$. Since $$f(\cdot,y)$$ is weakly upper semi-continuous by Assumption (A4), by Claim 2 and (7), we have

$$f\bigl(y^{*},y\bigr)\geq\limsup_{j\rightarrow+\infty}f \bigl(y^{k_{j}},y\bigr)=0.$$

Since $$y\in S$$ and f is pseudomonotone, we have $$f(y^{*},y)\leq0$$ and so $$f(y^{*},y)=0$$. Again, by pseudomonotonicity of f, $$f(y,y^{*}) \leq0$$ and hence $$f(y^{*},y)=f(y,y^{*})=0$$. Then, by paramonotonicity (Assumption (A3)), we can conclude that $$y^{*}$$ is also a solution of $$(EP)$$.

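Claim 4. Every weak cluster point $$\bar{x}$$ of the sequence $$\{x^{k}\}$$ satisfies $$\bar{x}\in C$$ and $$A\bar{x}\in \mathop {\operatorname {argmin}}g$$. Let $$\bar{x}$$ be a weak cluster point of $$\{x^{k}\}$$ and $$\{x^{k_{j}}\}$$ be a subsequence of $$\{x^{k}\}$$ weakly converging to $$\bar{x}$$. Since $$\{x^{k}\}\subset C$$ and C is closed and convex, hence weakly closed, we have $$\bar{x}\in C$$. Fix any $$z\in S$$. From Lemma 3.3 (applied with $$y=z$$) and $$f(y^{k},z)\leq0$$, if $$\nabla h(x^{k})\neq0$$, then we have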

$$(1-a_{k})\rho_{k}(4-\rho_{k})\frac{h^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}} \leq \bigl\Vert x^{k}-z \bigr\Vert ^{2}- \bigl\Vert x^{k+1}-z \bigr\Vert ^{2}+A_{k}.$$

If $$\nabla h(x^{k})=0$$, then we have

$$0\leq \bigl\Vert x^{k}-z \bigr\Vert ^{2}- \bigl\Vert x^{k+1}-z \bigr\Vert ^{2}+A_{k}.$$

Let $$N_{1}:=\{k:\nabla h(x^{k})\neq0\}$$. Summing up, we can write

$$\sum_{k\in N_{1}}(1-a_{k})\rho_{k}(4- \rho_{k})\frac{h^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}}\leq \bigl\Vert x^{1}-z \bigr\Vert ^{2}+\sum^{\infty}_{k=1}A_{k}< + \infty.$$

Combining this fact with the assumption $$\xi\leq\rho_{k}\leq4- \xi$$ (for some $$\xi>0$$) and $$0< a< a_{k}< b<1$$, we can conclude that

$$\sum_{k\in N_{1}}\frac{h^{2}(x^{k})}{ \Vert \nabla h(x^{k}) \Vert ^{2}}< +\infty .$$

Moreover, since $$\nabla h$$ is Lipschitz continuous with constant $$\|A\|^{2}$$ and $$\{x^{k}\}$$ is bounded, we see that $$\|\nabla h(x^{k})\|^{2}$$ is bounded. So, $$h(x^{k})\rightarrow0$$ as $$k\rightarrow\infty$$ with $$k\in N_{1}$$. Note that $$h(x^{k})=0$$ for $$k\notin N_{1}$$. Consequently, we have

$$\lim_{k\rightarrow+\infty}h\bigl(x^{k} \bigr)=0.$$
(8)

By the lower semi-continuity of h,

$$0\leq h(\bar{x})\leq\liminf_{j\rightarrow+\infty}h \bigl(x^{k_{j}}\bigr)= \lim_{k\rightarrow+\infty}h\bigl(x^{k} \bigr)=0,$$

which implies that $$A\bar{x}$$ is a fixed point of the proximal mapping of g. Thus $$A\bar{x}$$ is a minimizer of g. From (8) and the fact that $$\|\nabla h(x^{k})\|^{2}$$ is bounded, it follows that

$$\lim_{k\rightarrow+\infty}\mu_{k}=0,$$

which yields

$$\lim_{k\rightarrow+\infty} \bigl\Vert y^{k}-x^{k} \bigr\Vert =\lim_{k\rightarrow+ \infty}\mu_{k} \bigl\Vert A^{*}(I-\operatorname{prox}_{\lambda g}) \bigl(Ax^{k}\bigr) \bigr\Vert =0.$$

Thus $$\{y^{k_{j}}\}$$ weakly converges to $$\bar{x}$$.

Claim 5. $$\lim_{k\rightarrow+\infty}x^{k}= \lim_{k\rightarrow+\infty}y^{k}=\lim_{k\rightarrow+\infty}P(x^{k})=x ^{*}$$, where $$x^{*}$$ is a weak cluster point of the subsequence satisfying (7). From Claims 3 and 4, we can deduce that $$x^{*}$$ belongs to S. By Claim 1, we can assume that

$$\lim_{k\rightarrow+\infty} \bigl\Vert x^{k}-x^{*} \bigr\Vert =c< +\infty.$$

By Lemma 3.2, we have

\begin{aligned} \bigl\Vert z^{k}-x^{*} \bigr\Vert &\leq \bigl\Vert y^{k}-x^{*} \bigr\Vert + \bigl\Vert z^{k}-y^{k} \bigr\Vert \\ &\leq \bigl\Vert x^{k}-x ^{*} \bigr\Vert + \bigl\Vert x^{k}-y^{k} \bigr\Vert +\beta_{k}, \end{aligned}

which implies that

$$\limsup_{k\rightarrow+\infty} \bigl\Vert z^{k}-x^{*} \bigr\Vert \leq \limsup_{k\rightarrow+\infty}\bigl( \bigl\Vert x^{k}-x^{*} \bigr\Vert + \bigl\Vert x^{k}-y^{k} \bigr\Vert +\beta _{k}\bigr)=c.$$

On the other hand, we have

$$\lim_{k\rightarrow+\infty} \bigl\Vert a_{k}\bigl(x^{k}-x^{*} \bigr)+(1-a_{k}) \bigl(z^{k}-x ^{*}\bigr) \bigr\Vert =\lim_{k\rightarrow+\infty} \bigl\Vert x^{k+1}-x^{*} \bigr\Vert =c.$$

By applying Lemma 2.7 with $$v^{k}:=x^{k}-x^{*}$$, $$w^{k}:=z ^{k}-x^{*}$$, we obtain

$$\lim_{k\rightarrow+\infty} \bigl\Vert z^{k}-x^{k} \bigr\Vert =0.$$

Employing arguments similar to those used in the proof of Theorem 1 in , we have

$$\lim_{k\rightarrow+\infty}x^{k}=x^{*}.$$

This completes the proof. □

## Algorithms and convergence analysis

Yen et al. presented an application of Problem 1.1 to a model of electricity production, in which z denotes the quantity of the materials and $$g(z)$$ is the total environmental fee that companies have to pay for environmental pollution while using materials z for production. So, from $$x\in C$$, it follows that $$z=Ax\in\{z: z=Ax, x\in C\}$$.

However, in actual production, since the resources are limited, there are usually stricter constraints on the quantity of the materials such as $$z\in Q$$, where Q is a nonempty closed convex set of $$H_{2}$$. Therefore, it is necessary to replace the unconstrained convex optimization problem $$\min_{x\in H_{2}}g(x)$$ with the constrained convex optimization as follows:

$$\min_{x\in Q}g(x),$$
(9)

whose solution set is denoted by $$\operatorname{Sol}(Q,g)$$. By using (9), Problem 1.1 becomes the following problem:

### Problem 4.1

Find $$x^{*}\in C$$ such that $$f(x^{*},y)\geq0$$ for all $${y}\in C$$, $$Ax^{*}\in Q$$ and $$g(Ax^{*})\leq g(z)$$ for all $$z\in Q$$,

whose solution set is denoted by

$$\varGamma:=\varGamma(C,Q,f,g,A):=\bigl\{ z\in\operatorname{Sol}(EP):Az\in \operatorname{Sol}(Q,g)\bigr\} .$$

Throughout this paper, we assume $$\varGamma\neq\emptyset$$.

In this section, we discuss two cases, according to whether the function g is differentiable or non-differentiable. The corresponding algorithms and their convergence are provided next.

### The case when g is differentiable

We need to make the following assumption on the function g:

(B) g is differentiable and its gradient is L-Lipschitz continuous with $$L>0$$, i.e.,

$$\bigl\Vert \nabla g(x)-\nabla g(y) \bigr\Vert \leq L \Vert x-y \Vert$$

for all $$x,y\in H_{2}$$.

It is easy to verify that the constrained convex optimization problem (9) is equivalent to the following variational inequality problem:

$$\bigl\langle \nabla g\bigl(z^{*}\bigr),z-z^{*} \bigr\rangle \geq0 \quad\text{for all } z\in Q,$$
(10)

and the variational inequality problem (10) is equivalent to the following fixed point problem:

$$z^{*}=P_{Q}(I-\nu\nabla g)z^{*},$$
(11)

where $$\nu>0$$. So, the constrained optimization problem (9) and the fixed point problem (11) are equivalent. From the optimality condition of (9), we can also deduce the equivalence of problems (9) and (11) (see ). Next we construct an iterative algorithm based on this equivalence and Algorithm 3.1.

Firstly, define two functions

$$h(x):= \bigl\Vert \bigl(I-P_{Q}(I-\nu\nabla g) \bigr)Ax \bigr\Vert ^{2}$$
(12)

and

$$l(x):= \bigl\Vert A^{*}\bigl(I-P_{Q}(I-\nu\nabla g) \bigr)Ax \bigr\Vert ^{2},$$

where $$\nu\in(0,\frac{2}{L})$$.

### Algorithm 4.1

Take the real sequences $$\{a_{k}\}$$, $$\{\delta_{k}\}$$, $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$ and $$\{\rho_{k}\}$$ as in Algorithm 3.1.

Step 1. Choose $$x^{1}\in C$$ and let $$k:=1$$.

Step k. Have $$x^{k}\in C$$ and take

$$\mu_{k}:= \textstyle\begin{cases} 0 &\text{if } l(x^{k})=0,\\ \rho_{k}\frac{h(x^{k})}{l(x^{k})} &\text{if } l(x^{k})\neq0, \end{cases}$$

then compute

$$y^{k}=x^{k}-\mu_{k}A^{*} \bigl(I-P_{Q}(I-\nu\nabla g)\bigr) \bigl(Ax^{k}\bigr).$$
(13)

Take $${{\eta_{k}}}\in\partial_{2}^{\epsilon_{k}}f(y^{k},y^{k})$$ and define

$$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}},$$

where $$\gamma_{k}=\max\{\delta_{k},\|\eta_{k}\|\}$$. Compute

$$z^{k}=P_{C}\bigl(y^{k}-\alpha_{k}{{ \eta_{k}}}\bigr).$$

Let

$$x^{k+1}=a_{k}x^{k}+(1-a_{k})z^{k}.$$
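A sketch of Algorithm 4.1 on a toy instance is given below. Everything specific is an assumption made for illustration: the parameters $$a_{k}=\frac{1}{2}$$, $$\rho_{k}=1$$, $$\delta_{k}=1$$, $$\beta_{k}=\frac{1}{k}$$, $$\epsilon_{k}=0$$, the bifunction $$f(x,y)=\langle x-p,y-x\rangle$$ on a box C, the quadratic $$g(z)=\frac{1}{2}\Vert z-b\Vert^{2}$$ (so $$L=1$$), the box Q and $$\nu=\frac{1}{2}$$. The data are chosen so that $$AP_{C}(p)=P_{Q}(b)$$, which guarantees that the solution set is nonempty.

```python
import numpy as np

def algorithm_4_1(x, A, T, eta, project_C, iters=300):
    """Sketch of Algorithm 4.1 with T = P_Q(I - nu * grad g) and the hypothetical
    choices a_k = 1/2, rho_k = 1, delta_k = 1, beta_k = 1/k, epsilon_k = 0."""
    for k in range(1, iters + 1):
        a_k, rho_k, delta_k, beta_k = 0.5, 1.0, 1.0, 1.0 / k
        r = A @ x - T(A @ x)                 # (I - T) A x^k
        d = A.T @ r                          # A^*(I - T) A x^k
        h_val, l_val = r @ r, d @ d          # h(x^k) and l(x^k) as in (12)
        mu = 0.0 if l_val == 0.0 else rho_k * h_val / l_val
        y = x - mu * d                       # step (13)
        e = eta(y)
        alpha = beta_k / max(delta_k, np.linalg.norm(e))
        z = project_C(y - alpha * e)
        x = a_k * x + (1.0 - a_k) * z
    return x

# Hypothetical toy data (illustrative assumptions only).
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
p = np.array([1.5, 0.3])
project_C = lambda v: np.clip(v, 0.0, 1.0)             # C = [0, 1]^2
q_lo, q_hi = np.zeros(3), np.array([1.0, 2.0, 2.0])    # Q = [0,1] x [0,2] x [0,2]
b = np.array([1.4, 0.6, 1.3])                          # g(z) = 0.5 * ||z - b||^2
nu = 0.5                                               # nu in (0, 2 / L) with L = 1
T = lambda u: np.clip(u - nu * (u - b), q_lo, q_hi)    # P_Q(I - nu * grad g)
eta = lambda y: y - p

x_star = project_C(p)                                  # EP solution; A x_star = P_Q(b)
x = algorithm_4_1(np.array([0.9, 0.5]), A, T, eta, project_C)
print(np.linalg.norm(x - x_star))
```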

Now, we need the following lemmas to prove the convergence of Algorithm 4.1:

### Lemma 4.1

([8, Lemma 6.2])

Assume that the function g satisfies Assumption (B) and $$\nu\in(0,\frac{2}{L})$$. Let $$y\in\varGamma$$. If $$l(x^{k})\neq0$$, then it follows that

\begin{aligned} \bigl\Vert y^{k}-y \bigr\Vert ^{2} &\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}-\rho_{k}(1- \rho_{k})\frac{h^{2}(x ^{k})}{l(x^{k})}. \end{aligned}

### Proof

Let $$T=P_{Q}(I-\nu\nabla g)$$. Since $$y\in\varGamma$$, it follows from (11) that Ay is a fixed point of T. From the proof of [32, Theorem 4.1], it follows that T is $$\frac{2+\nu L}{4}$$-averaged and so it is nonexpansive. By (13) and Lemma 2.3(i), we have

\begin{aligned} \bigl\Vert y^{k}-y \bigr\Vert ^{2}\leq{}& \bigl\Vert x^{k}-y \bigr\Vert ^{2}+ \mu_{k}^{2} \bigl\Vert A^{*}(I-T) \bigl(Ax^{k}\bigr) \bigr\Vert ^{2} \\ &{} -2\mu_{k}\bigl\langle x^{k}-y, A^{*}(I-T) \bigl(Ax^{k}\bigr)\bigr\rangle . \end{aligned}
(14)

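By the nonexpansivity of T and the inequality for nonexpansive operators and their fixed points recalled in Sect. 2 (applied to $$Ax^{k}$$ and $$Ay\in\operatorname{Fix}(T)$$), we have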

\begin{aligned} &\bigl\langle x^{k}-y, A^{*}(I-T) \bigl(Ax^{k}\bigr)\bigr\rangle \\ &\quad=\bigl\langle A\bigl(x^{k}-y\bigr), (I-T) \bigl(Ax ^{k} \bigr)\bigr\rangle \\ &\quad=\bigl\langle A\bigl(x^{k}-y\bigr)-(I-T) \bigl(Ax^{k} \bigr)+(I-T) \bigl(Ax^{k}\bigr), (I-T) \bigl(Ax ^{k}\bigr) \bigr\rangle \\ &\quad=\bigl\langle T\bigl(Ax^{k}\bigr)-Ay, Ax^{k}-T \bigl(Ax^{k}\bigr)\bigr\rangle + \bigl\Vert (I-T) \bigl(Ax ^{k}\bigr) \bigr\Vert ^{2} \\ &\quad\geq\frac{1}{2} \bigl\Vert (I-T) \bigl(Ax^{k}\bigr) \bigr\Vert ^{2}. \end{aligned}
(15)

Combining (14) and (15) and using the definitions of $$h(x)$$ and $$l(x)$$, we obtain

\begin{aligned} \bigl\Vert y^{k}-y \bigr\Vert ^{2} &\leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}+\mu_{k}^{2}l \bigl(x^{k}\bigr)-\mu_{k}h\bigl(x ^{k}\bigr) \\ &= \bigl\Vert x^{k}-y \bigr\Vert ^{2}- \rho_{k}(1-\rho_{k})\frac{h^{2}(x^{k})}{l(x ^{k})}. \end{aligned}

This completes the proof. □

### Remark 4.1

From (15), it follows that $$l(x)=0$$ implies $$h(x)=0$$.

Using Lemma 4.1 and following the lines of the proof of Lemma 3.3, we have the following:

### Lemma 4.2

Let $$y\in\varGamma$$. Then, for each $$k\geq1$$ such that $$l(x^{k}) \neq0$$, we have

\begin{aligned} \bigl\Vert x^{k+1}-y \bigr\Vert ^{2}\leq{}& \bigl\Vert x^{k}-y \bigr\Vert ^{2}-(1-a_{k})\rho_{k}(1- \rho_{k})\frac{h ^{2}(x^{k})}{l(x^{k})} \\ &{} +2(1-a_{k})\alpha_{k}f\bigl(y^{k},y \bigr)+A_{k} \end{aligned}

and, for each $$k\geq1$$ such that $$l(x^{k})=0$$, we have

$$\bigl\Vert x^{k+1}-y \bigr\Vert ^{2} \leq \bigl\Vert x^{k}-y \bigr\Vert ^{2}+2(1-a_{k}) \alpha_{k}f\bigl(y^{k},y\bigr)+A _{k},$$

where $$A_{k}=2(1-a_{k})(\alpha_{k}\epsilon_{k}+\beta_{k}^{2})$$.

Next we establish the convergence of Algorithm 4.1.

### Theorem 4.1

Under Assumptions (A1)–(A4) and (B), the sequence $$\{x^{k}\}$$ generated by Algorithm 4.1 strongly converges to a solution of Problem 4.1.

The proof of Theorem 4.1 is similar to that of Theorem 3.1, so we omit it.

The only point to note is that $$h(\bar{x})=0$$ implies that $$A\bar{x}$$ is a fixed point of $$P_{Q}(I-\nu\nabla g)$$. Thus $$A\bar{x}$$ is a solution of (9).

### The case when g is non-differentiable

Let $$g:Q\rightarrow\mathbb{R}\cup\{+\infty\}$$ be a proper convex lower semi-continuous function. Denote by

$$g_{\lambda}(z):=\min_{u\in Q}\biggl\{ g(u)+ \frac{1}{2\lambda} \Vert u-z \Vert ^{2}\biggr\}$$
(16)

the Moreau–Yosida approximate of the function g with the parameter λ. It is easy to see that the solution of (16) converges to that of $$\min_{x\in Q}g(x)$$ as $$\lambda\rightarrow\infty$$.

For the mapping defined in (16), we have the following result.

### Lemma 4.3

The constrained optimization problem

$$\min_{x\in Q}g_{\lambda}(x)$$
(17)

is equivalent to the fixed point formulation

$$x^{*}=P_{Q}\bigl(x^{*}-\nu(I- \operatorname{prox}_{\lambda g})x^{*}\bigr),$$
(18)

where $$\nu\in(0,+\infty)$$.

### Proof

It is well known that problem (17) is equivalent to the following problem:

$$\min_{x\in H_{2}}\bigl\{ \iota_{Q}(x)+g_{\lambda}(x) \bigr\} .$$
(19)

Note that the differentiability of the Yosida-approximate $$g_{\lambda }$$ (see, for instance, ) secures the additivity of the subdifferentials, and so we can write

$$\partial\bigl(\iota_{Q}(x)+g_{\lambda}(x)\bigr)=\partial \iota_{Q}(x)+\frac{I-\operatorname{prox} _{\lambda g}}{\lambda}(x).$$

The optimality condition of (19) can be then written as follows:

$$0\in\lambda\partial\iota_{Q}(x)+(I-\operatorname{prox}_{\lambda g}) (x),$$
(20)

where the subdifferential of $$\iota_{Q}$$ at x is the normal cone $$N_{Q}(x)$$. The inclusion (20) in turn yields (18). This completes the proof. □
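The fixed point formulation (18) can be checked numerically in a simple case. The sketch below assumes $$H_{2}=\mathbb{R}$$, $$Q=[1,3]$$, $$g(u)=|u|$$ (taken finite on the whole real line) and $$\lambda=0.5$$; these choices and the helper names are hypothetical and serve only as an illustration. Since $$g_{\lambda}$$ is increasing on $$[1,3]$$ in this case, the minimizer of (17) is $$x^{*}=1$$, and the code verifies that it satisfies (18) for several values of ν while a non-optimal point of Q does not.

```python
import numpy as np

def prox_abs(z, lam):
    """Proximal map of lam * |.| on the real line (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def fixed_point_map(x, nu, lam, lo, hi):
    """Right-hand side of (18): P_Q(x - nu (I - prox_{lam g}) x), Q = [lo, hi]."""
    return np.clip(x - nu * (x - prox_abs(x, lam)), lo, hi)

lam, lo, hi = 0.5, 1.0, 3.0
x_star = 1.0  # minimizer of g_lam over Q = [1, 3] in this toy case

# x_star should be a fixed point for every nu > 0.
residuals = [
    abs(fixed_point_map(x_star, nu, lam, lo, hi) - x_star)
    for nu in (0.1, 1.0, 5.0)
]

# A point of Q that is not a minimizer should be moved by the map.
moved = fixed_point_map(2.0, 1.0, lam, lo, hi)
```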

Set

$$h(x)= \bigl\Vert \bigl(I-P_{Q}\bigl(I-\nu(I-\operatorname{prox}_{\lambda g}) \bigr)\bigr)Ax \bigr\Vert ^{2}$$

and

$$l(x)= \bigl\Vert A^{*}\bigl(I-P_{Q}\bigl(I-\nu(I- \operatorname{prox}_{\lambda g})\bigr)\bigr)Ax \bigr\Vert ^{2}.$$

Similar to Algorithm 3.1, using Lemma 4.3, we introduce the following algorithm:

### Algorithm 4.2

Take the real sequences $$\{a_{k}\}$$, $$\{\delta_{k}\}$$, $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$ and $$\{\rho_{k}\}$$ as in Algorithm 3.1. Take a positive parameter ν.

Step 1. Choose $$x^{1}\in C$$ and let $$k:=1$$.

Step k. Have $$x^{k}\in C$$ and take

$$\mu_{k}:= \textstyle\begin{cases} 0 & \text{if } l(x^{k})=0,\\ \rho_{k}\frac{h(x^{k})}{l(x^{k})} &\text{if } l (x^{k})\neq0, \end{cases}$$

then compute

$$y^{k}=x^{k}-\mu_{k} \bigl(A^{*}\bigl(I-P_{Q}\bigl(I-\nu(I-\operatorname{prox}_{\lambda g}) \bigr)\bigr) \bigl(Ax^{k}\bigr)\bigr).$$
(21)

Take $${{\eta_{k}}}\in\partial_{2}^{\epsilon_{k}}f(y^{k},y^{k})$$ and define

$$\alpha_{k}=\frac{\beta_{k}}{\gamma_{k}},$$

where $$\gamma_{k}=\max\{\delta_{k},\|\eta_{k}\|\}$$. Compute

$$z^{k}=P_{C}\bigl(y^{k}-\alpha_{k}{{ \eta_{k}}}\bigr).$$

Let

$$x^{k+1}=a_{k}x^{k}+(1-a_{k})z^{k}.$$
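The steps of Algorithm 4.2 can be sketched in finite dimensions as follows. This is only an illustration under stated assumptions, not the authors' implementation: the bifunction is taken as $$f(x,y)=\langle F(x),y-x\rangle$$ so that $$\eta_{k}=F(y^{k})$$ with $$\epsilon_{k}=0$$, the constants $$a_{k}=\rho_{k}=0.5$$ are assumed choices, and the toy data (A, C, Q, g equal to the $$\ell_{1}$$-norm, $$F(y)=y$$) are hypothetical. For this toy problem the point $$x=0$$ is an equilibrium whose image $$Ax=0$$ minimizes $$g_{\lambda}$$ over Q.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def algorithm_4_2(A, F, proj_C, proj_Q, prox, x1, nu=1.0, iters=200):
    """Sketch of Algorithm 4.2 for f(x, y) = <F(x), y - x> with eps_k = 0."""
    x = np.asarray(x1, dtype=float)
    for k in range(1, iters + 1):
        a_k, rho_k = 0.5, 0.5                  # assumed constant parameters
        beta_k, delta_k = 4.0 / (k + 1), 3.0   # choices used in the examples
        Ax = A @ x
        # Residual (I - P_Q(I - nu (I - prox_{lam g}))) A x^k.
        r = Ax - proj_Q(Ax - nu * (Ax - prox(Ax)))
        d = A.T @ r
        h, l = r @ r, d @ d                    # h(x^k) and l(x^k)
        mu_k = 0.0 if l == 0.0 else rho_k * h / l
        y = x - mu_k * d                       # formula (21)
        eta = F(y)                             # element of the diagonal subdifferential
        gamma_k = max(delta_k, np.linalg.norm(eta))
        alpha_k = beta_k / gamma_k
        z = proj_C(y - alpha_k * eta)
        x = a_k * x + (1.0 - a_k) * z
    return x

# Hypothetical toy data: C = [0, 1]^2, Q = [-1, 1]^2, g = ||.||_1, lam = 0.5.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
lam = 0.5
x_final = algorithm_4_2(
    A,
    F=lambda y: y,
    proj_C=lambda v: np.clip(v, 0.0, 1.0),
    proj_Q=lambda v: np.clip(v, -1.0, 1.0),
    prox=lambda v: soft_threshold(v, lam),
    x1=np.array([1.0, 1.0]),
)
```

The iterates stay in C because each $$x^{k+1}$$ is a convex combination of $$x^{k}\in C$$ and $$z^{k}\in C$$.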

### Remark 4.2

Let $$\nu=1$$ in Algorithm 4.2, then formula (21) becomes

$$y^{k}=x^{k}-\mu_{k}\bigl(A^{*}(I-P_{Q} \circ\operatorname{prox}_{\lambda g}) \bigl(Ax^{k}\bigr)\bigr),$$

which yields

$$y^{k}=x^{k}-\mu_{k}A^{*}(I- \operatorname{prox}_{\lambda g}) \bigl(Ax^{k}\bigr),$$

when $$Q=H_{2}$$. So, Algorithm 3.1 is a special case of Algorithm 4.2.

We need the following lemmas for the proof of the convergence of Algorithm 4.2.

### Lemma 4.4

Let $$\nu\in(0,1]$$. Then the operator $$P_{Q}(I-\nu(I-\operatorname{prox}_{\lambda g}))$$ is nonexpansive.

### Proof

By the fact that $$\operatorname{prox}_{\lambda g}$$ is firmly nonexpansive and Lemma 2.1, $$I-\operatorname{prox}_{\lambda g}$$ and $$\nu(I-\operatorname{prox}_{\lambda g})$$ are also firmly nonexpansive. So, using Lemma 2.1 again, $$I-\nu(I-\operatorname{prox}_{\lambda g})$$ is firmly nonexpansive. Thus, from Lemma 2.2, it follows that $$P_{Q}(I-\nu (I-\operatorname{prox} _{\lambda g}))$$ is $$\frac{3}{4}$$-averaged and hence nonexpansive. This completes the proof. □
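As a numerical sanity check of Lemma 4.4 (not a substitute for the proof), the following sketch samples random pairs of points and records the largest observed ratio $$\|Tu-Tv\|/\|u-v\|$$ for $$T=P_{Q}(I-\nu(I-\operatorname{prox}_{\lambda g}))$$. The setting is an assumption made for illustration: $$H_{2}=\mathbb{R}^{5}$$, Q is the box $$[-1,1]^{5}$$, g is the $$\ell_{1}$$-norm, $$\lambda=0.7$$ and $$\nu=0.9$$.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, nu = 0.7, 0.9  # nu is taken in (0, 1], as required by Lemma 4.4

def prox_l1(v):
    """Proximal map of lam * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def proj_box(v):
    """Projection onto the hypothetical set Q = [-1, 1]^5."""
    return np.clip(v, -1.0, 1.0)

def T(v):
    """The operator P_Q(I - nu (I - prox_{lam g})) of Lemma 4.4."""
    return proj_box(v - nu * (v - prox_l1(v)))

ratios = []
for _ in range(1000):
    u = 3.0 * rng.normal(size=5)
    v = 3.0 * rng.normal(size=5)
    ratios.append(np.linalg.norm(T(u) - T(v)) / np.linalg.norm(u - v))

max_ratio = max(ratios)  # should not exceed 1 if T is nonexpansive
```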

Using Lemma 4.4 and following the proof of Theorem 4.1, we obtain the convergence result of Algorithm 4.2.

### Theorem 4.2

Let $$\nu\in(0,1]$$. Then, under Assumptions (A1)–(A4), the sequence $$\{x^{k}\}$$ generated by Algorithm 4.2 strongly converges to a solution of Problem 4.1.

The proof of Theorem 4.2 is similar to that of Theorem 3.1, so here we omit it.

One point to note in the proof of Theorem 4.2 is that $$h(\bar{x})=0$$ implies that $$A\bar{x}$$ is a fixed point of $$P_{Q}(I-\nu(I-\operatorname{prox}_{\lambda g}))$$. Thus, by Lemma 4.3, $$A\bar{x}$$ is a solution of (17).

## Numerical examples

In this section, we provide two numerical examples to compare different algorithms. All programs are written in Matlab version 7.0 and performed on a desktop PC with Intel(R) Core(TM) i5-4200U CPU @ 2.30 GHz, RAM 4.00 GB.

### Example 5.1

First, we consider an equilibrium-optimization model which was investigated by Yen et al. [37]. This model can be regarded as an extension of a Nash–Cournot oligopolistic equilibrium model in electricity markets. The latter model has been investigated in some research papers (see, for example, [10, 27]).

In this equilibrium model, it is assumed that there are n companies. Let x denote the vector whose entry $$x_{i}$$ stands for the power generated by company i. Following Contreras et al. [10], we suppose that the price $$p_{i}(s)$$ is a decreasing affine function of s with $$s = \sum_{i=1}^{n} x_{i}$$, that is,

$$p_{i}(s) = \alpha-\beta_{i} s.$$

Then the profit made by company i is given by

$$f_{i}(x) = p_{i}(s)x_{i}-c_{i}(x_{i}),$$

where $$c_{i}(x_{i})$$ is the cost for generating $$x_{i}$$ by the company i.

Suppose that $$C_{i}$$ is the strategy set of company i, that is, condition $$x_{i}\in C_{i}$$ must be satisfied for each i. Then the strategy set of the model is $$C:=C_{1}\times C_{2}\times\cdots \times C_{n}$$.

In fact, each company seeks to maximize its profit by choosing its production level under the presumption that the production levels of the other companies are parametric inputs. A commonly used approach to this model is based on the famous Nash equilibrium concept.

Now, we recall that a point $$x^{*}\in C = C_{1}\times C_{2}\times \cdots\times C_{n}$$ is an equilibrium point of the model if

$$f_{i}\bigl(x^{*}\bigr)\geq f_{i} \bigl(x^{*}[x_{i}]\bigr)$$

for all $$x_{i}\in C_{i}$$ and $$i=1,2,\dots,n$$, where $$x^{*}[x_{i}]$$ stands for the vector obtained from $$x^{*}$$ by replacing $$x_{i}^{*}$$ with $$x_{i}$$. By taking

$$f(x,y):=\varPsi(x,y)-\varPsi(x,x)$$

with

$$\varPsi(x,y):=-\sum_{i=1}^{n}f_{i} \bigl(x[y_{i}]\bigr),$$
(22)

the problem of finding a Nash equilibrium point of the model can be formulated as follows:

$$\text{Find } x^{*}\in C \text{ such that } f\bigl(x^{*},x\bigr)\geq0 \text{ for all } x\in C.$$
(EP)

In [37], Yen et al. extended this equilibrium model by additionally assuming that the companies use some materials to produce electricity.

Let $$a_{l,i}$$ denote the quantity of material l $$(l = 1,\dots, m)$$ for producing one unit of electricity by company i $$(i=1, \dots,n)$$. Let A be the matrix whose entries are $$a_{l,i}$$. Then entry l of the vector Ax is the quantity of material l for producing x. Using materials for production may cause environmental pollution, for which companies have to pay a fee. Suppose that $$g(Ax)$$ is the total environmental fee for producing x.

The task now is to find a production $$x^{*}$$ such that it is a Nash equilibrium point with a minimum environmental fee, while the quantity of the materials satisfies constraint Q. This problem can be formulated as the split feasibility problem of the following form:

\begin{aligned} &\text{Find } x^{*}\in C \text{ such that } f\bigl(x^{*},x\bigr)\geq0 \text{ for all } x\in C \\ &\quad\text{and } g \bigl(Ax^{*}\bigr)\leq g(Ax) \text{ for all } Ax\in Q. \end{aligned}
(SEP)

Suppose that, for every i, the production cost $$c_{i}$$ and the environmental fee g are increasing convex functions. The convexity assumption here means that both the cost and the fee for producing a unit of product increase as the quantity of the product gets larger.

Under this convexity assumption, it is not hard to see (see also Quoc et al. [27]) that problem (EP) with f given by (22) can be formulated as follows:

$$\text{Find } x^{*}\in C \text{ such that } f\bigl(x^{*},x\bigr):=\bigl\langle \tilde{B}_{1}x^{*}-\bar{a}, x-x^{*}\bigr\rangle + \varphi(x)-\varphi\bigl(x^{*}\bigr)\geq0 \text{ for all } x\in C,$$
(23)

where

\begin{aligned} &\bar{a}:=(\alpha,\alpha,\dots,\alpha)^{T}, \\ &B_{1}:= \begin{pmatrix} b_{1} & 0 & 0 & \dots & 0 \\ 0 & b_{2} & 0 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & b_{n} \end{pmatrix},\qquad \tilde{B}_{1}:= \begin{pmatrix} 0 & b_{1} & b_{1} & \dots & b_{1} \\ b_{2} & 0 & b_{2} & \dots & b_{2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{n} & b_{n} & b_{n} & \dots & 0 \end{pmatrix}, \\ &\varphi(x):=x^{T}B_{1}x+\sum_{i=1}^{n}c_{i}(x_{i}). \end{aligned}
(24)
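The passage from (22) to the closed form (23) and (24) can be checked numerically. The sketch below builds f once from the profit functions, via $$\varPsi(x,y)-\varPsi(x,x)$$, and once from the closed form, and compares the two values at random points. It assumes quadratic costs $$c_{i}(t)=\frac{1}{2}p_{i}t^{2}+q_{i}t$$ and randomly drawn parameters, and it writes $$b_{i}$$ for the slope of the price function $$p_{i}(s)$$; the variable names are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 4, 0.5
b = rng.uniform(0.1, 1.0, n)   # slopes of the affine price functions
p = rng.uniform(1.0, 3.0, n)   # quadratic cost coefficients (assumed form)
q = rng.uniform(1.0, 3.0, n)   # linear cost coefficients (assumed form)

def cost(i, t):
    return 0.5 * p[i] * t ** 2 + q[i] * t

def profit(i, x):
    """f_i(x) = p_i(s) x_i - c_i(x_i) with s = sum of all entries of x."""
    return (alpha - b[i] * x.sum()) * x[i] - cost(i, x[i])

def Psi(x, y):
    """Psi(x, y) = - sum_i f_i(x[y_i]) as in (22)."""
    total = 0.0
    for i in range(n):
        xi = x.copy()
        xi[i] = y[i]           # x[y_i]: replace entry i of x by y_i
        total -= profit(i, xi)
    return total

B1 = np.diag(b)
B1_tilde = np.outer(b, np.ones(n)) - np.diag(b)  # b_i off the diagonal of row i
a_bar = alpha * np.ones(n)

def phi(x):
    return x @ B1 @ x + sum(cost(i, x[i]) for i in range(n))

def f_closed_form(x, y):
    """Bifunction of (23): <B1_tilde x - a_bar, y - x> + phi(y) - phi(x)."""
    return (B1_tilde @ x - a_bar) @ (y - x) + phi(y) - phi(x)

x = rng.uniform(0.0, 2.0, n)
y = rng.uniform(0.0, 2.0, n)
f_from_profits = Psi(x, y) - Psi(x, x)
f_from_formula = f_closed_form(x, y)
```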

Note that, when $$c_{i}$$ is differentiable and convex for each i, problem (23) is equivalent to the following variational inequality problem:

$$\text{Find } x^{*}\in C \text{ such that } \bigl\langle \tilde{B}_{1}x^{*}-\bar{a}+ \nabla\varphi\bigl(x^{*}\bigr), x-x^{*} \bigr\rangle \geq0 \text{ for all } x\in C.$$

We tested the proposed algorithm with the cost function given by

$$c_{i}(x_{i})=\frac{1}{2}p_{i}x_{i}^{2}+q_{i}x_{i},$$
(25)

where $$p_{i}\geq0$$. In [37], Yen et al. showed that the function $$f(x,y)$$ defined by (23), (24) and (25) satisfies Assumptions (A1), (A2) and (A4).

In [37], the authors used $$g(z)$$ to denote the total environmental fee. This is unreasonable for two reasons. First, the total environmental fee should be included in the cost, that is, it should be a part of $$c_{i}(x_{i})$$. Second, the companies are supposed to behave as players in an oligopolistic market, yet at the same time they are subordinated to a centralized planning decision that minimizes the total environmental fee for the whole system. That is, the model is not consistent with the behavior of the real system.

It may be more reasonable to let $$g(z)$$ represent the restriction on the emission of contaminants. To protect the environment, governments generally adopt policies to restrict emissions of contaminants.

Assume that the production of electricity generates p contaminants and that governments require the quantities of contaminants generated by producing one unit of electricity to lie in a given region. We use a set $$K\subset\mathbb{R}^{p}$$ to denote this region.

Let $$b_{k,l}$$ denote the quantity of contaminant k $$(k = 1, \dots, p)$$ generated by consuming one unit of material l $$(l = 1,\dots, m)$$. Let B be the matrix whose entries are $$b_{k,l}$$. Then entry k of the vector Bz is the quantity of contaminant k generated by consuming the material quantities $$z_{l}$$ $$(l = 1,\dots, m)$$. So, the quantity of contaminant k $$(k = 1,\dots, p)$$ for producing one unit of electricity is entry k of $$BAx$$, and $$BAx$$ should be in the set K, i.e., $$BAx\in K$$. Letting $$z=Ax$$, we get $$Bz\in K$$.

Therefore we define function $$g(z)$$ as follows:

$$g(z)=\frac{1}{2} \bigl\Vert Bz-P_{K}(Bz) \bigr\Vert ^{2},$$
(26)

which is differentiable and $$\nabla g(z)=B^{T}(I-P_{K})(Bz)$$ (see, e.g., ).
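The gradient formula for (26) can be verified numerically with central finite differences. The following sketch is an illustration under assumptions: K is taken to be the box $$[0,1]^{p}$$ with $$p=3$$ and $$m=5$$, and B is drawn at random; none of these values come from the paper, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
p_dim, m = 3, 5
B = rng.uniform(0.0, 1.0, (p_dim, m))

def proj_K(w):
    """Projection onto the hypothetical region K = [0, 1]^p."""
    return np.clip(w, 0.0, 1.0)

def g(z):
    """g(z) = 0.5 * ||Bz - P_K(Bz)||^2 as in (26)."""
    w = B @ z
    return 0.5 * np.sum((w - proj_K(w)) ** 2)

def grad_g(z):
    """Claimed gradient B^T (I - P_K)(Bz)."""
    w = B @ z
    return B.T @ (w - proj_K(w))

z = rng.uniform(0.0, 2.0, m)
eps = 1e-6
# Central finite differences along each coordinate direction.
finite_diff = np.array(
    [(g(z + eps * e) - g(z - eps * e)) / (2.0 * eps) for e in np.eye(m)]
)
grad_error = np.max(np.abs(finite_diff - grad_g(z)))
```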

Take the sequences $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$, $$\{\delta_{k} \}$$ of the parameters as follows:

$$\beta_{k}=\frac{4}{k+1}, \qquad \epsilon_{k}=0,\qquad \delta_{k}=3, \qquad \gamma_{k}=\max\bigl\{ 3, \Vert {{\eta_{k}}} \Vert \bigr\}$$

for each $$k\geq1$$ and take $$\nu=\frac{1.99}{\|B\|}$$. The entries of the matrix A were randomly generated in the interval [0, 5]. In the bifunction $$f(x,y)$$ defined by (23), (24) and (25), we take $$\alpha=0.5$$, and the parameters $$b_{i}$$, $$p_{i}$$ and $$q_{i}$$ for each $$i = 1,\dots, n$$ were generated randomly in the intervals $$(0, 1]$$, $$[1, 3]$$ and $$[1, 3]$$, respectively. In the function $$g(z)$$, we take $$B\in\mathbb{R}^{p\times m}$$, and its entries are generated randomly in $$(0,1)$$.

Since function $$g(z)$$ is differentiable, we use Algorithm 4.1 to solve Problem 4.1 and compare it with Algorithms 1.1 and 3.1. In Algorithms 1.1 and 3.1 we substitute $$\operatorname{prox}_{\lambda g}$$ with $$I-\nu\nabla g$$ and do not consider the constraint set Q.

The computational results are shown in Figs. 1 and 2. The horizontal axes show the iteration number k, and the vertical axes show error1$$(k):= \|x^{k}-x ^{k-1}\|$$ and error2$$(k):= \|Ax^{k}-P_{Q}(Ay^{k})\|$$, respectively. We solve the model with $$m=15$$ materials and $$n=10$$ companies.

From Figs. 1 and 2, we have two conclusions as follows:

(a) The “error1” of Algorithm 4.1 is smaller than that of Algorithms 1.1 and 3.1 and the “error1” of Algorithm 3.1 is slightly smaller than that of Algorithm 1.1.

(b) The “error2” of Algorithm 4.1 decreases with the iteration number k, while the “error2” of Algorithms 1.1 and 3.1 increases with the iteration number k. The “error2” of Algorithm 4.1 is smaller than those of Algorithms 1.1 and 3.1.

Next we give a numerical procedure in an infinite-dimensional space and compare Algorithm 4.1 with a numerical algorithm which is based on the Halpern modification of [8, Algorithm 6.1] as follows:

### Algorithm 5.1

$$x^{k+1}=\tau_{k}x^{1}+(1-\tau_{k})U \bigl(x^{k}+\gamma A^{*}(T-I) \bigl(Ax^{k}\bigr) \bigr),$$

where $$T:=P_{Q}(I-\lambda\nabla g)$$, $$U:=P_{C}(I-\lambda f)$$, $$\gamma\in(0,1/L)$$, and L is the spectral radius of the operator $$A^{*}A$$, denoted by $$\rho(A^{*}A)$$. The parameter λ depends on the constants of the inverse strong monotonicity of g and f.

In line with the convergence conditions for Halpern-type algorithms, we assume that $$\lim_{k\rightarrow\infty} \tau_{k}=0$$ and $$\sum_{k=1}^{\infty}\tau_{k}=\infty$$.
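The iteration of Algorithm 5.1 can be sketched in finite dimensions as follows. This is a hedged illustration, not the setting of the experiments: it assumes $$\tau_{k}=\frac{1}{k+1}$$, $$\lambda=0.5$$, $$g(u)=\frac{1}{2}\|u\|^{2}$$, the operator $$F(x)=x$$ in place of f inside U, and box sets C and Q, all of which are hypothetical choices. For these data the unique solution is $$x=0$$.

```python
import numpy as np

def algorithm_5_1(A, T, U, x1, gamma, iters=500):
    """Sketch of the Halpern-type iteration of Algorithm 5.1 with tau_k = 1/(k+1)."""
    x1 = np.asarray(x1, dtype=float)
    x = x1.copy()
    for k in range(1, iters + 1):
        tau_k = 1.0 / (k + 1)
        Ax = A @ x
        # x^{k+1} = tau_k x^1 + (1 - tau_k) U(x^k + gamma A^*(T - I)(A x^k))
        x = tau_k * x1 + (1.0 - tau_k) * U(x + gamma * (A.T @ (T(Ax) - Ax)))
    return x

# Hypothetical toy data: C = [0, 1]^2, Q = [-1, 1]^2.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
lam = 0.5

def T(u):
    """P_Q(I - lam * grad g) with g(u) = 0.5 * ||u||^2, so grad g(u) = u."""
    return np.clip(u - lam * u, -1.0, 1.0)

def U(x):
    """P_C(I - lam * F) with the assumed operator F(x) = x."""
    return np.clip(x - lam * x, 0.0, 1.0)

# gamma in (0, 1/L), where L is the spectral radius of A^T A.
gamma = 0.9 / np.max(np.linalg.eigvalsh(A.T @ A))
x_halpern = algorithm_5_1(A, T, U, np.array([1.0, 1.0]), gamma)
```

Because of the anchor term $$\tau_{k}x^{1}$$, the distance to the solution in this toy case decays roughly like $$\tau_{k}$$, so the sketch is expected to approach the solution slowly.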

### Example 5.2

Suppose that $$H = L^{2}([0,1])$$ with norm $$\|x\|:= (\int_{0}^{1} |x(t)|^{2}\,dt )^{\frac{1}{2}}$$ and inner product $$\langle x,y\rangle:= \int_{0}^{1} x(t)y(t)\,dt$$, $$x,y \in H$$. Let $$C:=\{x \in H:\|x\|\leq1\}$$ be the unit ball, $$Q:=\{x\in H:\langle x(t),\sin(10x(t))\rangle\leq1\}$$. Define an operator $$F:C\rightarrow H$$ by

$$F(x) (t)= \int_{0}^{1} \bigl(x(t)-B(t,s)p\bigl(x(s)\bigr) \bigr)\,ds+q(t)$$

for all $$x \in C$$ and $$t \in[0,1]$$, where

$$B(t,s)=\frac{2tse^{t+s}}{e\sqrt{e^{2}-1}},\qquad p(x)=\cos x,\qquad q(t)=\frac {2te^{t}}{e\sqrt{e^{2}-1}}.$$

As shown in , F is monotone and L-Lipschitz-continuous with $$L=2$$. Let $$f(x(t), y(t))=\langle Fx(t),y(t)-x(t)\rangle$$, $$g(x)(t)=\frac{1}{2}\|x(t)\|^{2}$$ and $$(Ax)(t)=3x(t)$$ for all $$x\in H$$.

Let $$x^{1}(t)=1$$. Take the sequences $$\{\alpha_{k}\}$$, $$\{\beta_{k}\}$$, $$\{\epsilon_{k}\}$$, $$\{\delta_{k}\}$$ of the parameters as follows:

$$\alpha_{k}=\frac{1}{2},\qquad \beta_{k}= \frac{4}{k+1},\qquad \epsilon_{k}=0,\qquad \delta_{k}=3, \qquad \gamma_{k}=\max\bigl\{ 3, \Vert \eta_{k} \Vert \bigr\}$$

for each $$k\geq1$$ and take $$\nu=\frac{1.99}{L_{g}}$$. We take $$\lambda=1$$ according to the numerical tests since the constants of the inverse strong monotonicity of g and f are unknown. Take $$\tau_{k}=\frac{1}{k+1}$$ and $$\gamma=\frac{0.9}{\rho(A^{*}A)}$$ for Algorithm 5.1. We use $$\operatorname{error}=\frac{1}{2}\| P_{C}(x^{k})-x^{k}\|^{2}+ \frac{1}{2}\|P_{Q}(Ax^{k})-Ax^{k}\|^{2}$$ to measure the error of the kth iteration.

The numerical results in Fig. 3 illustrate that Algorithm 4.1 behaves better than Algorithm 5.1.

## Conclusions

We first introduce a new algorithm, which involves a projection at each iteration, and show its strong convergence. We also improve the model proposed in [37] by adding a constraint to the minimization problem of the total environmental fee. Two algorithms are introduced to approximate the solution, and their strong convergence is analyzed.

## References

1. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)

2. Blum, E., Oettli, W.: From optimization and variational inequalities to equilibrium problems. Math. Stud. 63, 123–145 (1994)

3. Byrne, C.: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103–120 (2004)

4. Ceng, L.C.: Approximation of common solutions of a split inclusion problem and a fixed-point problem. J. Appl. Numer. Optim. 1, 1–12 (2019)

5. Censor, Y., Bortfeld, T., Martin, B., Trofimov, A.: A unified approach for inversion problems in intensity modulated radiation therapy. Phys. Med. Biol. 51, 2353–2365 (2006)

6. Censor, Y., Elfving, T., Kopf, N., Bortfeld, T.: The multiple-sets split feasibility problem and its applications for inverse problems. Inverse Probl. 21, 2071–2084 (2005)

7. Censor, Y., Elfving, T.: A multiprojection algorithm using Bregman projections in a product space. Numer. Algorithms 8, 221–239 (1994)

8. Censor, Y., Gibali, A., Reich, S.: Algorithms for the split variational inequality problem. Numer. Algorithms 59, 301–323 (2012)

9. Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)

10. Contreras, J., Klusch, M., Krawczyk, J.B.: Numerical solution to Nash–Cournot equilibria in coupled constraint electricity markets. IEEE Trans. Power Syst. 19, 195–206 (2004)

11. Crombez, G.: A geometrical look at iterative methods for operators with fixed points. Numer. Funct. Anal. Optim. 26, 157–175 (2005)

12. Crombez, G.: A hierarchical presentation of operators with fixed points on Hilbert spaces. Numer. Funct. Anal. Optim. 27, 259–277 (2006)

13. Dong, Q.L., Cho, Y.J., Zhong, L.L., Rassias, T.M.: Inertial projection and contraction algorithms for variational inequalities. J. Glob. Optim. 70, 687–704 (2018)

14. Dong, Q.L., He, S., Zhao, J.: Solving the split equality problem without prior knowledge of operator norms. Optimization 64, 1887–1906 (2015)

15. Dong, Q.L., Lu, Y.Y., Yang, J.: The extragradient algorithm with inertial effects for solving the variational inequality. Optimization 65, 2217–2226 (2016)

16. Dong, Q.L., Tang, Y.C., Cho, Y.J., Rassias, T.M.: “Optimal” choice of the step length of the projection and contraction methods for solving the split feasibility problem. J. Glob. Optim. 71, 341–360 (2018)

17. Dong, Q.L., Yao, Y., He, S.: Weak convergence theorems of the modified relaxed projection algorithms for the split feasibility problem in Hilbert spaces. Optim. Lett. 8, 1031–1046 (2014)

18. Dong, Q.L., Yuan, H.B., Cho, Y.J., Rassias, T.M.: Modified inertial Mann algorithm and inertial CQ-algorithm for nonexpansive mappings. Optim. Lett. 12, 87–102 (2018)

19. Fan, K.: Fixed point and minimax theorems in locally convex topological linear spaces. Proc. Natl. Acad. Sci. USA 38, 121–126 (1952)

20. He, S.N., Tian, H.L.: Selective projection methods for solving a class of variational inequalities. Numer. Algorithms 80, 617–634 (2019)

21. He, S.N., Tian, H.L., Xu, H.K.: The selective projection method for convex feasibility and split feasibility problems. J. Nonlinear Convex Anal. 19, 1199–1215 (2018)

22. Konnov, I.V.: Combined Relaxation Methods for Variational Inequalities. Springer, Berlin (2000)

23. Konnov, I.V.: The method of pairwise variations with tolerances for linearly constrained optimization problems. J. Nonlinear Var. Anal. 1, 25–41 (2017)

24. Moudafi, A., Thakur, B.S.: Solving proximal split feasibility problems without prior knowledge of operator norms. Optim. Lett. 8, 2099–2110 (2014)

25. Muu, L.D., Oettli, W.: Convergence of an adaptive penalty scheme for finding constrained equilibria. Nonlinear Anal. 18, 1159–1166 (1992)

26. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1, 123–231 (2013)

27. Quoc, T.D., Muu, L.D.: Iterative methods for solving monotone equilibrium problems via dual gap functions. Comput. Optim. Appl. 51, 709–728 (2012)

28. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)

29. Santos, P., Scheimberg, S.: An inexact subgradient algorithm for equilibrium problems. Comput. Appl. Math. 30, 91–107 (2011)

30. Shehu, Y., Dong, Q.L., Jiang, D.: Single projection method for pseudo-monotone variational inequality in Hilbert spaces. Optimization 68, 385–409 (2019)

31. Xiao, Y.B., Huang, N.J., Cho, Y.J.: A class of generalized evolution variational inequalities in Banach spaces. Appl. Math. Lett. 25, 914–920 (2012)

32. Xu, H.K.: Averaged mappings and the gradient-projection algorithm. J. Optim. Theory Appl. 150, 360–378 (2011)

33. Yao, Y., Leng, L., Postolache, M., Zheng, X.: Mann-type iteration method for solving the split common fixed point problem. J. Nonlinear Convex Anal. 18, 875–882 (2017)

34. Yao, Y., Liou, Y.C., Yao, J.C.: Iterative algorithms for the split variational inequality and fixed point problems under nonlinear transformations. J. Nonlinear Sci. Appl. 10, 843–854 (2017)

35. Yao, Y., Yao, J.C., Liou, Y.C., Postolache, M.: Iterative algorithms for split common fixed points of demicontractive operators without priori knowledge of operator norms. Carpath. J. Math. 34, 459–466 (2018)

36. Yao, Y.H., Postolache, M., Liou, Y.C.: Strong convergence of a self-adaptive method for the split feasibility problem. Fixed Point Theory Appl. 2013, Article ID 201 (2013)

37. Yen, L.H., Muu, L.D., Huyen, N.T.T.: An algorithm for a class of split feasibility problems: application to a model in electricity production. Math. Methods Oper. Res. 84, 549–565 (2016)

38. Zhao, J.: Solving split equality fixed-point problem of quasi-nonexpansive mappings without prior knowledge of operators norms. Optimization 64, 2619–2630 (2015)

39. Zhao, J., Zong, H.: Iterative algorithms for solving the split feasibility problem in Hilbert spaces. J. Fixed Point Theory Appl. 21, 11 (2018)

### Acknowledgements

We sincerely thank Prof. S. He for his helpful discussion and the reviewers for their valuable suggestions and useful comments that have led to the present improved version of the original manuscript.

### Availability of data and materials

Data sharing not applicable to this article as no datasets were generated during the current study.

## Funding

The first author was supported by the scientific research project of Tianjin Municipal Education Commission (No. 2018KJ253). The fifth author was supported by the Theoretical and Computational Science (TaCS) Center under Computational and Applied Science for Smart Innovation Cluster (CLASSIC), Faculty of Science, KMUTT. The authors acknowledge the financial support provided by King Mongkut’s University of Technology Thonburi through the “KMUTT 55th Anniversary Commemorative Fund”. Furthermore, Poom Kumam was supported by the Thailand Research Fund (TRF) and the King Mongkut’s University of Technology Thonburi (KMUTT) under the TRF Research Scholar Award (Grant No. RSA6080047).

## Author information

### Contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

### Corresponding author

Correspondence to P. Kumam.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests. 