• Research
• Open Access

# Error analysis of variational discretization solving temperature control problems

Journal of Inequalities and Applications 2013, 2013:450

https://doi.org/10.1186/1029-242X-2013-450

• Accepted: 10 September 2013

## Abstract

In this paper, we consider variational discretization solving temperature control problems with pointwise control constraints, where the state and the adjoint state are approximated by piecewise linear finite element functions, while the control is not directly discretized. We derive a priori error estimates of second-order for the control, the state and the adjoint state. Moreover, we obtain a posteriori error estimates. Finally, we present some numerical algorithms for the control problem and do some numerical experiments to illustrate our theoretical results.

## Keywords

• variational discretization
• finite element
• optimal control problems
• a priori error estimates
• a posteriori error estimates

## 1 Introduction

We are interested in a material plate defined in a two-dimensional convex domain Ω with a Lipschitz boundary ∂Ω. For the state y of the material, we choose the temperature distribution, which is maintained equal to zero along the boundary. We denote thermal radiation or positive temperature feedback due to chemical reactions by the term $\varphi \left(y\right)$ (see, e.g., ) and assume that there exists a source $f\in {L}^{2}\left(\mathrm{\Omega }\right)$. This system is governed by the following equation:
$\left\{\begin{array}{ll}-div\left(A\left(x\right)\mathrm{\nabla }y\left(x\right)\right)+\varphi \left(y\left(x\right)\right)=f\left(x\right),& x\in \mathrm{\Omega },\\ y\left(x\right)=0,& x\in \partial \mathrm{\Omega }.\end{array}\right.$
The setting above suggests that we may control the temperature distribution y to come close to a given target by acting with an additional distributed source term u, namely the control function. The corresponding optimal control problem is formulated as follows:
$\left\{\begin{array}{l}{min}_{u\in K}\left\{{\int }_{\mathrm{\Omega }}\left(g\left(y\right)+h\left(u\right)\right)\phantom{\rule{0.2em}{0ex}}dx\right\},\\ -div\left(A\left(x\right)\mathrm{\nabla }y\left(x\right)\right)+\varphi \left(y\left(x\right)\right)=f\left(x\right)+Bu\left(x\right),\phantom{\rule{1em}{0ex}}x\in \mathrm{\Omega },\\ y\left(x\right)=0,\phantom{\rule{1em}{0ex}}x\in \partial \mathrm{\Omega },\end{array}\right.$
(1.1)
where $g\left(\cdot \right)$ and $h\left(\cdot \right)$ are strictly convex, continuously differentiable functions, $h\left(u\right)\to +\mathrm{\infty }$ as ${\parallel u\parallel }_{{L}^{2}\left(\mathrm{\Omega }\right)}\to \mathrm{\infty }$, $A\left(x\right)={\left({a}_{ij}\left(x\right)\right)}_{2\times 2}\in {\left({W}^{1,\mathrm{\infty }}\left(\overline{\mathrm{\Omega }}\right)\right)}^{2\times 2}$ such that $\left(A\left(x\right)\xi \right)\cdot \xi \ge c\mid \xi {\mid }^{2}$, $\mathrm{\forall }\xi \in {\mathbb{R}}^{2}$, $f\left(x\right)\in {L}^{2}\left(\mathrm{\Omega }\right)$, B is a linear continuous operator, and K is defined by
$K=\left\{v\in {L}^{2}\left(\mathrm{\Omega }\right):a\le v\left(x\right)\le b\text{ a.e. in }\mathrm{\Omega }\right\},$

where a and b are two constants.

Optimal control problems are used extensively in many areas of modern life, including social, economic, scientific and engineering numerical simulation . Finite element approximation is probably the most widely used method for computing optimal control problems. A systematic introduction to the finite element method for PDEs and optimal control problems can be found in . Concerning elliptic optimal control problems, a priori error estimates were investigated in [9, 10], a posteriori error estimates based on recovery techniques have been obtained in , a posteriori error estimates of residual type have been derived in , some error estimates and superconvergence results have been established in , and some adaptive finite element methods can be found in . For parabolic optimal control problems, a priori error estimates are established in , and a posteriori error estimates of residual type are investigated in [35, 36]. Recently, error estimates of spectral methods for optimal control problems have been derived in [37, 38], and numerical methods for constrained elliptic control problems with rapidly oscillating coefficients are studied in .

For a constrained optimal control problem, the control has lower regularity than the state and the adjoint state. Most researchers therefore approximate the state and the adjoint state by piecewise linear finite element functions and the control by piecewise constant functions. The projection gradient algorithms constructed in this way yield first-order a priori error estimates for the control [11, 12]. Recently, Borzì considered a second-order discretization and multigrid solution of elliptic nonlinear constrained control problems in , and Hinze introduced a variational discretization concept for optimal control problems and derived second-order a priori error estimates for the control in [41, 42]. The purpose of this paper is to consider variational discretization for convex temperature control problems governed by nonlinear elliptic equations with pointwise control constraints.

In this paper, we adopt the standard notation ${W}^{m,q}\left(\mathrm{\Omega }\right)$ for Sobolev spaces on Ω with the norm ${\parallel \cdot \parallel }_{{W}^{m,q}\left(\mathrm{\Omega }\right)}$ and seminorm ${|\cdot |}_{{W}^{m,q}\left(\mathrm{\Omega }\right)}$. We set ${H}_{0}^{1}\left(\mathrm{\Omega }\right)\equiv \left\{v\in {H}^{1}\left(\mathrm{\Omega }\right):v{|}_{\partial \mathrm{\Omega }}=0\right\}$ and denote ${W}^{m,2}\left(\mathrm{\Omega }\right)$ by ${H}^{m}\left(\mathrm{\Omega }\right)$. In addition, c or C denotes a generic positive constant.

The paper is organized as follows: In Section 2, we introduce a variational discretization approximation scheme for the model problem. In Section 3, we derive a priori error estimates. In Section 4, we derive sharp a posteriori error estimates of residual type. We present some numerical algorithms and do some numerical experiments to verify our theoretical results in the last section.

## 2 Variational discretization approximation for the model problem

We now consider a variational discretization approximation for the model problem (1.1). For ease of exposition, we set $V={H}_{0}^{1}\left(\mathrm{\Omega }\right)$, $U={L}^{2}\left(\mathrm{\Omega }\right)$, $\parallel \cdot \parallel ={\parallel \cdot \parallel }_{{L}^{2}\left(\mathrm{\Omega }\right)}$, ${\parallel \cdot \parallel }_{1,\mathrm{\Omega }}={\parallel \cdot \parallel }_{{H}^{1}\left(\mathrm{\Omega }\right)}$, ${\parallel \cdot \parallel }_{2,\mathrm{\Omega }}={\parallel \cdot \parallel }_{{H}^{2}\left(\mathrm{\Omega }\right)}$ and
$\begin{array}{c}a\left(y,w\right)={\int }_{\mathrm{\Omega }}\left(A\left(x\right)\mathrm{\nabla }y\right)\cdot \mathrm{\nabla }w,\phantom{\rule{1em}{0ex}}\mathrm{\forall }y,w\in V,\hfill \\ \left(u,w\right)={\int }_{\mathrm{\Omega }}u\cdot w,\phantom{\rule{1em}{0ex}}\mathrm{\forall }u,w\in U.\hfill \end{array}$
It follows from the assumptions on $A\left(x\right)$ that
$a\left(y,y\right)\ge c{\parallel y\parallel }_{1,\mathrm{\Omega }}^{2},\phantom{\rule{2em}{0ex}}|a\left(y,w\right)|\le C{\parallel y\parallel }_{1,\mathrm{\Omega }}{\parallel w\parallel }_{1,\mathrm{\Omega }},\phantom{\rule{1em}{0ex}}\mathrm{\forall }y,w\in V.$
(2.1)
Then the standard weak formulation of the state equation is
$a\left(y,w\right)+\left(\varphi \left(y\right),w\right)=\left(f+Bu,w\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }w\in V,$
(2.2)

where we assume that the function $\varphi \left(\cdot \right)\in {W}^{1,\mathrm{\infty }}\left(-R,R\right)$ for any $R>0$, ${\varphi }^{\prime }\left(\cdot \right)\ge 0$ and ${\varphi }^{\prime }\left(y\right)\in {L}^{2}\left(\mathrm{\Omega }\right)$ for any $y\in {H}^{1}\left(\mathrm{\Omega }\right)$. Thus, the equation above has a unique solution.
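
As a rough illustration of the weak formulation (2.2), the following sketch solves a one-dimensional analogue with piecewise linear elements and Newton's method. It is only a sketch under stated assumptions: Ω = (0, 1), A = 1, the control term Bu absorbed into f, $\varphi \left(y\right)={y}^{3}$ (which satisfies ${\varphi }^{\prime }\ge 0$), and nodal (lumped) quadrature for the load and the nonlinear term. The function name `solve_state` and all parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def solve_state(f, phi, dphi, n, tol=1e-12, max_iter=50):
    """Newton iteration for -y'' + phi(y) = f on (0, 1), y(0) = y(1) = 0.

    Piecewise linear elements on n uniform cells; the load and the
    nonlinear term use nodal (lumped) quadrature, an assumption made
    here for brevity.
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    m = n - 1                                   # number of interior nodes
    # P1 stiffness matrix of a(y, w) with A = 1
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    rhs = h * f(x[1:-1])
    y = np.zeros(m)
    for _ in range(max_iter):
        residual = K @ y + h * phi(y) - rhs
        if np.linalg.norm(residual, np.inf) < tol:
            break
        jacobian = K + h * np.diag(dphi(y))     # dphi >= 0 keeps this positive definite
        y -= np.linalg.solve(jacobian, residual)
    y_full = np.zeros(n + 1)                    # re-attach the zero boundary values
    y_full[1:-1] = y
    return x, y_full

# Manufactured solution y = sin(pi x), so f = pi^2 sin(pi x) + sin(pi x)^3.
f = lambda x: np.pi**2 * np.sin(np.pi * x) + np.sin(np.pi * x) ** 3
x, yh = solve_state(f, lambda y: y**3, lambda y: 3.0 * y**2, n=64)
err = np.max(np.abs(yh - np.sin(np.pi * x)))    # maximum nodal error
```

Because ${\varphi }^{\prime }\ge 0$, every Newton matrix is the stiffness matrix plus a nonnegative diagonal, so each linear solve is well posed; this mirrors the monotonicity argument behind the unique solvability of (2.2).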

Throughout the paper, we impose the following assumptions:

(A1) ${g}^{\prime }\left(\cdot \right)$ and ${h}^{\prime }\left(\cdot \right)$ are Lipschitz continuous, namely,
$\begin{array}{c}|{g}^{\prime }\left({y}_{1}\right)-{g}^{\prime }\left({y}_{2}\right)|\le C|{y}_{1}-{y}_{2}|,\phantom{\rule{1em}{0ex}}\mathrm{\forall }{y}_{1},{y}_{2}\in {L}^{2}\left(\mathrm{\Omega }\right),\hfill \\ |{h}^{\prime }\left(u\left({x}_{1}\right)\right)-{h}^{\prime }\left(u\left({x}_{2}\right)\right)|\le C|{x}_{1}-{x}_{2}|,\phantom{\rule{1em}{0ex}}\mathrm{\forall }u\in K,{x}_{1},{x}_{2}\in \overline{\mathrm{\Omega }}.\hfill \end{array}$
(A2) There exists a positive constant m such that
${h}^{″}\left(u\right)\ge m,\phantom{\rule{1em}{0ex}}\mathrm{\forall }u\in K.$
Then the model problem (1.1) can be restated as
$\left\{\begin{array}{l}{min}_{u\in K}\left\{{\int }_{\mathrm{\Omega }}\left(g\left(y\right)+h\left(u\right)\right)\phantom{\rule{0.2em}{0ex}}dx\right\},\\ a\left(y,w\right)+\left(\varphi \left(y\right),w\right)=\left(f+Bu,w\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }w\in V.\end{array}\right.$
(2.3)
It is well known (see, e.g., ) that the control problem (2.3) has a solution $\left(y,u\right)\in V×K$, and that if the pair $\left(y,u\right)\in V×K$ is the solution of (2.3), then there is an adjoint state $p\in V$ such that the triplet $\left(y,p,u\right)\in V×V×K$ satisfies the following optimality conditions:
$a\left(y,w\right)+\left(\varphi \left(y\right),w\right)=\left(f+Bu,w\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }w\in V,$
(2.4)
$a\left(q,p\right)+\left({\varphi }^{\prime }\left(y\right)p,q\right)=\left({g}^{\prime }\left(y\right),q\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }q\in V,$
(2.5)
$\left({h}^{\prime }\left(u\right)+{B}^{\ast }p,v-u\right)\ge 0,\phantom{\rule{1em}{0ex}}\mathrm{\forall }v\in K,$
(2.6)

where ${B}^{\ast }$ is the adjoint operator of B.

Lemma 2.1 Suppose that assumptions (A1)-(A2) are satisfied. Let $p\in V$ be the solution of (2.4)-(2.6). Then the following equation:
${h}^{\prime }\left(s\left(x\right)\right)+{B}^{\ast }p\left(x\right)=0,$
(2.7)

admits a unique solution $s\left(x\right)$ and $s\left(x\right)\in {C}^{0,1}\left(\overline{\mathrm{\Omega }}\right)$.

Proof It follows from ${h}^{″}\left(u\right)\ge m>0$ that (2.7) has a unique solution. Note that ${g}^{\prime }\left(y\right)\in {L}^{2}\left(\mathrm{\Omega }\right)$. From the regularity theory of partial differential equations (see, e.g., ), we have
$p\left(x\right)\in {H}_{0}^{1}\left(\mathrm{\Omega }\right)\cap {W}^{2,2}\left(\mathrm{\Omega }\right).$
Because Ω is a two-dimensional convex domain, the embedding theorem gives
$p\left(x\right)\in {C}^{0,1}\left(\overline{\mathrm{\Omega }}\right).$
From (A2) and (2.7), we get
$\begin{array}{r}m|s\left(x\right)-s\left({x}_{0}\right)|\\ \phantom{\rule{1em}{0ex}}\le |{\int }_{0}^{1}{h}^{″}\left(\theta s\left(x\right)+\left(1-\theta \right)s\left({x}_{0}\right)\right)\phantom{\rule{0.2em}{0ex}}d\theta \cdot \left(s\left(x\right)-s\left({x}_{0}\right)\right)|\\ \phantom{\rule{1em}{0ex}}=|{h}^{\prime }\left(s\left(x\right)\right)-{h}^{\prime }\left(s\left({x}_{0}\right)\right)|\\ \phantom{\rule{1em}{0ex}}=|-{B}^{\ast }p\left(x\right)+{B}^{\ast }p\left({x}_{0}\right)|\\ \phantom{\rule{1em}{0ex}}\le C|x-{x}_{0}|.\end{array}$

Consequently, $s\left(x\right)\in {C}^{0,1}\left(\overline{\mathrm{\Omega }}\right)$, which completes the proof. □

We introduce the following pointwise projection operator:
${\mathrm{\Pi }}_{\left[a,b\right]}\left(g\left(x\right)\right)=max\left(a,min\left(b,g\left(x\right)\right)\right).$
(2.8)
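
The projection (2.8) acts pointwise, so on sampled values it reduces to a clamp. The snippet below is a minimal sketch (the function name `project` and the sample values are illustrative assumptions) that also checks numerically that the projection does not increase pointwise distances, that is, that it is Lipschitz continuous with constant 1.

```python
import numpy as np

def project(g, a, b):
    """Pointwise projection onto [a, b], cf. (2.8): max(a, min(b, g))."""
    return np.maximum(a, np.minimum(b, g))

# Two arbitrary sample vectors (illustrative values only).
g1 = np.array([-2.0, -0.5, 0.3, 1.7])
g2 = np.array([-1.0, 0.1, 0.9, 1.2])
p1 = project(g1, 0.0, 1.0)
p2 = project(g2, 0.0, 1.0)

# Non-expansiveness: |P(g1) - P(g2)| <= |g1 - g2| componentwise.
nonexpansive = bool(np.all(np.abs(p1 - p2) <= np.abs(g1 - g2) + 1e-15))
```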

It is clear that ${\mathrm{\Pi }}_{\left[a,b\right]}\left(\cdot \right)$ is Lipschitz continuous with constant 1. As in , it is easy to prove the following lemma.

Lemma 2.2 Let $\left(y,p,u\right)$ and $s\left(x\right)$ be the solutions of (2.4)-(2.6) and (2.7), respectively. Assume that assumptions (A1)-(A2) are satisfied. Then
$u\left(x\right)={\mathrm{\Pi }}_{\left[a,b\right]}\left(s\left(x\right)\right).$
(2.9)
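
One standard way to see (2.9) is sketched here, under the assumption (not spelled out in the text) that the variational inequality (2.6) can be localized, so that for almost every $x\in \mathrm{\Omega }$
$\left({h}^{\prime }\left(u\left(x\right)\right)+{B}^{\ast }p\left(x\right)\right)\left(t-u\left(x\right)\right)\ge 0,\phantom{\rule{1em}{0ex}}\mathrm{\forall }t\in \left[a,b\right].$
By (A2), ${h}^{\prime }$ is strictly increasing, and by (2.7), ${h}^{\prime }\left(s\left(x\right)\right)=-{B}^{\ast }p\left(x\right)$. Three cases can then be distinguished. If $a<u\left(x\right)<b$, both signs of $t-u\left(x\right)$ are admissible, so ${h}^{\prime }\left(u\left(x\right)\right)=-{B}^{\ast }p\left(x\right)={h}^{\prime }\left(s\left(x\right)\right)$ and hence $u\left(x\right)=s\left(x\right)$. If $u\left(x\right)=a$, then ${h}^{\prime }\left(a\right)\ge -{B}^{\ast }p\left(x\right)={h}^{\prime }\left(s\left(x\right)\right)$, so $s\left(x\right)\le a$. If $u\left(x\right)=b$, then ${h}^{\prime }\left(b\right)\le {h}^{\prime }\left(s\left(x\right)\right)$, so $s\left(x\right)\ge b$. In each case $u\left(x\right)=max\left(a,min\left(b,s\left(x\right)\right)\right)$.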

Remark 2.1 We should point out that (2.6) and (2.9) are equivalent. The theory also extends to more complex situations; for example, when K is characterized by a bound on the integral of u over Ω, namely ${\int }_{\mathrm{\Omega }}u\left(x\right)\phantom{\rule{0.2em}{0ex}}dx\ge 0$, similar results hold.

Let ${\mathcal{T}}^{h}$ be a regular triangulation of Ω, such that $\overline{\mathrm{\Omega }}={\bigcup }_{\tau \in {\mathcal{T}}^{h}}\overline{\tau }$. Let $h={max}_{\tau \in {\mathcal{T}}^{h}}\left\{{h}_{\tau }\right\}$, where ${h}_{\tau }$ denotes the diameter of the element τ. Associated with ${\mathcal{T}}^{h}$ is a finite dimensional subspace ${S}^{h}$ of $C\left(\overline{\mathrm{\Omega }}\right)$, such that $\chi {|}_{\tau }$ are polynomials of m-order ($m\ge 1$) for all $\chi \in {S}^{h}$ and $\tau \in {\mathcal{T}}^{h}$. Let ${V}^{h}=\left\{{v}_{h}\in {S}^{h}:{v}_{h}{|}_{\partial \mathrm{\Omega }}=0\right\}$. It is easy to see that ${V}^{h}\subset V$.

Then a possible variational discretization approximation scheme of (2.3) is as follows:
$\left\{\begin{array}{l}{min}_{{u}_{h}\in K}\left\{{\int }_{\mathrm{\Omega }}\left(g\left({y}_{h}\right)+h\left({u}_{h}\right)\right)\phantom{\rule{0.2em}{0ex}}dx\right\},\\ a\left({y}_{h},{w}_{h}\right)+\left(\varphi \left({y}_{h}\right),{w}_{h}\right)=\left(f+B{u}_{h},{w}_{h}\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }{w}_{h}\in {V}^{h}.\end{array}\right.$
(2.10)
It is well known (see, e.g., ) that control problem (2.10) has a solution $\left({y}_{h},{u}_{h}\right)\in {V}^{h}×K$, and that if the pair $\left({y}_{h},{u}_{h}\right)\in {V}^{h}×K$ is the solution of (2.10), then there is an adjoint state ${p}_{h}\in {V}^{h}$ such that the triplet $\left({y}_{h},{p}_{h},{u}_{h}\right)\in {V}^{h}×{V}^{h}×K$ satisfies the following optimality conditions:
$a\left({y}_{h},{w}_{h}\right)+\left(\varphi \left({y}_{h}\right),{w}_{h}\right)=\left(f+B{u}_{h},{w}_{h}\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }{w}_{h}\in {V}^{h},$
(2.11)
$a\left({q}_{h},{p}_{h}\right)+\left({\varphi }^{\prime }\left({y}_{h}\right){p}_{h},{q}_{h}\right)=\left({g}^{\prime }\left({y}_{h}\right),{q}_{h}\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }{q}_{h}\in {V}^{h},$
(2.12)
$\left({h}^{\prime }\left({u}_{h}\right)+{B}^{\ast }{p}_{h},v-{u}_{h}\right)\ge 0,\phantom{\rule{1em}{0ex}}\mathrm{\forall }v\in K.$
(2.13)

Similar to Lemma 2.2, it is easy to show the following lemma.

Lemma 2.3 Suppose that assumptions (A1)-(A2) are satisfied. Let $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solution of (2.11)-(2.13), and ${s}_{h}\left(x\right)$ is the solution of the following equation:
${h}^{\prime }\left({s}_{h}\left(x\right)\right)+{B}^{\ast }{p}_{h}\left(x\right)=0.$
(2.14)
Then we have
${u}_{h}\left(x\right)={\mathrm{\Pi }}_{\left[a,b\right]}\left({s}_{h}\left(x\right)\right).$
(2.15)

Remark 2.2 In many applications, the objective functional is uniformly convex near the solution u, which is assumed in many studies on numerical methods for the problem; see, for example, [20, 43]. In this paper, we assume that $g\left(\cdot \right)$ and $h\left(\cdot \right)$ are strictly convex, continuously differentiable functions. For instance, for the frequently used choice $h\left(u\right)=\frac{\alpha }{2}{u}^{2}$, the exact solution of the variational inequality (2.13) is ${u}_{h}\left(x\right)=max\left(a,min\left(b,-\frac{1}{\alpha }{B}^{\ast }{p}_{h}\left(x\right)\right)\right)$, and when solving the problem numerically, we can replace ${u}_{h}\left(x\right)$ by $max\left(a,min\left(b,-\frac{1}{\alpha }{B}^{\ast }{p}_{h}\left(x\right)\right)\right)$ in our program.
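
The formula in Remark 2.2 can be sketched in a few lines. The snippet assumes a one-dimensional mesh, B equal to the identity (so that ${B}^{\ast }{p}_{h}={p}_{h}$), a quadratic control cost with weight α, and made-up nodal values for ${p}_{h}$; the name `control_from_adjoint` is hypothetical. It illustrates that the control is evaluated from the adjoint state at arbitrary points and is never stored on the mesh itself.

```python
import numpy as np

def control_from_adjoint(x_eval, nodes, p_nodal, alpha, a, b):
    """Evaluate u_h(x) = max(a, min(b, -p_h(x) / alpha)) at the points x_eval.

    p_h is the piecewise linear function with values p_nodal at nodes;
    B is assumed to be the identity.
    """
    p_vals = np.interp(x_eval, nodes, p_nodal)   # piecewise linear p_h
    return np.clip(-p_vals / alpha, a, b)

nodes = np.linspace(0.0, 1.0, 5)                 # coarse mesh with 4 cells
p_nodal = -np.sin(np.pi * nodes)                 # illustrative adjoint values
x = np.linspace(0.0, 1.0, 101)                   # fine evaluation grid
u = control_from_adjoint(x, nodes, p_nodal, alpha=1.0, a=0.0, b=0.5)
```

Since ${u}_{h}$ is the projection of a piecewise linear function, the boundary of its active set generally falls inside elements rather than at mesh nodes, which is the feature that distinguishes variational discretization from a piecewise constant or piecewise linear control space.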

## 3 A priori error estimates

We now derive a priori error estimates of the variational discretization approximation scheme. Just for ease of exposition, let
$J\left(u\right)={\int }_{\mathrm{\Omega }}\left(g\left(y\right)+h\left(u\right)\right)\phantom{\rule{0.2em}{0ex}}dx,$
and let ${J}^{\prime }\left(u\right)$ denote the Fréchet derivative of $J\left(u\right)$ at u. Similarly to (2.4)-(2.6), we can prove that
$\begin{array}{r}\left({J}^{\prime }\left(u\right),v\right)=\left({h}^{\prime }\left(u\right)+{B}^{\ast }p,v\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }v\in K,\\ \left({J}^{\prime }\left({u}_{h}\right),v\right)=\left({h}^{\prime }\left({u}_{h}\right)+{B}^{\ast }p\left({u}_{h}\right),v\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }v\in K,\end{array}$
where $p\left({u}_{h}\right)$ satisfies the following system:
$a\left(y\left({u}_{h}\right),w\right)+\left(\varphi \left(y\left({u}_{h}\right)\right),w\right)=\left(f+B{u}_{h},w\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }w\in V,$
(3.1)
$a\left(q,p\left({u}_{h}\right)\right)+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)p\left({u}_{h}\right),q\right)=\left({g}^{\prime }\left(y\left({u}_{h}\right)\right),q\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }q\in V.$
(3.2)

Let ${\pi }_{h}:{C}^{0}\left(\overline{\mathrm{\Omega }}\right)\to {V}^{h}$ be the standard Lagrange interpolation operator such that for any $v\in {C}^{0}\left(\overline{\mathrm{\Omega }}\right)$, ${\pi }_{h}v\left({P}_{i}\right)=v\left({P}_{i}\right)$ for all ${P}_{i}\in \mathbf{P}$, where P is the vertex set associated with the triangulation ${\mathcal{T}}^{h}$. Let n denote the dimension of the domain Ω. We have the following result:

Lemma 3.1 Let ${\pi }_{h}$ be the standard Lagrange interpolation operator. For $m=0$ or 1, $q>\frac{n}{2}$ and $\mathrm{\forall }v\in {W}^{2,q}\left(\mathrm{\Omega }\right)$, we have
${|v-{\pi }_{h}v|}_{{W}^{m,q}\left(\mathrm{\Omega }\right)}\le C{h}^{2-m}{|v|}_{{W}^{2,q}\left(\mathrm{\Omega }\right)}.$
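
The rates in Lemma 3.1 can be observed numerically. The sketch below uses a one-dimensional analogue with q = 2 and the smooth test function $v\left(x\right)=sin\left(\pi x\right)$ on (0, 1); these choices, the function name `interp_errors` and the mesh sizes are assumptions made for illustration. It measures the ${L}^{2}$ error (m = 0) and the ${H}^{1}$ seminorm error (m = 1) of the piecewise linear interpolant on three uniform meshes.

```python
import numpy as np

def interp_errors(n):
    """L2 and H1-seminorm errors of the P1 interpolant of sin(pi x) on n cells."""
    nodes = np.linspace(0.0, 1.0, n + 1)
    gp, gw = np.polynomial.legendre.leggauss(4)     # Gauss rule on [-1, 1]
    l2_sq = h1_sq = 0.0
    for xl, xr in zip(nodes[:-1], nodes[1:]):
        h = xr - xl
        x = 0.5 * (xl + xr) + 0.5 * h * gp          # quadrature points in the cell
        w = 0.5 * h * gw
        vl, vr = np.sin(np.pi * xl), np.sin(np.pi * xr)
        slope = (vr - vl) / h                       # derivative of the interpolant
        v_interp = vl + slope * (x - xl)
        l2_sq += np.sum(w * (np.sin(np.pi * x) - v_interp) ** 2)
        h1_sq += np.sum(w * (np.pi * np.cos(np.pi * x) - slope) ** 2)
    return np.sqrt(l2_sq), np.sqrt(h1_sq)

errs = np.array([interp_errors(n) for n in (8, 16, 32)])
# Observed orders under mesh halving: column 0 is m = 0, column 1 is m = 1.
rates = np.log2(errs[:-1] / errs[1:])
```

Under these assumptions the first column of `rates` should be close to 2 and the second close to 1, in line with the exponent 2 − m in the lemma.
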
Lemma 3.2 Let $\left({y}_{h},{p}_{h},{u}_{h}\right)$ and $\left(y\left({u}_{h}\right),p\left({u}_{h}\right)\right)$ be the solutions of (2.11)-(2.13) and (3.1)-(3.2), respectively. Assume that $p\left({u}_{h}\right),y\left({u}_{h}\right)\in {H}^{2}\left(\mathrm{\Omega }\right)$ and ${\varphi }^{\prime }\left(\cdot \right)$ is locally Lipschitz continuous. Then there exists a constant C independent of h such that
${\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }_{1,\mathrm{\Omega }}+{\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}\le Ch.$
(3.3)
Proof From ${\varphi }^{\prime }\left(\cdot \right)\ge 0$, (2.12), (3.2) and the embedding theorem ${\parallel v\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}\le C{\parallel v\parallel }_{{H}^{1}\left(\mathrm{\Omega }\right)}$, we have
$\begin{array}{r}c{\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}\\ \phantom{\rule{1em}{0ex}}\le a\left(p\left({u}_{h}\right)-{p}_{h},p\left({u}_{h}\right)-{p}_{h}\right)+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\left(p\left({u}_{h}\right)-{p}_{h}\right),p\left({u}_{h}\right)-{p}_{h}\right)\\ \phantom{\rule{1em}{0ex}}=a\left(p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right),p\left({u}_{h}\right)-{p}_{h}\right)+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\left(p\left({u}_{h}\right)-{p}_{h}\right),p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right)\right)\\ \phantom{\rule{2em}{0ex}}+\left({g}^{\prime }\left(y\left({u}_{h}\right)\right)-{g}^{\prime }\left({y}_{h}\right),{\pi }_{h}p\left({u}_{h}\right)-{p}_{h}\right)+\left(\left({\varphi }^{\prime }\left({y}_{h}\right)-{\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\right){p}_{h},{\pi }_{h}p\left({u}_{h}\right)-{p}_{h}\right)\\ \phantom{\rule{1em}{0ex}}\le C{\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}{\parallel p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel \parallel {\pi }_{h}p\left({u}_{h}\right)-{p}_{h}\parallel \\ \phantom{\rule{2em}{0ex}}+C\parallel {\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\parallel {\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}{\parallel p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right)\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}\\ \phantom{\rule{2em}{0ex}}+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel {\parallel {p}_{h}\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}{\parallel {\pi }_{h}p\left({u}_{h}\right)-{p}_{h}\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}\\ \phantom{\rule{1em}{0ex}}\le C\left(\delta \right){\parallel p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}+C\left(\delta 
\right){\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }^{2}\\ \phantom{\rule{2em}{0ex}}+C\delta \left({\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}+{\parallel {\pi }_{h}p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}\right)\\ \phantom{\rule{1em}{0ex}}\le C\left(\delta \right){\parallel p\left({u}_{h}\right)-{\pi }_{h}p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}+C\left(\delta \right){\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }^{2}+C\delta {\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}.\end{array}$
(3.4)
Since $p\left({u}_{h}\right)\in {H}^{2}\left(\mathrm{\Omega }\right)$, choosing δ sufficiently small in (3.4) and using Lemma 3.1, we obtain
$\begin{array}{rl}{\parallel p\left({u}_{h}\right)-{p}_{h}\parallel }_{1,\mathrm{\Omega }}& \le C{\parallel p\left({u}_{h}\right)-{\pi }_{h}\left(p\left({u}_{h}\right)\right)\parallel }_{1,\mathrm{\Omega }}+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel \\ & \le Ch{\parallel p\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel \\ & \le Ch+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel .\end{array}$
(3.5)
Similarly, we can prove that
${\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }_{1,\mathrm{\Omega }}\le Ch{\parallel y\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}\le Ch.$
(3.6)

Then (3.3) follows from (3.5)-(3.6). □

In order to derive sharp a priori estimates, we introduce the following auxiliary problems:
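$\left\{\begin{array}{ll}-div\left({A}^{\ast }\mathrm{\nabla }\xi \right)+\mathrm{\Phi }\xi ={F}_{1},& x\in \mathrm{\Omega },\\ \xi =0,& x\in \partial \mathrm{\Omega },\end{array}\right.$
(3.7)
$\left\{\begin{array}{ll}-div\left(A\mathrm{\nabla }\zeta \right)+{\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\zeta ={F}_{2},& x\in \mathrm{\Omega },\\ \zeta =0,& x\in \partial \mathrm{\Omega },\end{array}\right.$
(3.8)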
where
$\mathrm{\Phi }=\left\{\begin{array}{ll}\frac{\varphi \left(y\left({u}_{h}\right)\right)-\varphi \left({y}_{h}\right)}{y\left({u}_{h}\right)-{y}_{h}},& y\left({u}_{h}\right)\ne {y}_{h},\\ {\varphi }^{\prime }\left({y}_{h}\right),& y\left({u}_{h}\right)={y}_{h}.\end{array}\right.$
From the regularity estimates (see, e.g., ), we obtain
${\parallel \xi \parallel }_{2,\mathrm{\Omega }}\le C\parallel {F}_{1}\parallel ,\phantom{\rule{2em}{0ex}}{\parallel \zeta \parallel }_{2,\mathrm{\Omega }}\le C\parallel {F}_{2}\parallel .$
Lemma 3.3 Let $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solution of (2.11)-(2.13). Suppose that $y\left({u}_{h}\right),p\left({u}_{h}\right)\in {H}^{2}\left(\mathrm{\Omega }\right)$ and ${\varphi }^{\prime }\left(\cdot \right)$ is locally Lipschitz continuous. Then there exists a constant C independent of h such that
$\parallel y\left({u}_{h}\right)-{y}_{h}\parallel +\parallel p\left({u}_{h}\right)-{p}_{h}\parallel \le C{h}^{2}.$
(3.9)
Proof Let ${F}_{1}=y\left({u}_{h}\right)-{y}_{h}$ and ${\xi }_{h}={\pi }_{h}\xi$. We have
$\begin{array}{rl}{\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }^{2}& =a\left(y\left({u}_{h}\right)-{y}_{h},\xi \right)+\left(\mathrm{\Phi }\xi ,y\left({u}_{h}\right)-{y}_{h}\right)\\ & =a\left(y\left({u}_{h}\right)-{y}_{h},\xi -{\xi }_{h}\right)+\left(\mathrm{\Phi }\xi ,y\left({u}_{h}\right)-{y}_{h}\right)-\left(\varphi \left(y\left({u}_{h}\right)\right)-\varphi \left({y}_{h}\right),{\xi }_{h}\right)\\ & =a\left(y\left({u}_{h}\right)-{y}_{h},\xi -{\xi }_{h}\right)+\left(\varphi \left(y\left({u}_{h}\right)\right)-\varphi \left({y}_{h}\right),\xi -{\xi }_{h}\right)\\ & \le C{\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }_{1,\mathrm{\Omega }}{\parallel \xi -{\xi }_{h}\parallel }_{1,\mathrm{\Omega }}+C\parallel y\left({u}_{h}\right)-{y}_{h}\parallel \parallel \xi -{\xi }_{h}\parallel \\ & \le C{\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }_{1,\mathrm{\Omega }}{\parallel \xi -{\xi }_{h}\parallel }_{1,\mathrm{\Omega }}.\end{array}$
(3.10)
Note that
${\parallel \xi -{\xi }_{h}\parallel }_{1,\mathrm{\Omega }}\le Ch{\parallel \xi \parallel }_{2,\mathrm{\Omega }}\le Ch\parallel y\left({u}_{h}\right)-{y}_{h}\parallel .$
(3.11)
Thus,
$\parallel y\left({u}_{h}\right)-{y}_{h}\parallel \le Ch{\parallel y\left({u}_{h}\right)-{y}_{h}\parallel }_{1,\mathrm{\Omega }}\le C{h}^{2}.$
(3.12)
Similarly, letting ${F}_{2}=p\left({u}_{h}\right)-{p}_{h}$ and ${\zeta }_{h}={\pi }_{h}\zeta$, we obtain
$\parallel p\left({u}_{h}\right)-{p}_{h}\parallel \le C{h}^{2}.$
(3.13)

From (3.12) and (3.13), we get (3.9). □

Lemma 3.4 Let $\left(y,p,u\right)$ and $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solutions of (2.4)-(2.6) and (2.11)-(2.13), respectively. Assume that all the conditions in Lemma 3.3 are valid. Then there exists a constant C independent of h such that
$\parallel u-{u}_{h}\parallel \le C{h}^{2}.$
(3.14)
Proof It is clear that
$\left({J}^{\prime }\left(v\right)-{J}^{\prime }\left(u\right),v-u\right)\ge c{\parallel v-u\parallel }^{2},\phantom{\rule{1em}{0ex}}\mathrm{\forall }v,u\in K.$
(3.15)
By using (2.6) and (2.13), we have
$\begin{array}{rl}c{\parallel u-{u}_{h}\parallel }^{2}& \le \left({J}^{\prime }\left(u\right)-{J}^{\prime }\left({u}_{h}\right),u-{u}_{h}\right)\\ & =\left({h}^{\prime }\left(u\right)+{B}^{\ast }p,u-{u}_{h}\right)-\left({h}^{\prime }\left({u}_{h}\right)+{B}^{\ast }p\left({u}_{h}\right),u-{u}_{h}\right)\\ & \le -\left({h}^{\prime }\left({u}_{h}\right)+{B}^{\ast }{p}_{h},u-{u}_{h}\right)+\left({B}^{\ast }{p}_{h}-{B}^{\ast }p\left({u}_{h}\right),u-{u}_{h}\right)\\ & \le \left({B}^{\ast }{p}_{h}-{B}^{\ast }p\left({u}_{h}\right),u-{u}_{h}\right)\\ & \le C\parallel {p}_{h}-p\left({u}_{h}\right)\parallel \parallel u-{u}_{h}\parallel .\end{array}$
(3.16)

From (3.9) and (3.16), we derive (3.14). □

Now we combine Lemmas 3.2-3.4 to obtain the following main result.

Theorem 3.1 Let $\left(y,p,u\right)$ and $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solutions of (2.4)-(2.6) and (2.11)-(2.13), respectively. Assume that all the conditions in Lemmas 3.2-3.4 are valid. Then we have
$\parallel u-{u}_{h}\parallel +\parallel y-{y}_{h}\parallel +\parallel p-{p}_{h}\parallel \le C{h}^{2}.$
(3.17)
Proof Note that
$\parallel p-{p}_{h}\parallel \le \parallel p-p\left({u}_{h}\right)\parallel +\parallel p\left({u}_{h}\right)-{p}_{h}\parallel ,$
(3.18)
$\parallel y-{y}_{h}\parallel \le \parallel y-y\left({u}_{h}\right)\parallel +\parallel y\left({u}_{h}\right)-{y}_{h}\parallel .$
(3.19)
From (2.4)-(2.5), (3.1)-(3.2) and the regularity estimates, we have
$\parallel p-p\left({u}_{h}\right)\parallel \le {\parallel p-p\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}\le C\parallel y-y\left({u}_{h}\right)\parallel ,$
(3.20)
$\parallel y-y\left({u}_{h}\right)\parallel \le {\parallel y-y\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}\le C\parallel u-{u}_{h}\parallel .$
(3.21)

Then, (3.17) follows from (3.9), (3.14) and (3.18)-(3.21). □
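
A second-order estimate such as (3.17) is usually checked by computing observed convergence orders from errors on a sequence of meshes. The sketch below shows that computation only. The error values are synthetic, generated from the model $e=C{h}^{2}$ with a made-up constant, and are not results from the paper; `observed_orders` is a hypothetical helper name.

```python
import numpy as np

def observed_orders(h, err):
    """Observed order between consecutive meshes: log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    h = np.asarray(h, dtype=float)
    err = np.asarray(err, dtype=float)
    return np.log(err[:-1] / err[1:]) / np.log(h[:-1] / h[1:])

h = np.array([1 / 8, 1 / 16, 1 / 32, 1 / 64])
err = 3.0 * h**2            # synthetic errors that follow e = C h^2 exactly
orders = observed_orders(h, err)
```

With measured errors in place of the synthetic ones, values of `orders` approaching 2 would be consistent with (3.17), while values near 1 would be what one expects for the control under a piecewise constant control discretization.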

## 4 A posteriori error estimates

We now derive a posteriori error estimates for the variational discretization approximation scheme. The following lemmas are very important in deriving a posteriori error estimates of residual type.

Lemma 4.1 For all $v\in {W}^{1,q}\left(\mathrm{\Omega }\right)$, $1\le q<\mathrm{\infty }$,
${\parallel v\parallel }_{{W}^{0,q}\left(\partial \tau \right)}\le C\left({h}_{\tau }^{-\frac{1}{q}}{\parallel v\parallel }_{{W}^{0,q}\left(\tau \right)}+{h}_{\tau }^{1-\frac{1}{q}}{|v|}_{{W}^{1,q}\left(\tau \right)}\right).$
(4.1)
Lemma 4.2 Let $\left(y,p,u\right)$ and $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solutions of (2.4)-(2.6) and (2.11)-(2.13), respectively. Then we have
$\parallel u-{u}_{h}\parallel \le C\parallel {p}_{h}-p\left({u}_{h}\right)\parallel ,$
(4.2)

where $p\left({u}_{h}\right)$ is defined in (3.2).

Proof It follows from (2.6) and (2.13) that
$\begin{array}{rl}c{\parallel u-{u}_{h}\parallel }^{2}\le & \left({J}^{\prime }\left(u\right),u-{u}_{h}\right)-\left({J}^{\prime }\left({u}_{h}\right),u-{u}_{h}\right)\\ \le & -\left({J}^{\prime }\left({u}_{h}\right),u-{u}_{h}\right)\\ =& -\left({h}^{\prime }\left({u}_{h}\right)+{B}^{\ast }{p}_{h},u-{u}_{h}\right)\\ & +\left({B}^{\ast }{p}_{h}-{B}^{\ast }p\left({u}_{h}\right),u-{u}_{h}\right)\\ \le & C\left(\delta \right){\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }^{2}+\delta {\parallel u-{u}_{h}\parallel }^{2}.\end{array}$
(4.3)

Let δ be small enough, then (4.2) follows from (4.3). □

Lemma 4.3 Let $\left({y}_{h},{p}_{h},{u}_{h}\right)$ and $\left(y\left({u}_{h}\right),p\left({u}_{h}\right)\right)$ be the solutions of (2.11)-(2.13) and (3.1)-(3.2), respectively. Assume that ${\varphi }^{\prime }\left(\cdot \right)$ is locally Lipschitz continuous. Then there exists a positive constant C independent of h such that
${\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}+{\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}\le C\left({\eta }_{1}^{2}+{\eta }_{2}^{2}\right),$
(4.4)
where
$\begin{array}{r}{\eta }_{1}^{2}=\sum _{\tau \in {\mathcal{T}}^{h}}{h}_{\tau }^{2}{\int }_{\tau }{\left({g}^{\prime }\left({y}_{h}\right)+div\left({A}^{\ast }\mathrm{\nabla }{p}_{h}\right)-{\varphi }^{\prime }\left({y}_{h}\right){p}_{h}\right)}^{2}\phantom{\rule{0.2em}{0ex}}dx+\sum _{l\cap \partial \mathrm{\Omega }=\mathrm{\varnothing }}{h}_{l}{\int }_{l}{\left[{A}^{\ast }\mathrm{\nabla }{p}_{h}\cdot \mathbf{n}\right]}^{2}\phantom{\rule{0.2em}{0ex}}ds,\\ {\eta }_{2}^{2}=\sum _{\tau \in {\mathcal{T}}^{h}}{h}_{\tau }^{2}{\int }_{\tau }{\left(div\left(A\mathrm{\nabla }{y}_{h}\right)-\varphi \left({y}_{h}\right)+f+B{u}_{h}\right)}^{2}\phantom{\rule{0.2em}{0ex}}dx+\sum _{l\cap \partial \mathrm{\Omega }=\mathrm{\varnothing }}{h}_{l}{\int }_{l}{\left[A\mathrm{\nabla }{y}_{h}\cdot \mathbf{n}\right]}^{2}\phantom{\rule{0.2em}{0ex}}ds,\end{array}$
where ${h}_{l}$ is the size of the face $l={\overline{\tau }}_{l}^{1}\cap {\overline{\tau }}_{l}^{2}$, ${\tau }_{l}^{1}$ and ${\tau }_{l}^{2}$ are two neighboring elements in ${\mathcal{T}}^{h}$, and ${\left[A\mathrm{\nabla }{y}_{h}\cdot \mathbf{n}\right]}_{l}$ and ${\left[{A}^{\ast }\mathrm{\nabla }{p}_{h}\cdot \mathbf{n}\right]}_{l}$ are the A-normal and ${A}^{\ast }$-normal derivative jumps over the interior face l, respectively, defined by
$\begin{array}{r}{\left[A\mathrm{\nabla }{y}_{h}\cdot \mathbf{n}\right]}_{l}=\left(A\mathrm{\nabla }{y}_{h}{|}_{{\tau }_{l}^{1}}-A\mathrm{\nabla }{y}_{h}{|}_{{\tau }_{l}^{2}}\right)\cdot \mathbf{n},\\ {\left[{A}^{\ast }\mathrm{\nabla }{p}_{h}\cdot \mathbf{n}\right]}_{l}=\left({A}^{\ast }\mathrm{\nabla }{p}_{h}{|}_{{\tau }_{l}^{1}}-{A}^{\ast }\mathrm{\nabla }{p}_{h}{|}_{{\tau }_{l}^{2}}\right)\cdot \mathbf{n},\end{array}$

where n is the normal vector on $l={\overline{\tau }}_{l}^{1}\cap {\overline{\tau }}_{l}^{2}$ pointing outward from ${\tau }_{l}^{1}$. For later convenience, we define ${\left[A\mathrm{\nabla }{y}_{h}\cdot \mathbf{n}\right]}_{l}=0$ and ${\left[{A}^{\ast }\mathrm{\nabla }{p}_{h}\cdot \mathbf{n}\right]}_{l}=0$ when $l\subset \partial \mathrm{\Omega }$.

Proof Let ${e}^{p}={p}_{h}-p\left({u}_{h}\right)$ and ${e}_{I}^{p}={\pi }_{h}{e}^{p}$. It follows from the Green formula, the embedding theorem ${\parallel v\parallel }_{{L}^{4}\left(\mathrm{\Omega }\right)}\le C{\parallel v\parallel }_{{H}^{1}\left(\mathrm{\Omega }\right)}$, Lemma 4.1, (2.12) and (3.2) that
$\begin{array}{r}c{\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}\\ \phantom{\rule{1em}{0ex}}\le a\left({e}^{p},{p}_{h}-p\left({u}_{h}\right)\right)+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)\left({p}_{h}-p\left({u}_{h}\right)\right),{e}^{p}\right)\\ \phantom{\rule{1em}{0ex}}=a\left({e}^{p}-{e}_{I}^{p},{p}_{h}-p\left({u}_{h}\right)\right)+\left({\varphi }^{\prime }\left({y}_{h}\right){p}_{h}-{\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)p\left({u}_{h}\right),{e}^{p}-{e}_{I}^{p}\right)+a\left({e}_{I}^{p},{p}_{h}-p\left({u}_{h}\right)\right)\\ \phantom{\rule{2em}{0ex}}+\left({\varphi }^{\prime }\left({y}_{h}\right){p}_{h}-{\varphi }^{\prime }\left(y\left({u}_{h}\right)\right)p\left({u}_{h}\right),{e}_{I}^{p}\right)+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right){p}_{h}-{\varphi }^{\prime }\left({y}_{h}\right){p}_{h},{e}^{p}\right)\\ \phantom{\rule{1em}{0ex}}=\sum _{\tau \in {\mathcal{T}}^{h}}{\int }_{\tau }\left({g}^{\prime }\left({y}_{h}\right)+div\left({A}^{\ast }\mathrm{\nabla }{p}_{h}\right)-{\varphi }^{\prime }\left({y}_{h}\right){p}_{h}\right)\left({e}_{I}^{p}-{e}^{p}\right)\phantom{\rule{0.2em}{0ex}}dx+\left({g}^{\prime }\left({y}_{h}\right)-{g}^{\prime }\left(y\left({u}_{h}\right)\right),{e}^{p}\right)\\ \phantom{\rule{2em}{0ex}}+\sum _{\tau \in {\mathcal{T}}^{h}}{\int }_{\partial \tau }\left({A}^{\ast }\mathrm{\nabla }{p}_{h}\cdot \mathbf{n}\right)\left({e}^{p}-{e}_{I}^{p}\right)\phantom{\rule{0.2em}{0ex}}ds+\left({\varphi }^{\prime }\left(y\left({u}_{h}\right)\right){p}_{h}-{\varphi }^{\prime }\left({y}_{h}\right){p}_{h},{e}^{p}\right)\\ \phantom{\rule{1em}{0ex}}\le C{\eta }_{1}^{2}+C\left(\delta \right){\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}+\delta {\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}.\end{array}$
(4.5)
Similarly, we obtain
$\begin{array}{r}c{\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}\\ \phantom{\rule{1em}{0ex}}\le a\left({y}_{h}-y\left({u}_{h}\right),{e}^{y}\right)+\left(\varphi \left({y}_{h}\right)-\varphi \left(y\left({u}_{h}\right)\right),{e}^{y}\right)\\ \phantom{\rule{1em}{0ex}}=\left(A\mathrm{\nabla }\left({y}_{h}-y\left({u}_{h}\right)\right),\mathrm{\nabla }\left({e}^{y}-{e}_{I}^{y}\right)\right)+\left(\varphi \left({y}_{h}\right)-\varphi \left(y\left({u}_{h}\right)\right),{e}^{y}-{e}_{I}^{y}\right)\\ \phantom{\rule{1em}{0ex}}=\sum _{\tau \in {\mathcal{T}}^{h}}{\int }_{\tau }\left(div\left(A\mathrm{\nabla }{y}_{h}\right)-\varphi \left({y}_{h}\right)+f+B{u}_{h}\right)\left({e}_{I}^{y}-{e}^{y}\right)\phantom{\rule{0.2em}{0ex}}dx+\sum _{\tau \in {\mathcal{T}}^{h}}{\int }_{\partial \tau }\left(A\mathrm{\nabla }{y}_{h}\cdot \mathbf{n}\right)\left({e}^{y}-{e}_{I}^{y}\right)\phantom{\rule{0.2em}{0ex}}ds\\ \phantom{\rule{1em}{0ex}}\le C\left(\delta \right){\eta }_{2}^{2}+\delta {\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}.\end{array}$
(4.6)

From (4.5) and (4.6), we derive (4.4). □

Theorem 4.1 Let $\left(y,p,u\right)$ and $\left({y}_{h},{p}_{h},{u}_{h}\right)$ be the solutions of (2.4)-(2.6) and (2.11)-(2.13), respectively. Assume that all the conditions in Lemmas 4.2-4.3 are valid. Then there exists a constant C independent of h such that
${\parallel u-{u}_{h}\parallel }^{2}+{\parallel y-{y}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}+{\parallel p-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}\le C\left({\eta }_{1}^{2}+{\eta }_{2}^{2}\right),$
(4.7)

where ${\eta }_{1}$ and ${\eta }_{2}$ are defined in Lemma 4.3.

Proof Note that
${\parallel p-{p}_{h}\parallel }_{1,\mathrm{\Omega }}\le {\parallel p-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}+{\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }},$
(4.8)
${\parallel y-{y}_{h}\parallel }_{1,\mathrm{\Omega }}\le {\parallel y-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}+{\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }},$
(4.9)
and
${\parallel p-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}\le {\parallel p-p\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}\le C\parallel y-y\left({u}_{h}\right)\parallel ,$
(4.10)
${\parallel y-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}\le {\parallel y-y\left({u}_{h}\right)\parallel }_{2,\mathrm{\Omega }}\le C\parallel u-{u}_{h}\parallel .$
(4.11)

Then (4.7) follows from (4.2), (4.4) and (4.8)-(4.11). □
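
The way (4.8)-(4.11) combine can be spelled out as follows. This is only a sketch: it uses the inequalities displayed above together with the elementary bound ${\left(s+t\right)}^{2}\le 2{s}^{2}+2{t}^{2}$, and C denotes a generic constant. Squaring (4.9) and applying (4.11) gives
${\parallel y-{y}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}\le 2{\parallel y-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}+2{\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}\le C{\parallel u-{u}_{h}\parallel }^{2}+2{\parallel {y}_{h}-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2},$
while squaring (4.8) and applying (4.10), then $\parallel y-y\left({u}_{h}\right)\parallel \le {\parallel y-y\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}$ and (4.11), gives
${\parallel p-{p}_{h}\parallel }_{1,\mathrm{\Omega }}^{2}\le C{\parallel y-y\left({u}_{h}\right)\parallel }^{2}+2{\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}\le C{\parallel u-{u}_{h}\parallel }^{2}+2{\parallel {p}_{h}-p\left({u}_{h}\right)\parallel }_{1,\mathrm{\Omega }}^{2}.$
The remaining quantities, ${\parallel u-{u}_{h}\parallel }^{2}$ and the two discretization errors measured against $y\left({u}_{h}\right)$ and $p\left({u}_{h}\right)$, are the ones the proof handles through (4.2) and (4.4).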

## 5 Numerical experiments

For a constrained optimization problem
$\underset{u\in K\subset U}{min}J\left(u\right),$
where $J\left(u\right)$ is a convex functional on U and K is a convex subset of U, the iterative scheme reads ($n=0,1,2,\dots$):
$\left\{\begin{array}{l}b\left({u}_{n+\frac{1}{2}},v\right)=b\left({u}_{n},v\right)-{\rho }_{n}\left({J}^{\prime }\left({u}_{n}\right),v\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }v\in U,\\ {u}_{n+1}={P}_{K}^{b}\left({u}_{n+\frac{1}{2}}\right),\end{array}$
(5.1)
where $b\left(\cdot ,\cdot \right)$ is a symmetric and positive definite bilinear form, and similarly to , the projection operator ${P}_{K}^{b}:U\to K$ is defined: For given $w\in U$ find ${P}_{K}^{b}w\in K$ such that
$b\left({P}_{K}^{b}w-w,{P}_{K}^{b}w-w\right)=\underset{u\in K}{min}b\left(u-w,u-w\right).$
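
To make the scheme (5.1) concrete, the following minimal sketch runs it in a finite-dimensional setting. It rests on assumptions that are not part of the paper: $U={\mathbb{R}}^{n}$, a diagonal bilinear form $b\left(u,v\right)={\sum }_{i}{b}_{i}{u}_{i}{v}_{i}$, and a box K, in which case the projection ${P}_{K}^{b}$ reduces to componentwise clipping. The function name, the quadratic test functional and all parameter values are hypothetical.

```python
def projected_gradient(grad_J, u0, lower, upper, b_diag, rho, tol=1e-12, max_iter=1000):
    """Iterate (5.1) in R^n for b(u, v) = sum_i b_diag[i] * u[i] * v[i] and a box K."""
    def clip(x):
        return min(upper, max(lower, x))

    u = [clip(x) for x in u0]
    for n in range(max_iter):
        g = grad_J(u)
        # Half step: b(u_half, v) = b(u, v) - rho * (J'(u), v) for all v.
        u_half = [ui - rho * gi / bi for ui, gi, bi in zip(u, g, b_diag)]
        # Assumption: b is diagonal and K is a box, so the b-projection is a clip.
        u_new = [clip(x) for x in u_half]
        step = max(abs(x - y) for x, y in zip(u_new, u))
        u = u_new
        if step <= tol:
            return u, n + 1
    return u, max_iter


# Hypothetical convex quadratic J(u) = 0.5 * u^T Q u - c^T u on the box [0, 1]^2.
Q = [[2.0, 0.5], [0.5, 1.0]]
c = [4.0, -1.0]


def grad_J(u):
    return [sum(Q[i][j] * u[j] for j in range(2)) - c[i] for i in range(2)]


u_star, iterations = projected_gradient(grad_J, [0.0, 0.0], 0.0, 1.0, [1.0, 1.0], rho=0.5)
```

For this toy problem the unconstrained minimizer lies outside the box, so the iterates settle on a corner of K. A non-diagonal b would require solving a small constrained problem for the projection instead of clipping.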

The bilinear form $b\left(\cdot ,\cdot \right)$ provides suitable preconditioning for the projection gradient algorithm. Let ${U}^{h}=\left\{{v}_{h}\in {L}^{2}\left(\mathrm{\Omega }\right),a\le {v}_{h}\le b:{v}_{h}{|}_{\tau }=constant,\mathrm{\forall }\tau \in {\mathcal{T}}^{h}\right\}$. For an acceptable error $tol$ and a fixed step size ${\rho }_{n}$, applying (5.1) to the discretized nonlinear elliptic optimal control problem yields the following projection gradient algorithm (see, e.g., [11, 12]). For ease of exposition, we omit the subscript h.

Algorithm 5.1 (Projection gradient algorithm)

Step 1. Initialize ${u}_{0}$;

Step 2. Solve the following equations:
$\left\{\begin{array}{ll}b\left({u}_{n+\frac{1}{2}},v\right)=b\left({u}_{n},v\right)-{\rho }_{n}\left({h}^{\prime }\left({u}_{n}\right)+{B}^{\ast }{p}_{n},v\right),& {u}_{n+\frac{1}{2}},{u}_{n}\in {U}^{h},\mathrm{\forall }v\in {U}^{h},\\ a\left({y}_{n},w\right)+\left(\varphi \left({y}_{n}\right),w\right)=\left(f+B{u}_{n},w\right),& {y}_{n}\in {V}^{h},\mathrm{\forall }w\in {V}^{h},\\ a\left(q,{p}_{n}\right)+\left({\varphi }^{\prime }\left({y}_{n}\right){p}_{n},q\right)=\left({y}_{n}-{y}_{d},q\right),& {p}_{n}\in {V}^{h},\mathrm{\forall }q\in {V}^{h},\\ {u}_{n+1}={P}_{K}^{b}\left({u}_{n+\frac{1}{2}}\right);\end{array}$
(5.2)

Step 3. Calculate the iterative error: ${e}_{n+1}=\parallel {u}_{n+1}-{u}_{n}\parallel$;

Step 4. If ${e}_{n+1}\le tol$, stop; else, go to Step 2.
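
The loop structure of Steps 1-4 can be sketched in a simplified one-dimensional analogue. Everything specific in this sketch is an assumption made for illustration and is not the authors' AFEPack implementation: the domain is $\left(0,1\right)$, the state and adjoint equations are discretized by finite differences on a uniform grid with spacing `dx`, $\varphi \left(y\right)={y}^{3}$, $B=I$, $b\left(u,v\right)=\left(u,v\right)$, $h\left(u\right)=\frac{1}{2}{\left(u-{u}_{0}\right)}^{2}$, and the control is stored by its grid values. All function names are hypothetical.

```python
import math


def solve_shifted_laplacian(shift, rhs, dx):
    """Solve (-d^2/dx^2 + diag(shift)) x = rhs, zero Dirichlet data (Thomas algorithm)."""
    n = len(rhs)
    off = -1.0 / dx**2  # constant sub- and super-diagonal entry
    c, d = [0.0] * n, [0.0] * n
    denom = 2.0 / dx**2 + shift[0]
    c[0], d[0] = off / denom, rhs[0] / denom
    for i in range(1, n):
        denom = 2.0 / dx**2 + shift[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_state(u, f, dx, newton_steps=30):
    """Newton iteration for -y'' + y^3 = f + u with y(0) = y(1) = 0."""
    n = len(u)
    y = [0.0] * n
    for _ in range(newton_steps):
        residual = []
        for i in range(n):
            left = y[i - 1] if i > 0 else 0.0
            right = y[i + 1] if i < n - 1 else 0.0
            residual.append((2.0 * y[i] - left - right) / dx**2 + y[i] ** 3 - f[i] - u[i])
        delta = solve_shifted_laplacian([3.0 * v * v for v in y], [-r for r in residual], dx)
        y = [v + dv for v, dv in zip(y, delta)]
        if max(abs(dv) for dv in delta) < 1e-13:
            break
    return y


def projection_gradient(f, y_d, u_0, a, b, dx, rho=1.0, tol=1e-10, max_iter=200):
    """Steps 1-4; the returned y and p belong to the last control before the final update."""
    u = [min(b, max(a, v)) for v in u_0]  # Step 1: feasible initial control
    y = p = None
    for n in range(max_iter):
        y = solve_state(u, f, dx)  # state equation
        p = solve_shifted_laplacian([3.0 * v * v for v in y],
                                    [yi - di for yi, di in zip(y, y_d)], dx)  # adjoint
        u_new = [min(b, max(a, ui - rho * (ui - u0i + pi)))
                 for ui, u0i, pi in zip(u, u_0, p)]  # gradient step, then projection
        error = math.sqrt(dx * sum((x - z) ** 2 for x, z in zip(u_new, u)))  # Step 3
        u = u_new
        if error <= tol:  # Step 4
            return u, y, p, n + 1
    return u, y, p, max_iter
```

A call such as `projection_gradient([0.0] * 31, [1.0] * 31, [1.0] * 31, 0.0, 1.5, 1.0 / 32)` runs the loop on 31 interior grid points with made-up data. The step size `rho=1.0` is a choice for this toy setting only.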

According to the preceding analysis, we construct the following variational discretization algorithm.

Algorithm 5.2 (Variational discretization algorithm)

Step 1. Initialize ${u}_{0}$;

Step 2. Solve the following equations:
$\left\{\begin{array}{ll}b\left({u}_{n+\frac{1}{2}},v\right)=b\left({u}_{n},v\right)-{\rho }_{n}\left({h}^{\prime }\left({u}_{n}\right)+{B}^{\ast }{p}_{n},v\right),& {u}_{n+\frac{1}{2}},{u}_{n}\in U,\mathrm{\forall }v\in U,\\ a\left({y}_{n},w\right)+\left(\varphi \left({y}_{n}\right),w\right)=\left(f+B{u}_{n},w\right),& {y}_{n}\in {V}^{h},\mathrm{\forall }w\in {V}^{h},\\ a\left(q,{p}_{n}\right)+\left({\varphi }^{\prime }\left({y}_{n}\right){p}_{n},q\right)=\left({y}_{n}-{y}_{d},q\right),& {p}_{n}\in {V}^{h},\mathrm{\forall }q\in {V}^{h},\\ {u}_{n+1}={\mathrm{\Pi }}_{\left[a,b\right]}\left({u}_{n+\frac{1}{2}}\right).\end{array}$
(5.3)

Step 3. Calculate the iterative error: ${e}_{n+1}=\parallel {u}_{n+1}-{u}_{n}\parallel$;

Step 4. If ${e}_{n+1}\le tol$, stop; else, go to Step 2.
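
The difference from Algorithm 5.1 is that the control is never expanded in a finite element basis: after the pointwise projection ${\mathrm{\Pi }}_{\left[a,b\right]}$ it can be stored as a function of the discrete adjoint. The sketch below illustrates this in one space dimension. It assumes $B=I$, $b\left(u,v\right)=\left(u,v\right)$, $h\left(u\right)=\frac{1}{2}{\left(u-{u}_{0}\right)}^{2}$ and ${\rho }_{n}=1$, so that one step of (5.3) gives $u={\mathrm{\Pi }}_{\left[a,b\right]}\left({u}_{0}-{p}_{h}\right)$. The helper name `make_control`, the mesh and the nodal values are hypothetical.

```python
from bisect import bisect_right


def make_control(nodes, p_nodal, u_0, a, b):
    """Return x -> Pi_[a,b](u_0(x) - p_h(x)) for a continuous piecewise linear p_h."""
    def p_h(x):
        # Locate the cell [nodes[i], nodes[i + 1]] containing x and interpolate linearly.
        i = min(max(bisect_right(nodes, x) - 1, 0), len(nodes) - 2)
        t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
        return (1.0 - t) * p_nodal[i] + t * p_nodal[i + 1]

    def u(x):
        return min(b, max(a, u_0(x) - p_h(x)))

    return u


# Hypothetical data: four cells on [0, 1] and made-up nodal adjoint values.
nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
p_nodal = [0.0, 1.0, 0.0, -1.0, 0.0]
u = make_control(nodes, p_nodal, lambda x: 1.0, 0.0, 1.5)
```

With these values the upper bound becomes active where ${p}_{h}=-0.5$, that is at $x=0.625$, which lies strictly inside the cell $\left[0.5,0.75\right]$. A piecewise constant control on the same mesh cannot represent such a kink.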

It is well known that there are four major types of adaptive finite element methods, namely, the h-methods (mesh refinement), the p-methods (order enrichment), the r-methods (mesh redistribution) and the hp-methods (the combination of the h-method and the p-method). For an acceptable error $Tol$, using the a posteriori error estimators ${\eta }_{1}^{2}$ and ${\eta }_{2}^{2}$ as mesh refinement indicators together with Algorithm 5.2, we present the following adaptive variational discretization algorithm.

Algorithm 5.3 (Adaptive variational discretization algorithm)

Step 1. Solve the discretized optimization problem with Algorithm 5.2 on the current mesh to obtain the numerical solution ${u}_{n}^{\prime }$, and calculate the error estimators ${\eta }_{1}$ and ${\eta }_{2}$;

Step 2. Adjust the mesh by using the estimators ${\eta }_{1}$ and ${\eta }_{2}$, then update the numerical solution ${u}_{n}^{\prime }$ and obtain the new numerical solution ${u}_{n+1}^{\prime }$ on the new mesh;

Step 3. If $\parallel {u}_{n+1}^{\prime }-{u}_{n}^{\prime }\parallel \le Tol$, stop; else, go to Step 1.
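
Step 2 leaves open how elements are selected for refinement. One common choice, assumed here and not stated in the paper, is a bulk (Dörfler-type) criterion applied to elementwise contributions of ${\eta }_{1}^{2}+{\eta }_{2}^{2}$. The sketch below marks the smallest set of elements whose indicators sum to at least a fraction `theta` of the total. The function name, the parameter `theta` and the sample values are hypothetical.

```python
def mark_elements(eta_squared, theta=0.5):
    """Bulk marking: smallest set of elements carrying at least theta of the total."""
    # Visit elements from the largest indicator to the smallest.
    order = sorted(range(len(eta_squared)), key=lambda i: eta_squared[i], reverse=True)
    target = theta * sum(eta_squared)
    marked, accumulated = [], 0.0
    for i in order:
        if accumulated >= target:
            break
        marked.append(i)
        accumulated += eta_squared[i]
    return sorted(marked)


# Made-up elementwise indicators on a four-element mesh.
marked = mark_elements([0.1, 0.5, 0.05, 0.35], theta=0.6)
```

Larger values of `theta` refine more elements per pass and reduce the number of adaptive loops, at the cost of a less selective mesh.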

All of the following numerical examples were solved with codes based on AFEPack, which provides a general tool for finite element approximation of PDEs. The package is freely available and the details can be found in .

We consider the following optimal control problems:
$\left\{\begin{array}{l}{min}_{u\in K}\left\{{\int }_{\mathrm{\Omega }}\left(g\left(y\left(x\right)\right)+h\left(u\left(x\right)\right)\right)\phantom{\rule{0.2em}{0ex}}dx\right\},\\ -div\left(A\left(x\right)\mathrm{\nabla }y\left(x\right)\right)+\varphi \left(y\left(x\right)\right)=f\left(x\right)+Bu\left(x\right),\phantom{\rule{1em}{0ex}}x\in \mathrm{\Omega },\\ y\left(x\right)=0,\phantom{\rule{1em}{0ex}}x\in \partial \mathrm{\Omega },\end{array}$
where
$\begin{array}{rcl}g\left(y\left(x\right)\right)& =& \frac{1}{2}{\left[y\left(x\right)-{y}_{0}\left(x\right)\right]}^{2},\\ h\left(u\left(x\right)\right)& =& \frac{1}{2}{\left[u\left(x\right)-{u}_{0}\left(x\right)\right]}^{2},\end{array}$
and
$K=\left\{v\left(x\right)\in {L}^{2}\left(\mathrm{\Omega }\right):a\le v\left(x\right)\le b,x\in \mathrm{\Omega }\right\},$

the domain Ω is the unit square $\left[0,1\right]\times \left[0,1\right]$ and $B=I$.

Example 1 In the first example, we compare the convergence order of $\parallel u-{u}_{h}\parallel$ in Algorithm 5.1 with that in Algorithm 5.2. The data are as follows:
$\begin{array}{c}A\left(x\right)=E,\phantom{\rule{2em}{0ex}}\varphi \left(y\right)={y}^{3},\phantom{\rule{2em}{0ex}}a=0,\phantom{\rule{2em}{0ex}}b=1.5,\hfill \\ p\left(x\right)=4{x}_{1}{x}_{2}sin\left(2\pi {x}_{1}\right)sin\left(2\pi {x}_{2}\right),\hfill \\ y\left(x\right)=p\left(x\right),\hfill \\ {u}_{0}\left(x\right)=1+sin\left(2\pi {x}_{1}\right)sin\left(2\pi {x}_{2}\right),\hfill \\ u\left(x\right)=min\left(1.5,max\left(0,{u}_{0}\left(x\right)-p\left(x\right)\right)\right),\hfill \\ f\left(x\right)=-div\left(A\left(x\right)\mathrm{\nabla }y\left(x\right)\right)+\varphi \left(y\left(x\right)\right)-u\left(x\right),\hfill \\ {y}_{0}\left(x\right)=y\left(x\right)+div\left({A}^{\ast }\left(x\right)\mathrm{\nabla }p\left(x\right)\right)-{\varphi }^{\prime }\left(y\left(x\right)\right)p\left(x\right).\hfill \end{array}$
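
The data of Example 1 are manufactured from a prescribed pair $y=p$, so f and ${y}_{0}$ can be evaluated pointwise once the Laplacian of y is known. The sketch below does this with a Laplacian differentiated by hand, taking E to be the identity matrix, as the data suggest. The function names are hypothetical and the code is not taken from the authors' implementation.

```python
import math

TWO_PI = 2.0 * math.pi


def y_exact(x1, x2):
    """Prescribed state y(x) = 4 x1 x2 sin(2 pi x1) sin(2 pi x2); p = y in Example 1."""
    return 4.0 * x1 * x2 * math.sin(TWO_PI * x1) * math.sin(TWO_PI * x2)


def laplace_y(x1, x2):
    """Laplacian of y_exact, differentiated by hand."""
    s1, c1 = math.sin(TWO_PI * x1), math.cos(TWO_PI * x1)
    s2, c2 = math.sin(TWO_PI * x2), math.cos(TWO_PI * x2)
    return (16.0 * math.pi * (x2 * s2 * c1 + x1 * s1 * c2)
            - 32.0 * math.pi**2 * x1 * x2 * s1 * s2)


def u_exact(x1, x2):
    """u = min(1.5, max(0, u_0 - p)) with u_0 = 1 + sin(2 pi x1) sin(2 pi x2)."""
    u_0 = 1.0 + math.sin(TWO_PI * x1) * math.sin(TWO_PI * x2)
    return min(1.5, max(0.0, u_0 - y_exact(x1, x2)))


def f_rhs(x1, x2):
    """f = -div(grad y) + y^3 - u for A = E (assumed identity)."""
    return -laplace_y(x1, x2) + y_exact(x1, x2) ** 3 - u_exact(x1, x2)


def y_0(x1, x2):
    """y_0 = y + div(grad p) - phi'(y) p with p = y and phi'(y) = 3 y^2."""
    y = y_exact(x1, x2)
    return y + laplace_y(x1, x2) - 3.0 * y**3
```

At $x=\left(0.25,0.25\right)$ the upper bound 1.5 is active, while at $x=\left(0.25,0.75\right)$ the control lies strictly between the bounds, so the exact control has both active and inactive regions.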
The numerical results are listed in Table 1 and Table 2.
In Figure 1, we see clearly that in the projection gradient algorithm, $\parallel u-{u}_{h}\parallel =O\left(h\right)$, while in the variational discretization algorithm, $\parallel u-{u}_{h}\parallel =O\left({h}^{2}\right)$. In Figure 2, we show the profiles of the exact solution u alongside the solution error.

Figure 1 The convergence order of $\parallel u-{u}_{h}\parallel$.

Figure 2 The exact solution u (left) and the error ${u}_{h}-u$ (right).

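
Convergence orders such as $O\left(h\right)$ and $O\left({h}^{2}\right)$ are usually read off from errors on successive meshes through the ratio $log\left({e}_{k}/{e}_{k+1}\right)/log\left({h}_{k}/{h}_{k+1}\right)$. The following sketch computes this observed order. The error values in it are synthetic and chosen only to illustrate the formula; they are not the values of Table 1 or Table 2, and the function name is hypothetical.

```python
import math


def observed_orders(mesh_sizes, errors):
    """Observed order between consecutive levels: log(e_k / e_{k+1}) / log(h_k / h_{k+1})."""
    return [math.log(errors[k] / errors[k + 1]) / math.log(mesh_sizes[k] / mesh_sizes[k + 1])
            for k in range(len(errors) - 1)]


# Synthetic errors behaving exactly like C * h and C * h^2 (not data from the paper).
mesh_sizes = [1 / 8, 1 / 16, 1 / 32]
first_order = [0.5 * h for h in mesh_sizes]
second_order = [0.5 * h**2 for h in mesh_sizes]

orders_first = observed_orders(mesh_sizes, first_order)
orders_second = observed_orders(mesh_sizes, second_order)
```

On measured data the observed orders only approach the theoretical rate as h decreases, so several mesh levels are normally compared.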
Example 2 In order to illustrate the reliability and efficiency of the a posteriori error estimates in Theorem 4.1, we use Algorithm 5.3 to solve this example. The data are as follows:
$\begin{array}{r}\varphi \left(y\right)={y}^{3},\phantom{\rule{2em}{0ex}}a=-0.5,\phantom{\rule{2em}{0ex}}b=0,\\ A\left(x\right)=\left\{\begin{array}{ll}E,& {x}_{1}+{x}_{2}\ge 1,\\ 2\cdot E,& {x}_{1}+{x}_{2}<1,\end{array}\\ p\left(x\right)=\left\{\begin{array}{ll}2sin\left(\pi {x}_{1}\right)sin\left(\pi {x}_{2}\right),& {x}_{1}+{x}_{2}\ge 1,\\ sin\left(\pi {x}_{1}\right)sin\left(\pi {x}_{2}\right),& {x}_{1}+{x}_{2}<1,\end{array}\\ y\left(x\right)=p\left(x\right),\\ {u}_{0}\left(x\right)=0,\\ u\left(x\right)=min\left(0,max\left(-0.5,{u}_{0}\left(x\right)-p\left(x\right)\right)\right),\\ f\left(x\right)=-div\left(A\left(x\right)\mathrm{\nabla }y\left(x\right)\right)+\varphi \left(y\left(x\right)\right)-u\left(x\right),\\ {y}_{0}\left(x\right)=y\left(x\right)+div\left({A}^{\ast }\left(x\right)\mathrm{\nabla }p\left(x\right)\right)-{\varphi }^{\prime }\left(y\left(x\right)\right)p\left(x\right).\end{array}$
The numerical results based on adaptive and uniform meshes are presented in Table 3. In Figure 3, we show the profiles of the exact solution u alongside the solution error. From Table 3, it is clear that the adaptive mesh generated via the error indicators in Theorem 4.1 is able to save substantial computational work in comparison with the uniform mesh. Our numerical results confirm our theoretical results.

Figure 3 The exact solution u (left) and the error ${u}_{h}-u$ (right).

## Declarations

### Acknowledgements

This work is supported by the Scientific Research Project of Department of Education of Hunan Province (13C338).

## Authors’ Affiliations

(1)
Department of Mathematics and Computational Science, Hunan University of Science and Engineering, Yongzhou, Hunan, 425100, China

## References 