A new semismooth Newton method for solving finite-dimensional quasi-variational inequalities

Xie, Shui-Lian; Sun, Zhe; Xu, Hong-Ru

doi:10.1186/s13660-021-02671-2

Research
Open access
Published: 28 July 2021

Journal of Inequalities and Applications volume 2021, Article number: 132 (2021)

Abstract

In this paper, we consider a numerical method for solving finite-dimensional quasi-variational inequalities with both equality and inequality constraints. Firstly, we present a semismooth equation reformulation of the KKT system of a finite-dimensional quasi-variational inequality. Then we propose a semismooth Newton method to solve the equations and establish its global convergence. Finally, we report some numerical results to show the efficiency of the proposed method. Our method can obtain the solution to some problems that cannot be solved by the method proposed in (Facchinei et al. in Comput. Optim. Appl. 62:85–109, 2015). Besides, our method outperforms the interior point method proposed in (Facchinei et al. in Math. Program. 144:369–412, 2014).

1 Introduction

We consider the finite-dimensional quasi-variational inequality QVI(K, F): Find a vector $x^{*}\in K(x^{*})$ such that

$$ F\bigl(x^{*}\bigr)^{T} \bigl(y-x^{*}\bigr)\ge 0,\quad \forall y\in K\bigl(x^{*}\bigr), $$

(1.1)

where $F:R^{n}\rightarrow R^{n}$ is a point-to-point mapping and $K:R^{n}\rightrightarrows R^{n}$ is a point-to-set mapping with closed and convex images. Throughout the paper, we assume that F belongs to $C^{1}$ and that, for each $x\in R^{n}$, the feasible set $K(x)$ is given by

$$ K(x)\triangleq \bigl\{ y\in R^{n}\mid g(y,x)\le 0, h(y,x)=0\bigr\} , $$

(1.2)

where $g:R^{n}\times R^{n}\rightarrow R^{m_{1}}$ belongs to $C^{2}$ and $g_{i}(\cdot ,x)$ is convex on $R^{n}$ for each $i=1,2,\ldots ,m_{1}$ and for all $x\in R^{n}$, $h:R^{n}\times R^{n}\rightarrow R^{m_{2}}$ belongs to $C^{2}$ and $h_{j}(\cdot ,x)$ is affine on $R^{n}$ for each $j=1,2,\ldots ,m_{2}$ and for all $x\in R^{n}$. When the set $K(x)$ is independent of x, (1.1) reduces to the famous variational inequality (VI). For VI, we refer the reader to [13] and the references therein.

QVI (1.1), which was first introduced by Bensoussan and Lions [2, 3], has important applications in many fields such as generalized Nash games, mechanics, economics, statistics, transportation and biology; see for example [1, 6, 10, 12] and the references therein. One interesting topic is the development of efficient algorithms for solving QVI. Since QVI is nonsmooth and nonconvex, it is difficult to design effective methods for it, and, compared with VI, numerical methods are still scarce. In this paper, we mainly focus on numerical methods based on the KKT conditions of QVI. This area has attracted much attention and considerable progress has been made. In [12], an interior point approach was proposed to solve QVI, and its convergence was established for several classes of interesting QVIs. References [8, 9] developed a so-called LP-Newton method, which can be successfully applied to nonsmooth systems of equations with non-isolated solutions. Reference [21] developed a regularized smoothing Newton-type algorithm for QVI, which takes advantage of newly introduced smoothing functions and a non-monotone line search strategy. In [10], a semismooth Newton method for QVI was proposed; the authors obtained global convergence and local superlinear/quadratic convergence results for some important classes of quasi-variational inequality problems, and their numerical results show that the method performs well.

There are many ways to compute a numerical solution of nonlinear complementarity problems (NCP), such as linearized projected relaxation methods [13], the modulus-based matrix splitting method [24] and the penalty method [7, 23, 25]. In the past two decades, nonsmooth-equation-based methods for NCP have been studied thoroughly; see for example [5, 14–19] and the references therein. A common way to reformulate the complementarity system is to use a so-called NCP-function. A function $\phi :R^{2}\rightarrow R$ is called an NCP-function if it satisfies

$$ \phi (a,b)=0\quad \Leftrightarrow\quad a\ge 0,\quad b\ge 0,\quad ab=0. $$

For example, the famous Fischer–Burmeister (FB) function takes the form

$$ \phi (a,b)=\sqrt{a^{2}+b^{2}}-a-b. $$

By the use of an NCP-function, a nonlinear complementarity problem can be easily converted into a system of nonlinear equations. Most existing NCP-functions are nondifferentiable in the sense of the Fréchet derivative but semismooth in the sense of Mifflin [20] and Qi and Sun [22]. In [17], the authors proposed a nonsmooth equation reformulation of the NCP. Their reformulation enjoys the nice property that it is continuously differentiable everywhere except at the solution. In this paper, we present a semismooth equation reformulation of the KKT system of a quasi-variational inequality and propose a semismooth Newton method to solve the resulting equations.
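As a small illustration of the NCP-function property, the following sketch evaluates the Fischer–Burmeister function at a few points. The function name `fischer_burmeister` is chosen here for illustration and does not come from the paper.

```python
import math


def fischer_burmeister(a: float, b: float) -> float:
    """Fischer-Burmeister function: sqrt(a^2 + b^2) - a - b."""
    return math.hypot(a, b) - a - b


# Complementary pairs (a >= 0, b >= 0, a*b = 0) are zeros of the function.
print(fischer_burmeister(0.0, 3.0))
print(fischer_burmeister(2.0, 0.0))

# Pairs that violate complementarity give a nonzero value.
print(fischer_burmeister(1.0, 1.0))   # both positive: negative value
print(fischer_burmeister(-1.0, 0.0))  # a < 0: positive value
```

The kink of the function at the origin is what makes the resulting system of equations nonsmooth.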

The paper is organized as follows. In the next section, we describe a semismooth equation reformulation of the KKT system of a quasi-variational inequality, present the semismooth Newton method and establish its global convergence. In Sect. 3, we compare the proposed method with some other methods on the problems listed in [11].

In the following, we introduce some notations that will be used in this paper. For a continuously differentiable function $F:R^{n}\rightarrow R^{n}$, we write $\mathit{JF}(x)$ for the Jacobian of F at a point $x\in R^{n}$, whereas $\nabla F(x)$ denotes the transposed Jacobian of F. Given a smooth mapping $g:R^{n}\times R^{n}\rightarrow R^{m}$, $(y,x)\mapsto g(y,x)$, $\nabla _{y}g(y,x)$ denotes the transpose of the partial Jacobian of g with respect to the y-variables. If F is locally Lipschitz continuous around x, then $\partial F(x)$ denotes Clarke’s generalized Jacobian of F at x. For a vector $x\in R^{n}$ and a subset $I\subset \{1,2,\ldots ,n\}$, we write $x_{I}$ for the subvector consisting of the elements $x_{i}$, $i\in I$. For a matrix $A\in R^{n\times n}$ and two subsets $I, J\subset \{1,2,\ldots ,n\}$, the symbol $A_{IJ}$ stands for the submatrix with entries $a_{ij}$ for $i\in I$, $j\in J$. The symbol $\operatorname{diag}(a_{11},a_{22},\ldots ,a_{nn})$ stands for a diagonal matrix with diagonal elements $a_{11},a_{22},\ldots ,a_{nn}$.

2 Semismooth equation reformulation and semismooth Newton method

We first recall the following definition.

Definition 2.1

([22])

A function $F:R^{n}\rightarrow R^{n}$ is semismooth at a point $x\in R^{n}$ if it is locally Lipschitzian at x and

$$ \lim_{V\in \partial F(x+td'),d'\rightarrow d, t\downarrow 0}Vd' $$

exists for any $d\in R^{n}$, where $\partial F(x)$ is the generalized Jacobian of F at x. F is strongly semismooth at $x\in R^{n}$ if for any $d\rightarrow \mathbf{0} $ and any $V\in \partial F(x+d)$,

$$ Vd-F'(x;d)=O\bigl( \Vert d \Vert ^{2}\bigr), $$

where $F'(x;d)$ denotes the directional derivative of F at x along the direction d.
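As a simple worked example (added here for illustration, with $n=1$), consider $F(x)=|x|$ at $x=0$. For any $d\ne 0$ we have $\partial F(d)=\{\operatorname{sign}(d)\}$ and $F'(0;d)=|d|$, so that for the only element $V\in \partial F(0+d)$

$$ Vd-F'(0;d)=\operatorname{sign}(d)\,d-|d|=0=O\bigl(|d|^{2}\bigr). $$

Hence $|x|$ is strongly semismooth at the origin although it is not differentiable there.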

A point x is called a KKT point of QVI (1.1) if there exist Lagrange multipliers $\lambda \in R^{m_{1}}$ and $\nu \in R^{m_{2}}$ such that

$$ \textstyle\begin{cases} F(x)+\nabla _{y} g(x,x)\lambda +\nabla _{y} h(x,x)\nu =0, \\ h(x,x)=0, \\ \lambda \ge 0,\qquad g(x,x)\le 0,\qquad \lambda ^{T}g(x,x)=0. \end{cases} $$

(2.1)

Similar to Theorem 1 of [12], we find that $x^{*}\in K(x^{*})$ is a solution of (1.1) if there exist $\lambda ^{*}\in R^{m_{1}}$ and $\nu ^{*}\in R^{m_{2}}$ such that $(x^{*},\lambda ^{*},\nu ^{*})$ satisfies the KKT conditions (2.1). Moreover, if $x^{*}\in K(x^{*})$ is a solution of (1.1) and some suitable constraint qualification holds at $x^{*}$, then there exist $\lambda ^{*}\in R^{m_{1}}$ and $\nu ^{*}\in R^{m_{2}}$ such that $(x^{*},\lambda ^{*},\nu ^{*})$ satisfies the KKT conditions (2.1). Based on the above relationship, our aim is to develop a numerical method for solving the KKT conditions (2.1). For convenience, let

$$\begin{aligned}& L(x,\lambda ,\nu ):=F(x)+\nabla _{y} g(x,x)\lambda +\nabla _{y} h(x,x) \nu , \\& p(x):=g(x,x),\qquad q(x):=h(x,x), \end{aligned}$$

and then (2.1) can be rewritten as

$$ \textstyle\begin{cases} L(x,\lambda ,\nu )=0, \\ q(x)=0, \\ p(x)+w=0, \\ \lambda \ge 0,\qquad w\ge 0, \qquad \lambda ^{T}w=0, \end{cases} $$

(2.2)

where the $w\in R^{m_{1}}$ are slack variables.
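To make (2.2) concrete, consider the following toy problem (a hypothetical example chosen for illustration, not one of the test problems of the paper): $n=1$, $m_{1}=1$, no equality constraints, $F(x)=x-3$ and $g(y,x)=y-1-\frac{x}{2}$, so that $K(x)=\{y\mid y\le 1+\frac{x}{2}\}$. Then $\nabla _{y}g(x,x)=1$, $p(x)=\frac{x}{2}-1$, and (2.2) reads

$$ x-3+\lambda =0,\qquad \frac{x}{2}-1+w=0,\qquad \lambda \ge 0,\quad w\ge 0,\quad \lambda w=0. $$

If $\lambda =0$, then $x=3$ and $w=-\frac{1}{2}<0$, which is not allowed. Hence $w=0$, which gives $x^{*}=2$ and $\lambda ^{*}=1$. Indeed, $F(2)(y-2)=-(y-2)\ge 0$ for all $y\in K(2)=\{y\mid y\le 2\}$, so $x^{*}=2$ solves the QVI.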

It is not easy to solve (2.2) directly, since its fourth line is a complementarity system. We replace the complementarity system by means of the smoothed form of the FB function [17]:

$$ \phi (u,v,\varepsilon )=\sqrt{u^{2}+v^{2}+\varepsilon ^{2}}-(u+v). $$

It is clear that, for each $\varepsilon \ne 0$, $\phi (u,v,\varepsilon )$ is continuously differentiable. We use it to construct an almost smooth equation reformulation of the complementarity conditions in (2.2).

Let $\Phi _{FB}(\lambda ,w)=(\phi _{1}^{FB}(\lambda _{1},w_{1}),\ldots , \phi _{m_{1}}^{FB}(\lambda _{m_{1}},w_{m_{1}}))^{T}$ and $S(\lambda ,w)=(S_{1}(\lambda ,w),\ldots ,S_{m_{1}}(\lambda ,w))^{T}$, where for each $i=1,2,\ldots ,m_{1}$, the elements $\phi _{i}^{FB}(\lambda _{i},w_{i})$ and $S_{i}(\lambda ,w)$ are given by

$$ \phi _{i}^{FB}(\lambda _{i},w_{i})= \sqrt{\lambda _{i}^{2}+w_{i}^{2}}- \lambda _{i}-w_{i} $$

and

$$ S_{i}(\lambda ,w)=\phi \bigl(\lambda _{i},w_{i},\mu ^{\frac{1}{2}} \bigl\Vert \Phi _{FB}( \lambda ,w) \bigr\Vert \bigr)= \sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta (\lambda ,w)}-\lambda _{i}-w_{i}, $$

(2.3)

respectively, where $0<\mu <\frac{(\sqrt{2}+1)^{2}}{m_{1}}$ is a parameter, $\|\cdot \|$ is the Euclidean norm, and

$$ \theta (\lambda ,w)=\frac{1}{2} \bigl\Vert \Phi _{FB}( \lambda ,w) \bigr\Vert ^{2}. $$

It is obvious that, for each $i=1,2,\ldots ,m_{1}$, $S_{i}(\lambda ,w)$ is differentiable everywhere except at the degenerate point $(\lambda ,w)$ which satisfies $\theta (\lambda ,w)=0$ and $\lambda _{i}=w_{i}=0$ for some $i=1,2,\ldots ,m_{1}$. Moreover, we can obtain from Theorem 2.3 of [17] that $S(\lambda ,w)=0$ is equivalent to $\lambda \ge 0$, $w\ge 0$, $\lambda ^{T} w=0$. This means that $(x^{*},\lambda ^{*},\nu ^{*})$ is a KKT point of the QVI if and only if $(x^{*},\lambda ^{*},\nu ^{*},w^{*})$ with $w^{*}=-p(x^{*})$ is a solution of the nonsmooth system of equations

$$ H(x,\lambda ,\nu ,w)=0,\quad \mbox{with } H(x,\lambda ,\nu ,w):= \begin{pmatrix} L(x,\lambda ,\nu ) \\ q(x) \\ p(x)+w \\ S(\lambda ,w) \end{pmatrix}. $$

(2.4)

Associated with the system of $H(x,\lambda ,\nu ,w)=0$, we consider its natural merit function

$$ \Psi (z):=\frac{1}{2} \bigl\Vert H(z) \bigr\Vert ^{2}, $$

(2.5)

where we set $z:=(x,\lambda ,\nu ,w)$.
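The last block of H can be sketched in a few lines. The code below is a minimal illustration of $\Phi _{FB}$, $\theta $ and $S$ from (2.3); the function names and the sample points are assumptions made here, and the default $\mu =10^{-5}$ is only a placeholder value inside the admissible range.

```python
import math


def phi_fb(a: float, b: float) -> float:
    """Fischer-Burmeister function."""
    return math.hypot(a, b) - a - b


def theta(lam, w):
    """theta(lam, w) = 0.5 * ||Phi_FB(lam, w)||^2."""
    return 0.5 * sum(phi_fb(l, v) ** 2 for l, v in zip(lam, w))


def S(lam, w, mu=1e-5):
    """Components S_i of (2.3); every component sees the same theta."""
    t = theta(lam, w)
    return [math.sqrt(l * l + v * v + 2.0 * mu * t) - l - v
            for l, v in zip(lam, w)]


# All pairs complementary: theta = 0 and S vanishes.
print(S([0.0, 2.0], [3.0, 0.0]))

# First pair violates complementarity: theta > 0, so even the second
# component, whose own pair is (0, 0), becomes nonzero.
print(S([1.0, 0.0], [1.0, 0.0]))
```

The second call shows the coupling introduced by $\theta $: a violation in one pair perturbs every $S_{i}$, which is why $S$ is smooth away from solutions.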

By a direct calculation, we find that the gradient $\nabla \theta (\lambda ,w)$ of $\theta (\cdot ,\cdot )$ at $(\lambda ,w)$ can be expressed as follows:

$$ \nabla \theta (\lambda ,w)=\bigl(\partial \theta (\lambda ,w)/\partial \lambda _{1},\ldots ,\partial \theta (\lambda ,w)/\partial \lambda _{m_{1}}, \partial \theta (\lambda ,w)/\partial w_{1},\ldots ,\partial \theta ( \lambda ,w)/\partial w_{m_{1}}\bigr)^{T}, $$

where

$$ \partial \theta (\lambda ,w)/\partial \lambda _{i}=\phi _{i}^{FB}( \lambda _{i},w_{i})v_{\lambda _{i}}, \quad \mbox{with } v_{\lambda _{i}} \in \partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i}), $$

and

$$ \partial \theta (\lambda ,w)/\partial w_{i}=\phi _{i}^{FB}(\lambda _{i},w_{i})v_{w_{i}}, \quad \mbox{with } v_{w_{i}}\in \partial _{w_{i}} \phi _{i}^{FB}( \lambda _{i},w_{i}), $$

which means that

$$ \nabla \theta (\lambda ,w)=\bigl[ \underbrace{\operatorname{diag}(v_{\lambda _{1}},v_{\lambda _{2}}, \ldots ,v_{\lambda _{m_{1}}})}_{ V_{\lambda }} \underbrace{\operatorname{diag}(v_{w_{1}},v_{w_{2}}, \ldots ,v_{w_{m_{1}}})}_{V_{w}}\bigr]^{T} \Phi _{FB}(\lambda ,w). $$

Here, $\partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})$ denotes the partial generalized gradient of $\phi _{i}^{FB}(\cdot ,w_{i})$ at $\lambda _{i}$, and $\partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})$ denotes the partial generalized gradient of $\phi _{i}^{FB}(\lambda _{i},\cdot )$ at $w_{i}$. In particular, if $\theta (\lambda ,w)=0$, then $\nabla \theta (\lambda ,w)=0$.

If $\theta (\lambda ,w)\ne 0$, we can get by a direct calculation

$$ \begin{aligned} \nabla S_{i}(\lambda ,w)&= \biggl( \frac{\partial S_{i}}{\partial \lambda _{1}},\ldots ,\frac{\partial S_{i}}{\partial \lambda _{m_{1}}},\frac{\partial S_{i}}{\partial w_{1}},\ldots , \frac{\partial S_{i}}{\partial w_{m_{1}}} \biggr)^{T} \\ &= \biggl[ \biggl( \frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}-1 \biggr)e_{i}^{T}, \biggl( \frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}-1 \biggr)e_{i}^{T} \biggr]^{T} \\ &\quad {}+ \frac{\mu \nabla \theta (\lambda ,w)}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }} \\ &= \biggl[ \biggl( \underbrace{\frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{a_{i}( \lambda ,w)}-1 \biggr)e_{i}^{T}, \biggl( \underbrace{ \frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{b_{i}( \lambda ,w)}-1 \biggr)e_{i}^{T} \biggr]^{T} \\ &\quad {}+ \underbrace{\frac{\sqrt{2\mu \theta }}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{c_{i}( \lambda ,w)} \sqrt{\mu } \begin{bmatrix} V_{\lambda }\\ V_{w}\end{bmatrix}\frac{\Phi _{FB}(\lambda ,w)}{ \Vert \Phi _{FB}(\lambda ,w) \Vert },\end{aligned} $$

where

$$ a_{i}^{2}(\lambda ,w)+b_{i}^{2}( \lambda ,w)+c_{i}^{2}(\lambda ,w)=1. $$

Otherwise, we have $\lambda _{i}\geq 0$, $w_{i}\geq 0$ and $\lambda _{i}w_{i}=0$ for any i, which means that, if $\lambda _{i}^{2}+w_{i}^{2}\neq 0$, then

$$ \begin{aligned} \nabla S_{i}(\lambda ,w)&= \biggl( \frac{\partial S_{i}}{\partial \lambda _{1}},\ldots , \frac{\partial S_{i}}{\partial \lambda _{m_{1}}}, \frac{\partial S_{i}}{\partial w_{1}},\ldots , \frac{\partial S_{i}}{\partial w_{m_{1}}} \biggr)^{T} \\ &= \biggl[ \biggl( \frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}}}-1 \biggr)e_{i}^{T}, \biggl(\frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}}}-1 \biggr)e_{i}^{T} \biggr]^{T}, \end{aligned} $$

and, if $\lambda _{i}^{2}+w_{i}^{2}=0$, then the element in $\partial _{C}S_{i}(\lambda ,w)$ takes the form

$$ \bigl[ (a_{i}-1 )e_{i}^{T}, (b_{i}-1 )e_{i}^{T} \bigr]^{T}+c_{i} \sqrt{\mu }\bigl[\operatorname{diag}(\bar{v}_{ \lambda _{1}}, \bar{v}_{\lambda _{2}},\ldots ,\bar{v}_{\lambda _{m_{1}}}) \operatorname{diag}( \bar{v}_{w_{1}},\bar{v}_{w_{2}},\ldots ,\bar{v}_{w_{m_{1}}}) \bigr]^{T}u,$$

(2.6)

where $\bar{v}_{\lambda _{i}}\in \partial _{\lambda _{i}} \phi _{i}^{FB}( \lambda _{i},w_{i})$, $\bar{v}_{w_{i}}\in \partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})$, $\|u\|=1$, and

$$ a_{i}^{2}+b_{i}^{2}+c_{i}^{2} \le 1. $$

Therefore, the partial generalized derivatives of $S(\lambda ,w)$ can be expressed in the form

$$ U_{\lambda }=\operatorname{diag}(a_{1}-1,a_{2}-1, \ldots ,a_{m_{1}}-1)+\sqrt{\mu }\operatorname{diag}(c_{1},c_{2}, \ldots ,c_{m_{1}})EV_{\lambda }\operatorname{diag}(u) $$

(2.7)

and

$$ U_{w}=\operatorname{diag}(b_{1}-1,b_{2}-1, \ldots ,b_{m_{1}}-1)+\sqrt{\mu }\operatorname{diag}(c_{1},c_{2}, \ldots ,c_{m_{1}})EV_{w}\operatorname{diag}(u), $$

(2.8)

where $a_{i}^{2}+b_{i}^{2}+c_{i}^{2}\le 1$, $u\in R^{m_{1}}$ satisfies $\|u\|=1$, E is a matrix whose elements are one, $V_{\lambda }$ and $V_{w}$ are diagonal matrices whose diagonal elements belong to $\partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})$ and $\partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})$, respectively.
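The gradient formula for the case $\theta (\lambda ,w)\ne 0$ can be checked numerically. The sketch below compares it with central finite differences and verifies $a_{i}^{2}+b_{i}^{2}+c_{i}^{2}=1$; the helper names, the test point and the value $\mu =0.5$ are assumptions chosen so that the $c_{i}$ term is clearly visible.

```python
import math


def theta(lam, w):
    return 0.5 * sum((math.hypot(l, v) - l - v) ** 2 for l, v in zip(lam, w))


def S_i(i, lam, w, mu):
    return math.sqrt(lam[i] ** 2 + w[i] ** 2 + 2 * mu * theta(lam, w)) - lam[i] - w[i]


def abc(i, lam, w, mu):
    """Coefficients a_i, b_i, c_i of the gradient formula."""
    t = theta(lam, w)
    D = math.sqrt(lam[i] ** 2 + w[i] ** 2 + 2 * mu * t)
    return lam[i] / D, w[i] / D, math.sqrt(2 * mu * t) / D


def grad_S_i(i, lam, w, mu):
    """Analytic gradient; assumes theta != 0 and (lam_j, w_j) != (0, 0) for all j."""
    m = len(lam)
    D = math.sqrt(lam[i] ** 2 + w[i] ** 2 + 2 * mu * theta(lam, w))
    g = [0.0] * (2 * m)
    for j in range(m):
        r = math.hypot(lam[j], w[j])
        phi = r - lam[j] - w[j]
        g[j] = mu * phi * (lam[j] / r - 1.0) / D      # mu * dtheta/dlam_j / D
        g[m + j] = mu * phi * (w[j] / r - 1.0) / D    # mu * dtheta/dw_j / D
    g[i] += lam[i] / D - 1.0                          # (a_i - 1) e_i
    g[m + i] += w[i] / D - 1.0                        # (b_i - 1) e_i
    return g


def grad_fd(i, lam, w, mu, h=1e-6):
    """Central finite-difference approximation of the same gradient."""
    m, z, g = len(lam), list(lam) + list(w), []
    for k in range(2 * m):
        zp, zm = z[:], z[:]
        zp[k] += h
        zm[k] -= h
        g.append((S_i(i, zp[:m], zp[m:], mu) - S_i(i, zm[:m], zm[m:], mu)) / (2 * h))
    return g


lam, w, mu = [1.0, 0.5], [0.3, 2.0], 0.5
for i in range(2):
    a, b, c = abc(i, lam, w, mu)
    err = max(abs(x - y) for x, y in zip(grad_S_i(i, lam, w, mu), grad_fd(i, lam, w, mu)))
    print(i, a * a + b * b + c * c, err)
```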

On the basis of the above calculations, we have the following proposition.

Proposition 2.2

Let the mapping H be defined by (2.4). Then the following statements hold:

(a)
If F is continuously differentiable and g, h are twice continuously differentiable, then H is semismooth and
$$ \partial H(x,\lambda ,\nu ,w)\subseteq \left \{ \begin{pmatrix} J_{x}L(x,\lambda ,\nu )& \nabla _{y} g(x,x) & \nabla _{y} h(x,x) & 0 \\ J q(x)&0&0&0 \\ J p(x)&0&0&I \\ 0&U_{\lambda }&0& U_{w} \end{pmatrix} \right \} , $$
where $U_{\lambda }$ and $U_{w}$ are defined by (2.7) and (2.8), respectively.
(b)
If, in addition, JF, $\nabla ^{2}g_{i}$ ($i=1,\ldots ,m_{1}$) and $\nabla ^{2}h_{j}$ ($j=1,\ldots ,m_{2}$) are locally Lipschitz, then H is strongly semismooth.
(c)
Let the merit function Ψ be defined by (2.5). If F is continuously differentiable and g, h are twice continuously differentiable, then Ψ is continuously differentiable, and its gradient is given by
$$ \nabla \Psi (z)=V^{T}H(z) $$
for an arbitrary element $V\in \partial H(z)$.

Remark 2.1

Consider $\operatorname{QVI}(\tilde{K}, F)$, where

$$ \tilde{K}(x)\triangleq \bigl\{ y\in R^{n}\mid g(y,x) \le 0\bigr\} . $$

(2.9)

That is, there are no equality constraints in QVI (1.1). Similarly, we can formulate the above problem in terms of the nonsmooth system of equations

$$ \tilde{H}(x,\lambda ,w)=0,\quad \mbox{with } \tilde{H}(x,\lambda ,w):= \begin{pmatrix} \tilde{L}(x,\lambda ) \\ p(x)+w \\ S(\lambda ,w) \end{pmatrix}, $$

(2.10)

where $\tilde{L}(x,\lambda ):=F(x)+\nabla _{y} g(x,x)\lambda $. As in Proposition 2.2, if F is continuously differentiable and g is twice continuously differentiable, then H̃ is semismooth and

$$ \partial \tilde{H}(x,\lambda ,w)\subseteq \left \{ \begin{pmatrix} J_{x}\tilde{L}(x,\lambda )& \nabla _{y} g(x,x) & 0 \\ J p(x)&0&I \\ 0&U_{\lambda }& U_{w} \end{pmatrix} \right \} , $$

where $U_{\lambda }$ and $U_{w}$ are the same as in Proposition 2.2.

Now, we present the semismooth Newton method for (1.1).

Algorithm 1

(Semismooth Newton Method)

Step 0. Choose $z^{0}=(x^{0},\lambda ^{0},\nu ^{0},w^{0})\in R^{n}\times R^{m_{1}} \times R^{m_{2}}\times R^{m_{1}}$, $\rho >0$, $\beta \in (0,1)$, $\sigma \in (0,\frac{1}{2})$, $p>2$, $\varepsilon \ge 0$, and set $k:=0$.
Step 1. If $\|\nabla \Psi (z^{k})\|\le \varepsilon $, stop.
Step 2. Choose an arbitrary element $V_{k}\in \partial H(z^{k})$, and compute $d^{k}$ as a solution of the linear system of equations
$$ V_{k}d=-H\bigl(z^{k}\bigr). $$
(2.11)
If either this system is not solvable or the sufficient decrease condition
$$ \nabla \Psi \bigl(z^{k}\bigr)^{T}d^{k} \le -\rho \bigl\Vert d^{k} \bigr\Vert ^{p}$$
(2.12)
is not satisfied, then take $d^{k}:=-\nabla \Psi (z^{k})$.
Step 3. Compute a stepsize $t_{k}$ as the maximum of the numbers $\beta ^{l_{k}}$, $l_{k}=0,1,2,\ldots $ , such that the following Armijo condition holds:
$$ \Psi \bigl(z^{k}+t_{k}d^{k} \bigr)\le \Psi \bigl(z^{k}\bigr)+\sigma t_{k}\nabla \Psi \bigl(z^{k}\bigr)^{T} d^{k}.$$
(2.13)
Step 4. Set $z^{k+1}:=z^{k}+t_{k}d^{k}$, $k\leftarrow k+1$, and go to Step 1.

End.
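The steps of Algorithm 1 can be sketched on a one-dimensional toy QVI. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the problem ($F(x)=x-3$, $g(y,x)=y-1-x/2$, no equality constraints, solution $x^{*}=2$, $\lambda ^{*}=1$, $w^{*}=0$) is hypothetical, the small Gaussian elimination routine `solve` merely stands in for a linear solver, and at the degenerate point $(\lambda ,w)=(0,0)$ the element with $a_{i}=b_{i}=c_{i}=0$ is taken. All function names are chosen here for illustration.

```python
import math

MU = 1e-5  # smoothing parameter mu (assumed value within the admissible range)


def H(z):
    """Residual of system (2.10) for the toy QVI; z = (x, lam, w)."""
    x, lam, w = z
    r2 = lam * lam + w * w
    phi = math.sqrt(r2) - lam - w                 # Fischer-Burmeister value
    s = math.sqrt(r2 + MU * phi * phi) - lam - w  # S(lam, w); 2*theta = phi^2 for m1 = 1
    return [x - 3.0 + lam, 0.5 * x - 1.0 + w, s]


def jacobian(z):
    """One element of the generalized Jacobian of H at z."""
    _, lam, w = z
    r = math.hypot(lam, w)
    if r == 0.0:
        u_lam, u_w = -1.0, -1.0                   # degenerate point: a = b = c = 0
    else:
        phi = r - lam - w
        d = math.sqrt(r * r + MU * phi * phi)
        u_lam = (lam + MU * phi * (lam / r - 1.0)) / d - 1.0
        u_w = (w + MU * phi * (w / r - 1.0)) / d - 1.0
    return [[1.0, 1.0, 0.0], [0.5, 0.0, 1.0], [0.0, u_lam, u_w]]


def solve(a, b):
    """Gaussian elimination with partial pivoting; returns None if singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(m[i][c]))
        if abs(m[piv][c]) < 1e-14:
            return None
        m[c], m[piv] = m[piv], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def psi(z):
    """Merit function (2.5)."""
    return 0.5 * sum(h * h for h in H(z))


def semismooth_newton(z, rho=1e-10, beta=0.5, sigma=0.01, p=2.1, eps=1e-10, max_iter=100):
    for k in range(max_iter):
        h, v = H(z), jacobian(z)
        grad = [sum(v[i][j] * h[i] for i in range(3)) for j in range(3)]  # V^T H
        if math.sqrt(sum(g * g for g in grad)) <= eps:                   # Step 1
            return z, k
        d = solve(v, [-hi for hi in h])                                  # Step 2, (2.11)
        if d is not None:
            slope = sum(g * di for g, di in zip(grad, d))
            if slope > -rho * math.sqrt(sum(di * di for di in d)) ** p:  # (2.12) fails
                d = None
        if d is None:                                                    # gradient fallback
            d = [-g for g in grad]
            slope = -sum(g * g for g in grad)
        t, psi0 = 1.0, psi(z)
        while psi([zi + t * di for zi, di in zip(z, d)]) > psi0 + sigma * t * slope and t > 1e-12:
            t *= beta                                                    # Step 3, Armijo (2.13)
        z = [zi + t * di for zi, di in zip(z, d)]                        # Step 4
    return z, max_iter


z_star, iterations = semismooth_newton([0.0, 0.0, 0.0])
print(z_star, iterations)
```

Starting from $z^{0}=0$, the first Newton step already satisfies the two linear equations, after which the iteration only has to drive $S(\lambda ,w)$ to zero.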

Below, we establish the following global convergence theorem for Algorithm 1.

Theorem 2.3

Let $\{z^{k}\}=\{(x^{k},\lambda ^{k},\nu ^{k},w^{k})\}$ be a sequence of iterates generated by Algorithm 1. Then every accumulation point of the sequence $\{z^{k}\}$ is a stationary point of the merit function Ψ.

Proof

We prove it by contradiction. Firstly, if $d^{k}=-\nabla \Psi (z^{k})$ for all k in an infinite index set N, then, by [4, Proposition 1.16], any limit point $z^{*}$ of $\{z^{k}\}_{N}$ satisfies $\nabla \Psi (z^{*})=0$.

In the following, we suppose that the direction is always given by (2.11). Suppose that $\{z^{k}\}\rightarrow z^{*}$ and $\nabla \Psi (z^{*})\ne 0$. Then, by (2.11), we have

$$ \bigl\Vert H\bigl(z^{k}\bigr) \bigr\Vert = \bigl\Vert V_{k}d^{k} \bigr\Vert \le \Vert V_{k} \Vert \times \bigl\Vert d^{k} \bigr\Vert . $$

Note that $\|V_{k}\|$ cannot be 0, since otherwise $H(z^{k})=0$ and $z^{k}$ would be a stationary point. Hence, we have

$$ \bigl\Vert d^{k} \bigr\Vert \ge \frac{ \Vert H(z^{k}) \Vert }{ \Vert V_{k} \Vert }. $$

(2.14)

If $\{d^{k}\}_{N}\rightarrow 0$ for some subsequence N, then, since $\{\|V_{k}\|\}$ is bounded, (2.14) gives $\{H(z^{k})\}_{N}\rightarrow 0$, so that $H(z^{*})=0$, which contradicts $\nabla \Psi (z^{*})\ne 0$. Hence, there exists $m>0$ such that $\|d^{k}\|\ge m$. Since $\{\nabla \Psi (z^{k})\}$ is bounded and $p>2$, there also exists $M>0$ such that $\|d^{k}\|\le M$; otherwise (2.12) would be violated.

By (2.13), the sequence $\{\Psi (z^{k})\}$ is nonincreasing, and it is bounded from below. Hence $\{\Psi (z^{k+1})-\Psi (z^{k})\}\rightarrow 0$, which by (2.13) implies

$$ \bigl\{ \beta ^{l_{k}}\nabla \Psi \bigl(z^{k} \bigr)^{T}d^{k}\bigr\} \rightarrow 0. $$

(2.15)

Suppose that, passing to a subsequence if necessary, $\{\beta ^{l_{k}}\}\rightarrow 0$. By the stepsize rule in Step 3, the stepsize $\beta ^{l_{k}-1}$ violates (2.13), that is,

$$ \frac{\Psi (z^{k}+\beta ^{l_{k}-1}d^{k})-\Psi (z^{k})}{\beta ^{l_{k}-1}}> \sigma \nabla \Psi \bigl(z^{k} \bigr)^{T}d^{k}. $$

(2.16)

Since $m\le \|d^{k}\|\le M$, we can assume, passing to a subsequence if necessary, that $\{d^{k}\}\rightarrow \bar{d}\ne 0$. Passing to the limit in (2.16) and using the continuous differentiability of Ψ, we get

$$ \nabla \Psi \bigl(z^{*}\bigr)^{T} \bar{d}\ge \sigma \nabla \Psi \bigl(z^{*}\bigr)^{T} \bar{d}. $$

(2.17)

On the other hand, passing to the limit in (2.12), we have $\nabla \Psi (z^{*})^{T}\bar{d}\le -\rho \|\bar{d}\|^{p}<0$, which contradicts (2.17) because $\sigma <1$. Hence $\{\beta ^{l_{k}}\}$ is bounded away from 0. Then (2.15) and (2.12) imply that $\{d^{k}\}\rightarrow 0$, which contradicts $0< m\le \|d^{k}\|$. Therefore $\nabla \Psi (z^{*})=0$. This completes the proof. □

Remark 2.2

The method proposed in [10] only considers the case of inequality constraints, while our method can solve QVIs with both equality and inequality constraints. Besides, as we will see in the next section, our method can solve some problems in QVILIB [11] that cannot be solved by the method of [10].

3 Numerical experiments

In this section, we report the results obtained by Algorithm 1 on the problems listed in QVILIB. All computations were done using Matlab 2014a on a computer with 8.00 GB RAM and a 2.5 GHz CPU. We solved all 55 test problems, whose detailed description can be found in [11]. For each problem we list

the x-part of the starting point (the number reported is the value of all components of the x-part of the starting point);
the number of iterations;
the number of evaluations of Ψ;
the value of $Y(x,\lambda ,\nu )$ at the termination.

In order to perform the linear algebra involved, we used Matlab’s linear system solver mldivide. If any entry of the solution returned by mldivide is NaN or ±∞, or if the sufficient decrease condition (2.12) is not satisfied, then the antigradient direction $-\nabla \Psi (z^{k})$ is used. We take $\mu =10^{-5}$, $\beta =0.5$, $\rho =10^{-10}$, $\sigma =0.01$ and $p=2.1$. We choose $\lambda ^{0}=0$, $\nu ^{0}=0$ and $w^{0}=0$ for all problems. For (2.6), we choose $a_{i}=b_{i}=c_{i}=0$ when $(\lambda _{i},w_{i})= (0,0)$ and $\theta =0$. Our aim is mainly to verify the reliability of the method and to compare the iteration numbers with the results presented in [10]. In order to make a fair comparison with the results in [10], we use the same stopping criterion, i.e., we let

$$ Y(x,\lambda ,\nu )=\left \Vert \begin{pmatrix} L(x,\lambda ,\nu ) \\ S(\lambda ,-p(x))\end{pmatrix} \right \Vert _{\infty }, $$

and choose the termination criterion to be $Y(x^{k},\lambda ^{k},\nu ^{k})\le 10^{-4}$. The iteration is also stopped if the number of iterations exceeds 500 or the stepsize $t_{k}$ computed at Step 3 is less than 10⁻⁶.
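The stopping measure $Y$ can be illustrated on a one-dimensional toy QVI with $F(x)=x-3$ and $g(y,x)=y-1-x/2$ (a hypothetical example, not one of the QVILIB problems). The sketch below assumes no equality constraints and $m_{1}=1$; the function names are chosen here for illustration.

```python
import math

MU = 1e-5  # value of mu used in the experiments


def S(lam: float, w: float) -> float:
    """S(lam, w) from (2.3) for m1 = 1, where 2*theta = phi_FB^2."""
    phi = math.hypot(lam, w) - lam - w
    return math.sqrt(lam * lam + w * w + MU * phi * phi) - lam - w


def Y(x: float, lam: float) -> float:
    """Infinity norm of (L(x, lam), S(lam, -p(x))) for the toy problem."""
    L = (x - 3.0) + 1.0 * lam  # F(x) + grad_y g(x, x) * lam
    p = 0.5 * x - 1.0          # p(x) = g(x, x)
    return max(abs(L), abs(S(lam, -p)))


print(Y(0.0, 0.0))          # at the starting point x = 0, lam = 0
print(Y(2.0, 1.0) <= 1e-4)  # termination test at the KKT point (2, 1)
```

Note that $Y$ does not involve the slack variable: $w$ is replaced by $-p(x)$, so the measure depends on $(x,\lambda ,\nu )$ only.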

We denote Algorithm 2.2 of [10] by SSN and compare our method with it. The results are listed in Table 1. Table 1 shows that the problems that can be solved by SSN can also be solved by our method, with almost the same iteration numbers except on problems Box2B, Box3A, KunR11, KunR12, KunR21 and KunR22. Moreover, our method can solve the problems RHS1A1, RHS1B1, RHS2A1, RHS2B1 and Wal3, which cannot be solved by SSN.

Table 1 Test results for Algorithm 1 and SSN

We also compare our method with the interior point method (denoted by IP) proposed in [12] in terms of iteration numbers and CPU time. For IP, we use the same parameters as in [12]; the results are listed in Table 2. Table 2 shows that our method is much more efficient than IP on most problems.

Table 2 Test results for Algorithm 1 and IP

We also consider the other problems in QVILIB that are not tested in Tables 1 and 2, including the QVIs with equality constraints, that is, Problems LunSS1 to Scrim12 in Table 3. As the table shows, Algorithm 1 solves over half of those problems.

Table 3 Test results for rest QVIs in QVILIB

For the problems that cannot be solved, we tried a modification of the algorithm. Specifically, when calculating the Jacobian of $\tilde{H}(\tilde{z})$, we use $\mathit{JF}(x)$ to approximate $J_{x}\tilde{L}$. At present we cannot prove the convergence of the modified algorithm. However, the modified algorithm does find a solution for some problems, such as MoveSet3A1, MoveSet3A2, MoveSet3B1 and MoveSet3B2. The results are presented in Table 4.

Table 4 Test results for modified algorithm

Figures 1–4 display the performance of our method on the problems BiLin1A, Movset4A1, OutZ40 and Wal2. The vertical axis in those figures represents the value of Y and the horizontal axis represents the iteration number. As the figures show, the value of Y decreases as the iteration number increases.

Concluding remarks

In this paper, we have studied the numerical solution of QVI. We obtain the KKT system of a QVI and present a semismooth Newton method to solve the equations. We also establish its global convergence. Numerical results show that the performance of the proposed algorithm is promising.

Availability of data and materials

Not applicable.

References

1. Baiocchi, C., Capelo, A.: Variational and Quasivariational Inequalities: Applications to Free Boundary Problems. Wiley, New York (1984)
2. Bensoussan, A., Lions, J.L.: Nouvelle formulation de problèmes de contrôle impulsionnel et applications. C. R. Acad. Sci. Paris, Sér. A 276, 1189–1192 (1973)
3. Bensoussan, A., Lions, J.L.: Nouvelles méthodes en contrôle impulsionnel. Appl. Math. Optim. 1, 289–312 (1975)
4. Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
5. Chen, B., Chen, X., Kanzow, C.: A penalized Fischer–Burmeister NCP-function. Math. Program., Ser. A 88, 211–216 (2000)
6. Dreves, A., Facchinei, F., Fischer, A., Herrich, M.: A new error bound result for generalized Nash equilibrium problems and its algorithmic application. Comput. Optim. Appl. 59, 63–84 (2014)
7. Duan, Y., Wang, S., Zhou, Y.: A power penalty approach to a mixed quasilinear elliptic complementarity problem. J. Glob. Optim. (2021). https://doi.org/10.1007/s10898-021-01000-7
8. Facchinei, F., Fischer, A., Herrich, M.: A family of Newton methods for nonsmooth constrained systems with nonisolated solutions. Math. Methods Oper. Res. 77, 433–443 (2013)
9. Facchinei, F., Fischer, A., Herrich, M.: An LP-Newton method: nonsmooth equations, KKT systems, and nonisolated solutions. Math. Program. 146, 1–36 (2014)
10. Facchinei, F., Kanzow, C., Karl, S., Sagratella, S.: The semismooth Newton method for the solution of quasi-variational inequalities. Comput. Optim. Appl. 62, 85–109 (2015)
11. Facchinei, F., Kanzow, C., Sagratella, S.: QVILIB: a library of quasi-variational inequality test problems. Pac. J. Optim. 9, 225–250 (2013)
12. Facchinei, F., Kanzow, C., Sagratella, S.: Solving quasi-variational inequalities via their KKT conditions. Math. Program. 144, 369–412 (2014)
13. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer, New York (2003)
14. Jiang, H., Qi, L.: A new nonsmooth equations approach to nonlinear complementarity problem. SIAM J. Control Optim. 35, 178–193 (1997)
15. Kanzow, C., Kleinmichel, H.: A new class of semismooth Newton-type methods for nonlinear complementarity problems. Comput. Optim. Appl. 11, 227–251 (1998)
16. Li, D.H., Fukushima, M.: Globally convergent Broyden-like methods for semismooth equations and applications to VIP, NCP and MCP. Ann. Oper. Res. 103, 71–97 (2001)
17. Li, D.H., Li, Q., Xu, H.: An almost smooth equation reformulation to the nonlinear complementarity problem and Newton’s method. Optim. Methods Softw. 27, 1–13 (2011)
18. De Luca, T., Facchinei, F., Kanzow, C.: A semismooth equation approach to the solution of nonlinear complementarity problems. Math. Program. 75, 407–439 (1996)
19. Mangasarian, O.L., Solodov, M.V.: Nonlinear complementarity as unconstrained and constrained minimization. Math. Program. 62, 277–297 (1993)
20. Mifflin, R.: Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Optim. 15, 957–972 (1977)
21. Ni, T., Zhai, J.: A regularized smoothing Newton-type algorithm for quasi-variational inequalities. Comput. Math. Appl. 68, 1312–1324 (2014)
22. Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58, 353–367 (1993)
23. Wang, S.: An interior penalty method for a large-scale finite-dimensional nonlinear double obstacle problem. Appl. Math. Model. 58, 217–228 (2018)
24. Xie, S.L., Xu, H.R., Zeng, J.P.: Two-step modulus-based matrix splitting iteration method for a class of nonlinear complementarity problems. Linear Algebra Appl. 494, 1–10 (2016)
25. Zhao, J.X., Wang, S.: A power penalty approach to a discretized obstacle problem with nonlinear constraints. Optim. Lett. 13, 1483–1504 (2019)

Acknowledgements

Not applicable.

Funding

The work was supported by the Natural Science Foundation of China (grant Nos. 11601188, 11761037), by the Educational Commission of Guangdong Province, China (grant No. 2019KTSCX172), and by the Natural Science Foundation of Jiangxi Province (grant No. 20181BAB201009).

Author information

Authors and Affiliations

School of Mathematics, Jiaying University, Meizhou, 514015, China
Shui-Lian Xie & Hong-Ru Xu
College of Mathematics and Information Science, Jiangxi Normal University, Nanchang, China
Zhe Sun

Contributions

All authors jointly worked on the results. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhe Sun.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Xie, SL., Sun, Z. & Xu, HR. A new semismooth Newton method for solving finite-dimensional quasi-variational inequalities. J Inequal Appl 2021, 132 (2021). https://doi.org/10.1186/s13660-021-02671-2

Received: 22 September 2020
Accepted: 15 July 2021
Published: 28 July 2021
DOI: https://doi.org/10.1186/s13660-021-02671-2
