# Modified Newton-type methods for the NCP by using a class of one-parametric NCP-functions

## Abstract

In this paper, we propose a new Newton-type method for solving the nonlinear complementarity problem (NCP) based on a class of one-parametric NCP-functions, in which an approximate Newton direction is obtained by solving a modified Newton equation at each iteration. The method is shown to be globally convergent without any additional assumption. To investigate the fast convergence of this class of methods, we propose a modified version of the method and show that it is globally and locally superlinearly convergent. Preliminary numerical results show the effectiveness of the modified method.

## 1 Introduction

Consider the nonlinear complementarity problem $NCP\left(F\right)$

$x\ge 0,F\left(x\right)\ge 0,\phantom{\rule{1em}{0ex}}{x}^{T}F\left(x\right)=0,$

where $F:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n}$ is a continuously differentiable function. We assume that F is a ${P}_{0}$-function throughout this paper. It is well known that $NCP\left(F\right)$ can be reformulated as a system of nonsmooth equations, where the so-called NCP-function plays an important role in this class of methods.

Definition 1 A function $\varphi :{\mathcal{R}}^{2}\to \mathcal{R}$ is called an NCP-function if it satisfies

$\varphi \left(a,b\right)=0\phantom{\rule{1em}{0ex}}⟺\phantom{\rule{1em}{0ex}}a\ge 0,\phantom{\rule{2em}{0ex}}b\ge 0,\phantom{\rule{2em}{0ex}}ab=0.$

Over the past two decades, a variety of NCP-functions have been studied. Among them, a popular choice is the well-known Fischer-Burmeister NCP-function, defined as

${\varphi }_{\mathrm{FB}}\left(a,b\right)=\sqrt{{a}^{2}+{b}^{2}}-a-b.$

In this paper, we use a family of NCP-functions based on the Fischer-Burmeister function, introduced by Kanzow and Kleinmichel:

${\varphi }_{\lambda }\left(a,b\right)=\sqrt{{\left(a-b\right)}^{2}+\lambda ab}-a-b,$
(1)

where λ is a fixed parameter such that $\lambda \in \left(0,4\right)$. In the case of $\lambda =2$, the NCP-function ${\varphi }_{\lambda }$ obviously reduces to the Fischer-Burmeister function.
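The reduction at $\lambda=2$ is immediate from the identity $(a-b)^{2}+2ab=a^{2}+b^{2}$:

```latex
\varphi_{2}(a,b)=\sqrt{(a-b)^{2}+2ab}-a-b
               =\sqrt{a^{2}+b^{2}}-a-b
               =\varphi_{\mathrm{FB}}(a,b).
```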

By using ${\varphi }_{\lambda }$ defined by (1), the NCP is equivalent to a system of nonsmooth equations

${\mathrm{\Phi }}_{\lambda }\left(x\right)=\left[\begin{array}{c}{\varphi }_{\lambda }\left({x}_{1},{F}_{1}\left(x\right)\right)\\ ⋮\\ {\varphi }_{\lambda }\left({x}_{n},{F}_{n}\left(x\right)\right)\end{array}\right]=0.$

Let ${\theta }_{\lambda }\left(x\right)=\frac{1}{2}{\parallel {\mathrm{\Phi }}_{\lambda }\left(x\right)\parallel }^{2}$. Then solving $NCP\left(F\right)$ is equivalent to solving the unconstrained minimization ${min}_{x\in {R}^{n}}{\theta }_{\lambda }\left(x\right)$ with the optimal value 0.
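As a concrete illustration of this reformulation (a sketch of ours, not from the paper; the toy mapping F and all function names below are our own), $\varphi_\lambda$, ${\mathrm{\Phi }}_{\lambda }$ and ${\theta }_{\lambda }$ can be evaluated numerically:

```python
import numpy as np

def phi_lam(a, b, lam):
    # one-parametric NCP-function (1); for lam in (0,4) the radicand
    # (a-b)^2 + lam*a*b = a^2 + b^2 + (lam-2)*a*b is always nonnegative
    return np.sqrt((a - b) ** 2 + lam * a * b) - a - b

def Phi_lam(x, F, lam):
    # componentwise reformulation Phi_lambda(x) = 0
    return phi_lam(x, F(x), lam)

def theta_lam(x, F, lam):
    # merit function theta_lambda(x) = 0.5 * ||Phi_lambda(x)||^2
    r = Phi_lam(x, F, lam)
    return 0.5 * float(r @ r)

# toy example: F(x) = x - 1 componentwise, whose NCP solution is x* = (1, ..., 1)
F = lambda x: x - 1.0
print(theta_lam(np.ones(3), F, lam=2.0))   # vanishes at a solution
```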

Kanzow and Kleinmichel studied the properties of ${\mathrm{\Phi }}_{\lambda }$ and ${\theta }_{\lambda }$ and proposed a corresponding semismooth Newton method. Their method first attempts the Newton direction; if the Newton equation is unsolvable, or the Newton direction is not a direction of sufficient decrease for ${\theta }_{\lambda }$, it switches to the steepest descent direction. In this paper, we propose a Newton-type method for the ${P}_{0}$-$NCP\left(F\right)$ in which, at each iteration, we construct a nonsingular approximation of $\partial {\mathrm{\Phi }}_{\lambda }\left(x\right)$ (the Clarke subdifferential of ${\mathrm{\Phi }}_{\lambda }$ at x, defined in the next section); the direction-finding problem then reduces to solving a single system of perturbed Newton equations. We show that the proposed method is globally convergent without any additional assumption. The method is similar to the one discussed by Yamashita and Fukushima, where the NCP-function ${\varphi }_{\mathrm{FB}}$ was used; since ${\varphi }_{\mathrm{FB}}$ is a special case of ${\varphi }_{\lambda }$, the proposed method applies more widely. However, it is difficult to establish fast local convergence for this method. To investigate the fast local convergence of this class of methods, we revise the proposed method and show that the modified method is globally and locally superlinearly convergent. Preliminary numerical results show the effectiveness of the modified method.

## 2 Preliminaries

In this section, we recall some basic concepts and known results.

Definition 2 $F:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n}$ is called a ${P}_{0}$-function if

$\max_{\substack{1\le i\le n\\ {x}_{i}\ne {y}_{i}}}\left({x}_{i}-{y}_{i}\right)\left({F}_{i}\left(x\right)-{F}_{i}\left(y\right)\right)\ge 0,\phantom{\rule{1em}{0ex}}\mathrm{\forall }x,y\in {\mathcal{R}}^{n},x\ne y.$

Definition 3 A matrix $M\in {\mathcal{R}}^{n×n}$ is a ${P}_{0}$-matrix if each of its principal minors is nonnegative.

It is known that the Jacobian of every continuously differentiable ${P}_{0}$-function is a ${P}_{0}$-matrix. The following theorem plays an important role in our analysis. Notice that, for a vector a, ${D}_{a}$ denotes the diagonal matrix whose ith diagonal element is ${a}_{i}$.

Theorem 4 (see )

Let M be a ${P}_{0}$-matrix and let ${D}_{a}$ and ${D}_{b}$ be negative definite diagonal matrices. Then ${D}_{a}+{D}_{b}M$ is nonsingular.
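A quick numerical illustration of Theorem 4 (the toy data below are our own, not from the paper): the matrix M is a singular ${P}_{0}$-matrix, yet ${D}_{a}+{D}_{b}M$ is nonsingular for negative definite diagonal ${D}_{a}$, ${D}_{b}$:

```python
import numpy as np

# M is P0: its principal minors are 0, 1 and det(M) = 0, all nonnegative,
# and M itself is singular.
M = np.array([[0.0, 1.0],
              [0.0, 1.0]])
Da = np.diag([-1.0, -2.0])     # negative definite diagonal
Db = np.diag([-3.0, -0.5])     # negative definite diagonal
G = Da + Db @ M
print(np.linalg.det(G))        # nonzero, so G is nonsingular
```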

Let $\mathrm{\Phi }:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n}$ be locally Lipschitz continuous; by Rademacher’s theorem, Φ is differentiable almost everywhere.

Definition 5 Let ${D}_{\mathrm{\Phi }}$ denote the set of points at which Φ is differentiable; then the B-subdifferential of Φ at x is defined as

${\partial }_{B}\mathrm{\Phi }\left(x\right)=\left\{V\in {\mathcal{R}}^{n×n}\mid V=\lim_{{x}^{k}\to x,\ {x}^{k}\in {D}_{\mathrm{\Phi }}}{\mathrm{\Phi }}^{\prime }\left({x}^{k}\right)\right\}.$

The Clarke subdifferential of Φ at x is defined as

$\partial \mathrm{\Phi }\left(x\right)=\mathit{co}\phantom{\rule{0.2em}{0ex}}{\partial }_{B}\mathrm{\Phi }\left(x\right),$

where co denotes the convex hull of a set.

By the definition of ${\mathrm{\Phi }}_{\lambda }$, we know that ${\mathrm{\Phi }}_{\lambda }$ is not differentiable at x if ${x}_{i}=0={F}_{i}\left(x\right)$ for some i. However, since ${\mathrm{\Phi }}_{\lambda }$ is locally Lipschitz continuous [, Lemma 2.1], ${\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)$ is nonempty at every $x\in {\mathcal{R}}^{n}$. But how can we specify the set ${\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)$ exactly at points x where $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }\left(x\right)$ does not exist?

To solve this problem, we construct two mappings $\stackrel{˜}{\mathcal{H}}$ and $\stackrel{ˆ}{\mathcal{H}}$ which approximate ${\partial }_{B}{\mathrm{\Phi }}_{\lambda }$. For a set X, we denote the power set of X by $P\left(X\right)$.

Define the mapping $\stackrel{˜}{\mathcal{H}}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n×n}\right)$ as

$\stackrel{˜}{\mathcal{H}}\left(x\right)=\left\{\stackrel{˜}{H}\in {R}^{n×n}|\stackrel{˜}{H}={D}_{\stackrel{˜}{a}}+{D}_{\stackrel{˜}{b}}{F}^{\prime }\left(x\right),\left(\stackrel{˜}{a},\stackrel{˜}{b}\right)\in \stackrel{˜}{\mathrm{\Omega }}\left(x\right)\right\},$

where $\stackrel{˜}{\mathrm{\Omega }}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{2n}\right)$ is given by

$\stackrel{˜}{\mathrm{\Omega }}\left(x\right)=\left\{\left(\stackrel{˜}{a},\stackrel{˜}{b}\right)\in {R}^{2n}|\left({\stackrel{˜}{a}}_{i},{\stackrel{˜}{b}}_{i}\right)\in {\stackrel{˜}{\mathrm{\Omega }}}_{i}\left(x\right),i=1,2,\dots ,n\right\}$

with

(2)

Here, ${C}_{\lambda }$ denotes the constant $2-\frac{\lambda \left(4-\lambda \right)}{8}$, and

$\begin{array}{c}{\stackrel{ˆ}{a}}_{i}=\frac{2\left({x}_{i}-{F}_{i}\left(x\right)\right)+\lambda {F}_{i}\left(x\right)}{2\sqrt{{\left({x}_{i}-{F}_{i}\left(x\right)\right)}^{2}+\lambda {x}_{i}{F}_{i}\left(x\right)}}-1,\hfill \\ {\stackrel{ˆ}{b}}_{i}=\frac{-2\left({x}_{i}-{F}_{i}\left(x\right)\right)+\lambda {x}_{i}}{2\sqrt{{\left({x}_{i}-{F}_{i}\left(x\right)\right)}^{2}+\lambda {x}_{i}{F}_{i}\left(x\right)}}-1.\hfill \end{array}$
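A small sanity check of these formulas (a sketch of ours): at $\lambda =2$ the pair $\left({\stackrel{ˆ}{a}}_{i},{\stackrel{ˆ}{b}}_{i}\right)$ must reduce to the partial derivatives of the Fischer-Burmeister reformulation, ${x}_{i}/\sqrt{{x}_{i}^{2}+{F}_{i}^{2}}-1$ and ${F}_{i}/\sqrt{{x}_{i}^{2}+{F}_{i}^{2}}-1$:

```python
import numpy as np

def hat_ab(xi, Fi, lam):
    # the pair (a_i, b_i) above, defined wherever (x_i, F_i(x)) != (0, 0)
    root = np.sqrt((xi - Fi) ** 2 + lam * xi * Fi)
    a = (2.0 * (xi - Fi) + lam * Fi) / (2.0 * root) - 1.0
    b = (-2.0 * (xi - Fi) + lam * xi) / (2.0 * root) - 1.0
    return a, b

xi, Fi = 3.0, 4.0
a, b = hat_ab(xi, Fi, lam=2.0)
r = np.hypot(xi, Fi)                            # sqrt(x_i^2 + F_i^2) = 5
print(a - (xi / r - 1.0), b - (Fi / r - 1.0))   # both differences vanish
```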

In the following, we define $\stackrel{ˆ}{\mathcal{H}}$ similarly to $\stackrel{˜}{\mathcal{H}}$; for each x, $\stackrel{ˆ}{\mathcal{H}}\left(x\right)$ is a subset of $\stackrel{˜}{\mathcal{H}}\left(x\right)$.

The mapping $\stackrel{ˆ}{\mathcal{H}}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n×n}\right)$ is defined by

$\stackrel{ˆ}{\mathcal{H}}\left(x\right)=\left\{\stackrel{ˆ}{H}\in {R}^{n×n}|\stackrel{ˆ}{H}={D}_{\stackrel{ˆ}{a}}+{D}_{\stackrel{ˆ}{b}}{F}^{\prime }\left(x\right),\left(\stackrel{ˆ}{a},\stackrel{ˆ}{b}\right)\in \stackrel{ˆ}{\mathrm{\Omega }}\left(x\right)\right\},$

where $\stackrel{ˆ}{\mathrm{\Omega }}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{2n}\right)$ is defined by

$\stackrel{ˆ}{\mathrm{\Omega }}\left(x\right)=\left\{\left(\stackrel{ˆ}{a},\stackrel{ˆ}{b}\right)\in {R}^{2n}|\left(\stackrel{ˆ}{a},\stackrel{ˆ}{b}\right)=\left(g\left(x,z\right),h\left(x,z\right)\right),z\in Z\left(x\right)\right\}.$

Here, $Z\left(x\right)=\left\{z\in {\mathcal{R}}^{n}\mid {z}_{i}\ne 0\ \text{for all}\ i\in \beta \right\}$, and β denotes the set $\left\{i|{x}_{i}=0={F}_{i}\left(x\right)\right\}$. The components of the vector $g\left(x,z\right)$ are given by

${g}_{i}\left(x,z\right)=\begin{cases}{\stackrel{ˆ}{a}}_{i}, & i\notin \beta ,\\ \dfrac{2\left({z}_{i}-\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z\right)+\lambda \mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z}{2\sqrt{{\left({z}_{i}-\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z\right)}^{2}+\lambda {z}_{i}\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z}}-1, & i\in \beta ,\end{cases}$

and the components of the vector $h\left(x,z\right)$ are given by

${h}_{i}\left(x,z\right)=\begin{cases}{\stackrel{ˆ}{b}}_{i}, & i\notin \beta ,\\ \dfrac{-2\left({z}_{i}-\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z\right)+\lambda {z}_{i}}{2\sqrt{{\left({z}_{i}-\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z\right)}^{2}+\lambda {z}_{i}\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z}}-1, & i\in \beta .\end{cases}$

(The limiting values at $i\in \beta $ are recovered from the proof of Theorem 6.)

Remark From (2), we find that, for every $x\in {\mathcal{R}}^{n}$, each $\left(\stackrel{˜}{a},\stackrel{˜}{b}\right)\in \stackrel{˜}{\mathrm{\Omega }}\left(x\right)$ satisfies $-\sqrt{{C}_{\lambda }}-1\le {\stackrel{˜}{a}}_{i},{\stackrel{˜}{b}}_{i}\le 0$ (see [, Proposition 2.6]), and ${\stackrel{˜}{a}}_{i}$, ${\stackrel{˜}{b}}_{i}$ do not vanish simultaneously. The same holds for the pairs defining the elements of $\stackrel{ˆ}{\mathcal{H}}\left(x\right)$.

The mappings $\stackrel{˜}{\mathcal{H}}$ and $\stackrel{ˆ}{\mathcal{H}}$ have the following property which will play an important role in our analysis.

Theorem 6 For an arbitrary $x\in {\mathcal{R}}^{n}$, we have $\stackrel{ˆ}{\mathcal{H}}\left(x\right)\subseteq {\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)\subseteq \stackrel{˜}{\mathcal{H}}\left(x\right)$.

Proof ${\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)\subseteq \stackrel{˜}{\mathcal{H}}\left(x\right)$ was shown in [, Proposition 2.5]. Hence, we prove $\stackrel{ˆ}{\mathcal{H}}\left(x\right)\subseteq {\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)$ in the following.

For an arbitrary $\stackrel{ˆ}{H}\in \stackrel{ˆ}{\mathcal{H}}\left(x\right)$, we build a sequence of points $\left\{{y}^{k}\right\}$ at which ${\mathrm{\Phi }}_{\lambda }$ is differentiable and such that $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }{\left({y}^{k}\right)}^{T}$ tends to $\stackrel{ˆ}{H}$; the claim then follows from the definition of the B-subdifferential.

Let ${y}^{k}=x+{\epsilon }^{k}z$, where $z\in Z\left(x\right)$ and $\left\{{\epsilon }^{k}\right\}$ is a sequence of positive numbers converging to 0. Note that if $i\notin \beta $, then either ${x}_{i}\ne 0$ or ${F}_{i}\left(x\right)\ne 0$, while ${z}_{i}\ne 0$ for every $i\in \beta $.

We can see, by continuity, that if ${\epsilon }^{k}$ is small enough, then for each i either ${y}_{i}^{k}\ne 0$ or ${F}_{i}\left({y}^{k}\right)\ne 0$, so ${\mathrm{\Phi }}_{\lambda }$ is differentiable at ${y}^{k}$. If $i\notin \beta $, then, by continuity, the ith row of $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }{\left({y}^{k}\right)}^{T}$ tends to the ith row of $\stackrel{ˆ}{H}$. So we only need to consider the case $i\in \beta $.

From [, Proposition 2.5], we know that the ith row of $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }{\left({y}^{k}\right)}^{T}$ is

$\left({a}_{i}\left({y}^{k}\right)-1\right){e}_{i}^{T}+\left({b}_{i}\left({y}^{k}\right)-1\right)\mathrm{\nabla }{F}_{i}{\left({y}^{k}\right)}^{T},$
(3)

where

$\begin{array}{c}{a}_{i}\left({y}^{k}\right)=\frac{2\left({\epsilon }^{k}{z}_{i}-{F}_{i}\left({y}^{k}\right)\right)+\lambda {F}_{i}\left({y}^{k}\right)}{2\sqrt{{\left({\epsilon }^{k}{z}_{i}-{F}_{i}\left({y}^{k}\right)\right)}^{2}+\lambda {\epsilon }^{k}{z}_{i}{F}_{i}\left({y}^{k}\right)}},\hfill \\ {b}_{i}\left({y}^{k}\right)=\frac{-2\left({\epsilon }^{k}{z}_{i}-{F}_{i}\left({y}^{k}\right)\right)+\lambda {\epsilon }^{k}{z}_{i}}{2\sqrt{{\left({\epsilon }^{k}{z}_{i}-{F}_{i}\left({y}^{k}\right)\right)}^{2}+\lambda {\epsilon }^{k}{z}_{i}{F}_{i}\left({y}^{k}\right)}}.\hfill \end{array}$

By a Taylor expansion, we have, for each $i\in \beta $ (using ${x}_{i}={F}_{i}\left(x\right)=0$),

${F}_{i}\left({y}^{k}\right)={\epsilon }^{k}\mathrm{\nabla }{F}_{i}{\left(x\right)}^{T}z+o\left({\epsilon }^{k}\right).$
(4)

Substituting (4) into (3) and passing to the limit, we see, by the continuous differentiability of F, that the rows of $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }{\left({y}^{k}\right)}^{T}$ with $i\in \beta $ tend to the corresponding rows of $\stackrel{ˆ}{H}$. Hence, $\mathrm{\nabla }{\mathrm{\Phi }}_{\lambda }{\left({y}^{k}\right)}^{T}$ tends to $\stackrel{ˆ}{H}$. □

In this paper, we present two algorithms. The first, presented in Section 3, uses matrices obtained by perturbing $\stackrel{˜}{H}\in \stackrel{˜}{\mathcal{H}}$, and we establish its global convergence. In Section 4, we present a second algorithm based on $\stackrel{ˆ}{H}\in \stackrel{ˆ}{\mathcal{H}}$, a restricted version of the first, which can be superlinearly convergent.

## 3 Algorithm and global convergence

In a Newton-type method, the direction-finding problem is solved via $\stackrel{˜}{H}d=-{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)$ with $\stackrel{˜}{H}\in \stackrel{˜}{\mathcal{H}}\left({x}^{k}\right)$. However, $\stackrel{˜}{H}$ is not necessarily nonsingular. In this section, we perturb $\stackrel{˜}{H}$ to a nonsingular matrix $\stackrel{˜}{G}$, so that a search direction can be obtained by solving $\stackrel{˜}{G}d=-{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)$. We construct $\stackrel{˜}{G}$ as follows.

First, the mappings ${\mathrm{\Lambda }}_{i}:{\mathcal{R}}^{n+2}\to P\left({\mathcal{R}}^{2}\right)$, $i=1,2,\dots ,n$, are defined by

where $\epsilon \in \left(0,1-\sqrt{{C}_{\lambda }/2}\right)$, and $\sigma :{\mathcal{R}}^{+}\to {\mathcal{R}}^{+}$ is a nondecreasing continuous function such that $\sigma \left(0\right)=0$ and $\sigma \left(t\right)>0$ for all $t>0$.

Because $\epsilon \in \left(0,1-\sqrt{{C}_{\lambda }/2}\right)$, it follows that, for $\left(\stackrel{˜}{a},\stackrel{˜}{b}\right)\in \stackrel{˜}{\mathrm{\Omega }}\left(x\right)$, the inequalities $-\epsilon <{\stackrel{˜}{a}}_{i}$ and $-\epsilon <{\stackrel{˜}{b}}_{i}$ cannot hold simultaneously.

In the following, we construct $\stackrel{˜}{G}$ as

$\stackrel{˜}{G}={D}_{\stackrel{˜}{p}}+{D}_{\stackrel{˜}{q}}{F}^{\prime }\left(x\right),$

where $\stackrel{˜}{p}$ and $\stackrel{˜}{q}$ are vectors such that

$\left({\stackrel{˜}{p}}_{i},{\stackrel{˜}{q}}_{i}\right)=\left({\stackrel{˜}{a}}_{i}+{\overline{a}}_{i},{\stackrel{˜}{b}}_{i}+{\overline{b}}_{i}\right),\phantom{\rule{1em}{0ex}}i=1,2,\dots ,n,$
(5)

with $\left(\stackrel{˜}{a},\stackrel{˜}{b}\right)\in \stackrel{˜}{\mathrm{\Omega }}\left(x\right)$, and $\left({\overline{a}}_{i},{\overline{b}}_{i}\right)\in {\mathrm{\Lambda }}_{i}\left(x,{\stackrel{˜}{a}}_{i},{\stackrel{˜}{b}}_{i}\right)$, $i=1,2,\dots ,n$.

If ${\theta }_{\lambda }\left(x\right)>0$, the definition of ${\mathrm{\Lambda }}_{i}$ and (5) imply that both ${D}_{\stackrel{˜}{p}}$ and ${D}_{\stackrel{˜}{q}}$ are negative definite matrices. Furthermore, we define $\stackrel{˜}{\mathcal{G}}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n×n}\right)$ as follows:

$\stackrel{˜}{\mathcal{G}}\left(x\right)=\left\{\stackrel{˜}{G}\in {\mathcal{R}}^{n×n}\mid \stackrel{˜}{G}={D}_{\stackrel{˜}{p}}+{D}_{\stackrel{˜}{q}}{F}^{\prime }\left(x\right),\ \left(\stackrel{˜}{p},\stackrel{˜}{q}\right)\ \text{given by (5)}\right\}.$

It is obvious that $\stackrel{˜}{G}={D}_{\stackrel{˜}{p}}+{D}_{\stackrel{˜}{q}}{F}^{\prime }\left(x\right)$ and $\stackrel{˜}{H}={D}_{\stackrel{˜}{a}}+{D}_{\stackrel{˜}{b}}{F}^{\prime }\left(x\right)$ are closely related. $\stackrel{˜}{G}$ is nonsingular under proper conditions.

Theorem 7 If x is not a solution of $NCP\left(F\right)$, i.e., ${\theta }_{\lambda }\left(x\right)>0$, then every $\stackrel{˜}{G}\in \stackrel{˜}{\mathcal{G}}\left(x\right)$ is nonsingular.

Proof For every $\stackrel{˜}{G}\in \stackrel{˜}{\mathcal{G}}\left(x\right)$, if ${\theta }_{\lambda }\left(x\right)>0$, then it follows from the definition of $\stackrel{˜}{G}$ that ${D}_{\stackrel{˜}{p}}$ and ${D}_{\stackrel{˜}{q}}$ are negative definite matrices.

Since F is a ${P}_{0}$-function, its Jacobian ${F}^{\prime }\left(x\right)$ is a ${P}_{0}$-matrix. Hence, by Theorem 4, $\stackrel{˜}{G}$ is nonsingular. □

By the mapping $\stackrel{˜}{\mathcal{G}}$, we define $\stackrel{˜}{\mathrm{\Delta }}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n}\right)$ as

$\stackrel{˜}{\mathrm{\Delta }}\left(x\right)=\left\{d\in {\mathcal{R}}^{n}|\stackrel{˜}{G}d=-{\mathrm{\Phi }}_{\lambda }\left(x\right),\stackrel{˜}{G}\in \stackrel{˜}{\mathcal{G}}\left(x\right)\right\}.$

It is easy to see that $\stackrel{˜}{\mathrm{\Delta }}\left(x\right)$ is nonempty for every x such that ${\theta }_{\lambda }\left(x\right)>0$. Now we give the first algorithm.

Algorithm 1 Step 1. Initialization: choose $\lambda \in \left(0,4\right)$, ${x}^{0}\in {\mathcal{R}}^{n}$, $\rho \in \left(0,0.5\right)$, $\beta \in \left(0,1\right)$, and set $k:=0$.

Step 2. Termination criterion: if ${\theta }_{\lambda }\left({x}^{k}\right)=0$, stop. Otherwise, go to Step 3.

Step 3. Search direction calculation: find a vector ${d}^{k}\in \stackrel{˜}{\mathrm{\Delta }}\left({x}^{k}\right)$.

Step 4. Line search: let m be the smallest nonnegative integer such that

${\theta }_{\lambda }\left({x}^{k}+{\beta }^{m}{d}^{k}\right)-{\theta }_{\lambda }\left({x}^{k}\right)\le {\beta }^{m}\rho \mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{k}\right)}^{T}{d}^{k}.$

Step 5. Update: set ${x}^{k+1}:={x}^{k}+{t}_{k}{d}^{k}$, where ${t}_{k}={\beta }^{m}$, $k:=k+1$, and go to Step 2.
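The steps above can be sketched as follows (the paper's experiments use MATLAB; this Python sketch is ours). Since the perturbation sets ${\mathrm{\Lambda }}_{i}$ are not reproduced in this extract, the sketch uses one concrete choice consistent with the surrounding description: any coefficient lying above $-\epsilon $ is shifted by $-\sigma \left({\theta }_{\lambda }\left(x\right)\right)$. It also assumes all iterates are nondegenerate, i.e. $\left({x}_{i},{F}_{i}\left(x\right)\right)\ne \left(0,0\right)$; the toy instance F below is our own:

```python
import numpy as np

def theta(F, x, lam):
    # merit function theta_lambda and residual Phi_lambda(x)
    Fx = F(x)
    Phi = np.sqrt((x - Fx) ** 2 + lam * x * Fx) - x - Fx
    return 0.5 * float(Phi @ Phi), Phi

def algorithm1(F, J, x0, lam=1.0, eps=0.05, rho=1e-4, beta=0.5,
               sigma=lambda t: 0.1 * min(1.0, t), tol=1e-8, maxit=100):
    x = x0.astype(float)
    for k in range(maxit):
        th, Phi = theta(F, x, lam)
        if th <= tol:                                  # Step 2: termination
            return x, k, th
        Fx, Jx = F(x), J(x)
        root = np.sqrt((x - Fx) ** 2 + lam * x * Fx)   # nondegeneracy assumed
        a = (2.0 * (x - Fx) + lam * Fx) / (2.0 * root) - 1.0
        b = (-2.0 * (x - Fx) + lam * x) / (2.0 * root) - 1.0
        s = sigma(th)                                  # perturbation size
        p = np.where(a > -eps, a - s, a)               # keep D_p negative definite
        q = np.where(b > -eps, b - s, b)               # keep D_q negative definite
        G = np.diag(p) + np.diag(q) @ Jx               # nonsingular (Theorem 7)
        d = np.linalg.solve(G, -Phi)                   # Step 3: direction
        grad = (np.diag(a) + np.diag(b) @ Jx).T @ Phi  # gradient of theta_lambda
        t = 1.0                                        # Step 4: Armijo search
        while t > 1e-12 and theta(F, x + t * d, lam)[0] - th > rho * t * (grad @ d):
            t *= beta
        x = x + t * d                                  # Step 5: update
    return x, maxit, theta(F, x, lam)[0]

# toy instance (ours): F(x) = x - 1 componentwise, NCP solution x* = (1, ..., 1)
F = lambda x: x - 1.0
J = lambda x: np.eye(x.size)
x, iters, res = algorithm1(F, J, np.zeros(3), lam=1.0)
```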

It is obvious that if ${\theta }_{\lambda }\left({x}^{k}\right)=0$, then ${x}^{k}$ is a solution of $NCP\left(F\right)$. Next, we will prove the global convergence of Algorithm 1. First, we show that every $d\in \stackrel{˜}{\mathrm{\Delta }}\left(x\right)$ is a descent direction of ${\theta }_{\lambda }$ at x.

Lemma 8 (see [, Lemma 3.2])

If x is not a solution of $NCP\left(F\right)$, i.e., ${\theta }_{\lambda }\left(x\right)>0$, then every $d\in \stackrel{˜}{\mathrm{\Delta }}\left(x\right)$ satisfies the descent condition for ${\theta }_{\lambda }$, i.e., $\mathrm{\nabla }{\theta }_{\lambda }{\left(x\right)}^{T}d<0$.

Theorem 9 Every accumulation point of a sequence $\left\{{x}^{k}\right\}$ generated by Algorithm  1 is a solution of $NCP\left(F\right)$.

Proof Owing to Step 4, the sequence $\left\{{\theta }_{\lambda }\left({x}^{k}\right)\right\}$ is monotonically decreasing and nonnegative, so it converges to some ${\theta }_{\lambda }^{\ast }\ge 0$. Suppose, for contradiction, that ${\theta }_{\lambda }^{\ast }>0$. Let ${x}^{\ast }$ be an accumulation point of $\left\{{x}^{k}\right\}$ and ${\left\{{x}^{k}\right\}}_{k\in \mathcal{K}}$ a subsequence converging to ${x}^{\ast }$.

Since $\stackrel{˜}{\mathrm{\Delta }}$ is uniformly compact near ${x}^{\ast }$ and closed at ${x}^{\ast }$ (see [, Lemma 3.4]), we may assume, without loss of generality, that ${lim}_{k\to \mathrm{\infty },k\in \mathcal{K}}{d}^{k}={d}^{\ast }\in \stackrel{˜}{\mathrm{\Delta }}\left({x}^{\ast }\right)$. By Lemma 8, we obtain a contradiction if we can prove that $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }=0$. This is done by considering the following two cases:

• Suppose that $inf\left\{{t}_{k}\right\}\ge t>0$. Then we have

${\theta }_{\lambda }\left({x}^{k}+{t}_{k}{d}^{k}\right)-{\theta }_{\lambda }\left({x}^{k}\right)\le {t}_{k}\rho \mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{k}\right)}^{T}{d}^{k}\le 0.$

Since $\left\{{\theta }_{\lambda }\left({x}^{k}\right)\right\}$ converges, the left-hand side tends to 0; because ${t}_{k}\ge t>0$, it follows that $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }=0$.

• Suppose that $inf\left\{{t}_{k}\right\}=0$. In this case, we assume ${lim}_{k\to \mathrm{\infty },k\in \mathcal{K}}{t}_{k}=0$ without loss of generality. By the line search rule, we have

$\frac{{\theta }_{\lambda }\left({x}^{k}+\frac{{t}_{k}}{\beta }{d}^{k}\right)-{\theta }_{\lambda }\left({x}^{k}\right)}{\frac{{t}_{k}}{\beta }}>\rho \mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{k}\right)}^{T}{d}^{k},$

Taking the limit yields $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }\ge \rho \mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }$. Since $\rho \in \left(0,0.5\right)$, this gives $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }\ge 0$. On the other hand, $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{k}\right)}^{T}{d}^{k}<0$ for all k by Lemma 8, so $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }\le 0$. Hence, $\mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{\ast }\right)}^{T}{d}^{\ast }=0$.

We get the contradiction. The proof is complete. □

## 4 Modified algorithm and fast convergence

In the preceding section, we established the global convergence of Algorithm 1, which determines a search direction based on $\stackrel{˜}{\mathcal{H}}$, a set containing the generalized Jacobian ${\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left(x\right)$. However, it is difficult to show the superlinear convergence of Algorithm 1. In the following, we modify the search direction to accelerate the convergence of the algorithm. By the definition of $\stackrel{ˆ}{\mathcal{H}}$, an element $\stackrel{ˆ}{H}\in \stackrel{ˆ}{\mathcal{H}}$ is not necessarily nonsingular. Can we perturb $\stackrel{ˆ}{H}$ in the same way as $\stackrel{˜}{H}$? We give a positive answer to this question below.

Define $\stackrel{ˆ}{G}$ as

$\stackrel{ˆ}{G}={D}_{\stackrel{ˆ}{p}}+{D}_{\stackrel{ˆ}{q}}{F}^{\prime }\left(x\right),$

where $\stackrel{ˆ}{p}$ and $\stackrel{ˆ}{q}$ are vectors such that

$\left({\stackrel{ˆ}{p}}_{i},{\stackrel{ˆ}{q}}_{i}\right)=\left({\stackrel{ˆ}{a}}_{i}+{\overline{a}}_{i},{\stackrel{ˆ}{b}}_{i}+{\overline{b}}_{i}\right),\phantom{\rule{1em}{0ex}}i=1,2,\dots ,n,$
(6)

with $\left(\stackrel{ˆ}{a},\stackrel{ˆ}{b}\right)\in \stackrel{ˆ}{\mathrm{\Omega }}\left(x\right)$, and $\left({\overline{a}}_{i},{\overline{b}}_{i}\right)\in {\mathrm{\Lambda }}_{i}\left(x,{\stackrel{ˆ}{a}}_{i},{\stackrel{ˆ}{b}}_{i}\right)$, $i=1,2,\dots ,n$.

If ${\theta }_{\lambda }\left(x\right)>0$, the definition of ${\mathrm{\Lambda }}_{i}$ and (6) imply that both ${D}_{\stackrel{ˆ}{p}}$ and ${D}_{\stackrel{ˆ}{q}}$ are negative definite matrices.

The mapping $\stackrel{ˆ}{\mathcal{G}}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n×n}\right)$ is defined by

$\stackrel{ˆ}{\mathcal{G}}\left(x\right)=\left\{\stackrel{ˆ}{G}\in {\mathcal{R}}^{n×n}\mid \stackrel{ˆ}{G}={D}_{\stackrel{ˆ}{p}}+{D}_{\stackrel{ˆ}{q}}{F}^{\prime }\left(x\right),\ \left(\stackrel{ˆ}{p},\stackrel{ˆ}{q}\right)\ \text{given by (6)}\right\}.$

From Theorem 6, $\stackrel{ˆ}{\mathcal{H}}\left(x\right)\subseteq \stackrel{˜}{\mathcal{H}}\left(x\right)$, and hence $\stackrel{ˆ}{\mathcal{G}}\left(x\right)\subseteq \stackrel{˜}{\mathcal{G}}\left(x\right)$. Moreover, by Theorem 7, every $\stackrel{ˆ}{G}\in \stackrel{ˆ}{\mathcal{G}}\left(x\right)$ is nonsingular if ${\theta }_{\lambda }\left(x\right)>0$.

Define $\stackrel{ˆ}{\mathrm{\Delta }}:{\mathcal{R}}^{n}\to P\left({\mathcal{R}}^{n}\right)$ as

$\stackrel{ˆ}{\mathrm{\Delta }}\left(x\right)=\left\{d\in {\mathcal{R}}^{n}|\stackrel{ˆ}{G}d=-{\mathrm{\Phi }}_{\lambda }\left(x\right),\stackrel{ˆ}{G}\in \stackrel{ˆ}{\mathcal{G}}\left(x\right)\right\}.$

For any x, since $\stackrel{ˆ}{\mathcal{G}}\left(x\right)\subseteq \stackrel{˜}{\mathcal{G}}\left(x\right)$, we obtain $\stackrel{ˆ}{\mathrm{\Delta }}\left(x\right)\subseteq \stackrel{˜}{\mathrm{\Delta }}\left(x\right)$. Hence, $\stackrel{ˆ}{\mathrm{\Delta }}\left(x\right)$ is nonempty for every x such that ${\theta }_{\lambda }\left(x\right)>0$. Next, we will give the second algorithm. The search direction is chosen from $\stackrel{ˆ}{\mathrm{\Delta }}\left(x\right)$. The only difference from Algorithm 1 is the search direction.

Algorithm 2 Step 1. Initialization: choose $\lambda \in \left(0,4\right)$, ${x}^{0}\in {\mathcal{R}}^{n}$, $\rho \in \left(0,0.5\right)$, $\beta \in \left(0,1\right)$, and set $k:=0$.

Step 2. Termination criterion: if ${\theta }_{\lambda }\left({x}^{k}\right)=0$, stop. Otherwise, go to Step 3.

Step 3. Search direction calculation: find a vector ${d}^{k}\in \stackrel{ˆ}{\mathrm{\Delta }}\left({x}^{k}\right)$.

Step 4. Line search: let m be the smallest nonnegative integer such that

${\theta }_{\lambda }\left({x}^{k}+{\beta }^{m}{d}^{k}\right)-{\theta }_{\lambda }\left({x}^{k}\right)\le {\beta }^{m}\rho \mathrm{\nabla }{\theta }_{\lambda }{\left({x}^{k}\right)}^{T}{d}^{k}.$

Step 5. Update: set ${x}^{k+1}:={x}^{k}+{t}_{k}{d}^{k}$, where ${t}_{k}={\beta }^{m}$, $k:=k+1$, and go to Step 2.

Since $\stackrel{ˆ}{\mathrm{\Delta }}\left({x}^{k}\right)\subseteq \stackrel{˜}{\mathrm{\Delta }}\left({x}^{k}\right)$ at each ${x}^{k}$, as noted above, the global convergence of Algorithm 2 follows directly from Theorem 9. We state the following theorem without proof.

Theorem 10 Every accumulation point of a sequence $\left\{{x}^{k}\right\}$ generated by Algorithm  2 is a solution of $NCP\left(F\right)$.

In the following, we focus our attention on the superlinear convergence rate of Algorithm 2. To begin with, we assume that the sequence $\left\{{x}^{k}\right\}$ generated by Algorithm 2 has a unique limit point ${x}^{\ast }$.

Lemma 11 We have $\parallel {x}^{k}+{d}^{k}-{x}^{\ast }\parallel =o\left(\parallel {x}^{k}-{x}^{\ast }\parallel \right)$.

Proof For each k, we have

$\begin{array}{r}\parallel {x}^{k}+{d}^{k}-{x}^{\ast }\parallel \\ \phantom{\rule{1em}{0ex}}=\parallel {x}^{k}-{\stackrel{ˆ}{G}}_{k}^{-1}{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)-{x}^{\ast }\parallel \\ \phantom{\rule{1em}{0ex}}=\parallel {\stackrel{ˆ}{G}}_{k}^{-1}\left({\mathrm{\Phi }}_{\lambda }\left({x}^{\ast }\right)-{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)+{\stackrel{ˆ}{H}}_{k}\left({x}^{k}-{x}^{\ast }\right)+\left({\stackrel{ˆ}{G}}_{k}-{\stackrel{ˆ}{H}}_{k}\right)\left({x}^{k}-{x}^{\ast }\right)\right)\parallel \\ \phantom{\rule{1em}{0ex}}\le \parallel {\stackrel{ˆ}{G}}_{k}^{-1}\parallel \left(\parallel {\mathrm{\Phi }}_{\lambda }\left({x}^{\ast }\right)-{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)+{\stackrel{ˆ}{H}}_{k}\left({x}^{k}-{x}^{\ast }\right)\parallel +\parallel {\stackrel{ˆ}{G}}_{k}-{\stackrel{ˆ}{H}}_{k}\parallel \parallel {x}^{k}-{x}^{\ast }\parallel \right),\end{array}$

where ${\stackrel{ˆ}{H}}_{k}\in \stackrel{ˆ}{\mathcal{H}}\left({x}^{k}\right)$ is the matrix corresponding to ${\stackrel{ˆ}{G}}_{k}$. Since ${\mathrm{\Phi }}_{\lambda }$ is semismooth [, Lemma 2.2] and, by Theorem 6, $\stackrel{ˆ}{\mathcal{H}}\left({x}^{k}\right)\subseteq {\partial }_{B}{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)$ for each k, we have

$\parallel {\mathrm{\Phi }}_{\lambda }\left({x}^{\ast }\right)-{\mathrm{\Phi }}_{\lambda }\left({x}^{k}\right)+{\stackrel{ˆ}{H}}_{k}\left({x}^{k}-{x}^{\ast }\right)\parallel =o\left(\parallel {x}^{k}-{x}^{\ast }\parallel \right)$

(see the proof of [, Theorem 3.1]). Moreover, by the definition of ${\mathrm{\Lambda }}_{i}$ and (6), we have

$\parallel {\stackrel{ˆ}{G}}_{k}-{\stackrel{ˆ}{H}}_{k}\parallel =O\left(\sigma \left({\theta }_{\lambda }\left({x}^{k}\right)\right)\right).$

Consequently, it follows that

$\parallel {x}^{k}+{d}^{k}-{x}^{\ast }\parallel \le \parallel {\stackrel{ˆ}{G}}_{k}^{-1}\parallel \left(o\left(\parallel {x}^{k}-{x}^{\ast }\parallel \right)+O\left(\sigma \left({\theta }_{\lambda }\left({x}^{k}\right)\right)\right)\parallel {x}^{k}-{x}^{\ast }\parallel \right).$

Since $\sigma \left({\theta }_{\lambda }\left({x}^{k}\right)\right)\to 0$ and $\left\{\parallel {\stackrel{ˆ}{G}}_{k}^{-1}\parallel \right\}$ is bounded (see the proof of [, Lemma 4.2]), we obtain the desired result. □

Now, we prove the superlinear convergence of Algorithm 2.

Theorem 12 Algorithm  2 has a superlinear rate of convergence.

Proof We have ${x}^{k+1}={x}^{k}+{d}^{k}$ for all k sufficiently large (see the proof of [, Lemma 4.6]). It then follows from Lemma 11 that

$\underset{k\to \mathrm{\infty }}{lim}\frac{\parallel {x}^{k+1}-{x}^{\ast }\parallel }{\parallel {x}^{k}-{x}^{\ast }\parallel }=0.$

The proof is complete. □

## 5 Numerical results

In this section, we report some preliminary numerical experiments that test Algorithm 2 and compare its performance with that of the algorithms proposed by Chen and Pan and by Sun and Zeng.

First, we set $\beta =0.5$, $\rho ={10}^{-4}$, $\epsilon =0.05$ and $\sigma \left(t\right)=0.1min\left\{1,t\right\}$.

For $z\in Z\left(x\right)$, we define

The stopping criterion for Algorithm 2 is ${\theta }_{\lambda }\left({x}^{k}\right)\le {10}^{-8}$. The programs are coded in MATLAB and run on a personal computer with a 2.1 GHz processor.

The meanings of the columns in the tables are as follows: iter denotes the total number of iterations, and resi denotes the final value of ${\theta }_{\lambda }\left(x\right)$.

Problem 1 Let $F\left(x\right)=Ax+q$, where

$A=\left[\begin{array}{ccccc}3& -1& & & \\ -1& 3& -1& & \\ & -1& 3& \ddots & \\ & & \ddots & \ddots & -1\\ & & & -1& 3\end{array}\right],\phantom{\rule{2em}{0ex}}q={\left(-1,\dots ,-1\right)}^{T}.$

The corresponding complementarity problem has a unique solution. Table 1 lists the test results for Problem 1 with different n and λ and with the initial points $a={\left(-5,\dots ,-5\right)}^{T}$, $b={\left(0,\dots ,0\right)}^{T}$, $c={\left(1,\dots ,1\right)}^{T}$, $d={\left(8,\dots ,8\right)}^{T}$.
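To make the setup reproducible, the matrix A (read here as the tridiagonal matrix with 3 on the diagonal and −1 on both off-diagonals, which is how we interpret the display above) can be generated and checked as follows. Since A is symmetric and strictly diagonally dominant with a positive diagonal, it is positive definite, consistent with the uniqueness claim:

```python
import numpy as np

def problem1(n):
    # Problem 1 data: A = tridiag(-1, 3, -1), q = (-1, ..., -1)^T
    A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    q = -np.ones(n)
    return A, q

A, q = problem1(6)
print(np.linalg.eigvalsh(A).min())   # positive, so A is positive definite
```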

From Table 1, we see that the results for $\lambda \in \left(0,1\right)$ are better than those for the other cases. In particular, good numerical results are obtained when λ is close to 0. We then compare these results with the method of Chen and Pan, for which we set $p=2$, $\epsilon =1.0\mathrm{e}-08$, $\sigma =1.0\mathrm{e}-10$, $\beta =0.2$. Table 2 lists the corresponding test results.

Tables 1 and 2 indicate that Algorithm 2 performed much better on Problem 1 than the method of Chen and Pan.

Problem 2 Free boundary problems can also be solved by the proposed method. The following problem arises from the discretization of a free boundary problem. Let $\mathrm{\Omega }=\left(0,1\right)×\left(0,1\right)$, and let g be a function satisfying $g\left(x,0\right)=x\left(1-x\right)$ and $g\left(x,y\right)=0$ on $x=0,1$ or $y=1$.

Consider the following problem: find u such that

where $f\left(u,x,y\right)$ is a continuously differentiable ${P}_{0}$-function. Discretizing the problem by the five-point difference scheme with mesh step h, we obtain the following complementarity problem: find $x\in {\mathcal{R}}^{n}$ such that

$x\ge 0,\phantom{\rule{1em}{0ex}}Ax+\mathrm{\Psi }\left(x\right)\ge 0\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{x}^{T}\left(Ax+\mathrm{\Psi }\left(x\right)\right)=0.$

The initial point is set to ${x}_{0}={\left(0,\dots ,0\right)}^{T}$. Table 3 lists the test results for different functions f, values of λ, and mesh steps h.

From Table 3, we have the following observations.

• The test results improve as λ decreases; for $\lambda =2$ the results are clearly weaker. That is, Algorithm 2 with the NCP-function ${\varphi }_{\lambda }$ ($0<\lambda <2$) performs better than the corresponding method in which the Fischer-Burmeister function was used.

• The test results are good whether the function $f\left(u,x,y\right)$ is linear or nonlinear, and they are especially good when λ is close to 0.

We also compare the test results with the method of Sun and Zeng, for which we set $\beta =0.5$, $c={0.5}^{4}$. Table 4 lists the corresponding test results for different functions f and mesh steps h.

Tables 3 and 4 indicate that Algorithm 2 performed as well on Problem 2 as the method of Sun and Zeng.

Problem 3 We ran Algorithm 2 on some test problems from MCPLIB with all available starting points. The results are reported in Table 5, with time measured in seconds.

The above examples indicate that the results are better when λ is close to 0. A reasonable interpretation is that the values of ${g}_{i}\left(x,z\right)$ and ${h}_{i}\left(x,z\right)$ become smaller as λ increases, which causes difficulty for Algorithm 2. This also implies that the performance of Algorithm 2 will become worse when p increases. When $\lambda \to 0$, the reformulation reduces to $min\left\{x,F\left(x\right)\right\}=0$; however, this is a nonsmooth equation, so the present method cannot be applied at $\lambda =0$ itself.
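The limiting behaviour invoked above can be made explicit: letting $\lambda \to 0^{+}$ in (1) gives

```latex
\lim_{\lambda\to 0^{+}}\varphi_{\lambda}(a,b)
  =\sqrt{(a-b)^{2}}-a-b
  =|a-b|-(a+b)
  =-2\min\{a,b\},
```

so, up to the factor $-2$, the reformulation tends to the nonsmooth equation $min\left\{x,F\left(x\right)\right\}=0$.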

## 6 Concluding remarks

In this paper, we have studied a class of one-parametric NCP-functions ${\varphi }_{\lambda }\left(\cdot ,\cdot \right)$, which includes the well-known Fischer-Burmeister function as a special case, and have proposed modified Newton-type algorithms for solving ${P}_{0}$ complementarity problems.

Numerical results for the test problems show that this method is promising for $\lambda \in \left(0,4\right)$. Moreover, our numerical results indicate that the performance of the modified Newton-type method improves as λ decreases, which is a new and noteworthy numerical observation. We believe that Algorithm 2 can effectively solve further practical problems whenever they can be reformulated as an $NCP\left(F\right)$. We leave this as a future research topic.

## References

1. Chen J-S: On some NCP-functions based on the generalized Fischer-Burmeister function. Asia-Pac. J. Oper. Res. 2007, 24: 401–420. 10.1142/S0217595907001292

2. Chen J-S, Pan S: A family of NCP functions and a descent method for the nonlinear complementarity problem. Comput. Optim. Appl. 2008, 40: 389–404. 10.1007/s10589-007-9086-0

3. Fischer A: A special Newton-type optimization method. Optimization 1992, 24: 269–284. 10.1080/02331939208843795

4. Hu SL, Huang ZH, Chen J-S: Properties of a family of generalized NCP-functions and a derivative free algorithm for complementarity problems. J. Comput. Appl. Math. 2009, 230(1):69–82. 10.1016/j.cam.2008.10.056

5. Hu SL, Huang ZH, Lu N: Smoothness of a class of merit functions for the second-order cone complementarity problem. Pac. J. Optim. 2010, 6(3):551–571.

6. Kanzow C, Kleinmichel H: A new class of semismooth Newton-type methods for nonlinear complementarity problems. Comput. Optim. Appl. 1998, 11: 227–251. 10.1023/A:1026424918464

7. Lu LY, Huang ZH, Hu SL: Properties of a family of merit functions and a derivative-free method for the NCP. Appl. Math. J. Chin. Univ. Ser. A 2010, 25(4):379–390. 10.1007/s11766-010-2179-z

8. Yamashita N, Fukushima M: Modified Newton methods for solving a semismooth reformulation of monotone complementarity problems. Math. Program. 1997, 76: 469–491.

9. Qi L: Convergence analysis of some algorithms for solving nonsmooth equations. Math. Oper. Res. 1993, 18: 227–244. 10.1287/moor.18.1.227

10. Sun Z, Zeng JP: A monotone semismooth Newton type method for a class of complementarity problem. J. Comput. Appl. Math. 2011, 235: 1261–1274. 10.1016/j.cam.2010.08.012

11. Billups SC, Dirkse SP, Soares MC: A comparison of algorithms for large scale mixed complementarity problems. Comput. Optim. Appl. 1997, 7: 3–25. 10.1023/A:1008632215341

## Acknowledgements

This work was supported by the NSFC (50975200).

## Author information


### Corresponding author

Correspondence to Zijun Deng.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

WX participated in the design of the algorithm. ZD performed the numerical experiment and statistical analysis. All authors read and approved the final manuscript.

## Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Xie, W., Deng, Z. Modified Newton-type methods for the NCP by using a class of one-parametric NCP-functions. J Inequal Appl 2012, 286 (2012). https://doi.org/10.1186/1029-242X-2012-286


### Keywords

• nonlinear complementarity problem
• NCP-function
• generalized Newton method
• global convergence
• superlinear convergence 