
# Modified Newton-type methods for the NCP by using a class of one-parametric NCP-functions

*Journal of Inequalities and Applications*
**volume 2012**, Article number: 286 (2012)

## Abstract

In this paper, we propose a new Newton-type method for solving the nonlinear complementarity problem (NCP) based on a class of one-parametric NCP-functions. In each iteration, an approximate Newton direction is obtained by solving a modified Newton equation. The method is shown to be globally convergent without any additional assumption. To obtain fast local convergence for this class of methods, we then propose a modified version of the method and show that it is globally and locally superlinearly convergent. Preliminary numerical results show the effectiveness of the modified method.

## 1 Introduction

Consider the nonlinear complementarity problem NCP(F): find x\in {\mathcal{R}}^{n} such that

x\ge 0,\phantom{\rule{1em}{0ex}}F(x)\ge 0,\phantom{\rule{1em}{0ex}}{x}^{T}F(x)=0,

where F:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n} is a continuously differentiable function. We assume that *F* is a {P}_{0}-function throughout this paper. It is well known that NCP(F) can be reformulated as a system of nonsmooth equations, where the so-called NCP-function plays an important role in this class of methods.

**Definition 1** A function \varphi :{\mathcal{R}}^{2}\to \mathcal{R} is called an NCP-function if it satisfies

\varphi (a,b)=0\phantom{\rule{1em}{0ex}}\iff \phantom{\rule{1em}{0ex}}a\ge 0,b\ge 0,ab=0.

Over the past two decades, a variety of NCP-functions have been studied (see, for example, [1–7]). Among them, a popular choice is the well-known Fischer-Burmeister NCP-function [3], defined as

{\varphi}_{\mathrm{FB}}(a,b)=\sqrt{{a}^{2}+{b}^{2}}-a-b.

In this paper, we use a family of NCP-functions based on the FB function, introduced by Kanzow and Kleinmichel [6]:

{\varphi}_{\lambda}(a,b)=\sqrt{{(a-b)}^{2}+\lambda ab}-a-b, (1)

where *λ* is a fixed parameter such that \lambda \in (0,4). In the case of \lambda =2, the NCP-function {\varphi}_{\lambda} obviously reduces to the Fischer-Burmeister function.
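As a quick numerical illustration (a sketch in Python rather than the paper's MATLAB; the sample points are arbitrary), the family {\varphi}_{\lambda} can be evaluated directly from (1); for \lambda =2 it matches the Fischer-Burmeister function, and it vanishes exactly at complementary pairs:

```python
import math

def phi_lam(a, b, lam):
    """Kanzow-Kleinmichel NCP-function: sqrt((a-b)^2 + lam*a*b) - a - b, lam in (0,4)."""
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b

def phi_fb(a, b):
    """Fischer-Burmeister NCP-function: sqrt(a^2 + b^2) - a - b."""
    return math.sqrt(a * a + b * b) - a - b

# lam = 2 recovers the Fischer-Burmeister function
for (a, b) in [(1.0, 2.0), (-0.5, 3.0), (0.0, 0.7)]:
    assert abs(phi_lam(a, b, 2.0) - phi_fb(a, b)) < 1e-12

# phi_lam(a, b) = 0 iff a >= 0, b >= 0, ab = 0
print(phi_lam(0.0, 1.5, 0.5))   # complementary pair -> 0.0
print(phi_lam(-1.0, 2.0, 0.5))  # not complementary  -> nonzero
```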

By using {\varphi}_{\lambda} defined by (1), the NCP is equivalent to the system of nonsmooth equations

{\mathrm{\Phi}}_{\lambda}(x)={({\varphi}_{\lambda}({x}_{1},{F}_{1}(x)),\dots ,{\varphi}_{\lambda}({x}_{n},{F}_{n}(x)))}^{T}=0.

Let {\theta}_{\lambda}(x)=\frac{1}{2}{\parallel {\mathrm{\Phi}}_{\lambda}(x)\parallel}^{2}. Then solving NCP(F) is equivalent to solving the unconstrained minimization {min}_{x\in {\mathcal{R}}^{n}}{\theta}_{\lambda}(x) with the optimal value 0.
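To make the reformulation concrete, here is a small sketch (in Python; the map F below is an arbitrary illustrative choice, not one of the paper's test problems) of the operator {\mathrm{\Phi}}_{\lambda} and the merit function {\theta}_{\lambda}:

```python
import math

def phi_lam(a, b, lam=1.0):
    # Kanzow-Kleinmichel NCP-function, lam in (0,4)
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b

def F(x):
    # illustrative map: F(x) = (x1 - 1, x2 + 0.5); NCP solution x* = (1, 0)
    return [x[0] - 1.0, x[1] + 0.5]

def Phi(x, lam=1.0):
    # Phi_lam(x)_i = phi_lam(x_i, F_i(x)); NCP(F) <=> Phi_lam(x) = 0
    Fx = F(x)
    return [phi_lam(x[i], Fx[i], lam) for i in range(len(x))]

def theta(x, lam=1.0):
    # merit function: theta_lam(x) = 0.5 * ||Phi_lam(x)||^2
    return 0.5 * sum(v * v for v in Phi(x, lam))

x_star = [1.0, 0.0]           # x1 = 1 with F1 = 0; x2 = 0 with F2 = 0.5 > 0
print(theta(x_star))          # 0.0 at the solution
print(theta([0.0, 0.0]) > 0)  # True away from the solution
```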

Kanzow and Kleinmichel [6] studied the properties of {\mathrm{\Phi}}_{\lambda} and {\theta}_{\lambda} and proposed the corresponding semismooth Newton method. Their method first attempts the Newton direction; if the Newton equation is unsolvable, or the Newton direction is not a direction of sufficient decrease for {\theta}_{\lambda}, it switches to the steepest descent direction. In this paper, we propose a Newton-type method for the {P}_{0}-NCP(F) in which, at each iteration, we construct a nonsingular approximation of \partial {\mathrm{\Phi}}_{\lambda}(x) (the Clarke subdifferential of {\mathrm{\Phi}}_{\lambda} at *x*, defined in the next section); hence the direction-finding problem reduces to solving a system of perturbed Newton equations. We show that the proposed method is globally convergent without any additional assumption. The method is similar to the one discussed by Yamashita and Fukushima [8], where the NCP-function {\varphi}_{\mathrm{FB}} was used; since {\varphi}_{\mathrm{FB}} is a special case of {\varphi}_{\lambda}, the proposed method applies more widely. However, it is difficult to establish fast local convergence for this method. In order to investigate the fast local convergence of this class of methods, we revise the proposed method and show that the modified method is globally and locally superlinearly convergent. Preliminary numerical results show the effectiveness of the modified method.

## 2 Preliminaries

In this section, we recall some basic concepts and known results.

**Definition 2** F:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n} is called a {P}_{0}-function if, for all x\ne y in {\mathcal{R}}^{n},

{max}_{i:{x}_{i}\ne {y}_{i}}({x}_{i}-{y}_{i})({F}_{i}(x)-{F}_{i}(y))\ge 0.

**Definition 3** A matrix M\in {\mathcal{R}}^{n\times n} is a {P}_{0}-matrix if each of its principal minors is nonnegative.

It is known that the Jacobian of every continuously differentiable {P}_{0}-function is a {P}_{0}-matrix. The following theorem will play an important role in our analysis. Notice that, for a vector *a*, {D}_{a} denotes the diagonal matrix with the *i* th diagonal element being {a}_{i}.

**Theorem 4** (see [8])

*Let* *M* *be a* {P}_{0}-*matrix*, {D}_{a} *and* {D}_{b} *be negative definite diagonal matrices*. *Then* {D}_{a}+{D}_{b}M *is nonsingular*.
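A small numerical sanity check of Theorem 4 (a sketch; the 2×2 matrix M below is an arbitrary {P}_{0}-matrix chosen for illustration, and is even singular):

```python
def det2(A):
    # determinant of a 2x2 matrix
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# An illustrative P0-matrix: all principal minors are nonnegative
M = [[0.0, 1.0],
     [0.0, 1.0]]   # principal minors: 0, 1, det(M) = 0
assert M[0][0] >= 0 and M[1][1] >= 0 and det2(M) >= 0

# Negative definite diagonal matrices D_a, D_b (stored as their diagonals)
a = [-1.0, -2.0]
b = [-3.0, -1.0]

# Form G = D_a + D_b * M
G = [[a[0] + b[0] * M[0][0], b[0] * M[0][1]],
     [b[1] * M[1][0], a[1] + b[1] * M[1][1]]]
print(det2(G))  # nonzero, even though M itself is singular
```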

Let \mathrm{\Phi}:{\mathcal{R}}^{n}\to {\mathcal{R}}^{n} be locally Lipschitz continuous; by Rademacher’s theorem, Φ is differentiable almost everywhere.

**Definition 5** Let {D}_{\mathrm{\Phi}} denote the set \{x\in {\mathcal{R}}^{n}\mid \mathrm{\Phi}\text{ is differentiable at }x\}. Then the *B*-subdifferential of Φ at *x* is defined as

{\partial}_{B}\mathrm{\Phi}(x)=\{V\in {\mathcal{R}}^{n\times n}\mid V={lim}_{{x}^{k}\to x,{x}^{k}\in {D}_{\mathrm{\Phi}}}\mathrm{\nabla}\mathrm{\Phi}{({x}^{k})}^{T}\}.

The Clarke subdifferential of Φ at *x* is defined as \partial \mathrm{\Phi}(x)=co\phantom{\rule{0.2em}{0ex}}{\partial}_{B}\mathrm{\Phi}(x),

where *co* denotes the convex hull of a set.

By the definition of {\mathrm{\Phi}}_{\lambda}, we know that {\mathrm{\Phi}}_{\lambda} is not differentiable at *x* if {x}_{i}=0={F}_{i}(x) for some *i*. However, since {\mathrm{\Phi}}_{\lambda} is locally Lipschitz continuous [[6], Lemma 2.1], {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x) is nonempty at every x\in {\mathcal{R}}^{n}. But how can the set {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x) be specified exactly at points *x* where \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}(x) does not exist?
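The direction dependence at such kinks can be seen numerically: approaching the point (0,0), where {\varphi}_{\lambda} is not differentiable, along different directions gives different gradient limits (a sketch; the analytic gradient formula holds at differentiable points, and \lambda =1 is an arbitrary choice):

```python
import math

def grad_phi(a, b, lam=1.0):
    # gradient of phi_lam(a, b) = sqrt((a-b)^2 + lam*a*b) - a - b
    # at a differentiable point, i.e. (a, b) != (0, 0)
    r = math.sqrt((a - b) ** 2 + lam * a * b)
    da = (2 * (a - b) + lam * b) / (2 * r) - 1
    db = (-2 * (a - b) + lam * a) / (2 * r) - 1
    return (da, db)

# Approach the kink (0, 0) along two different directions z
t = 1e-8
g1 = grad_phi(t * 1.0, t * 0.0)  # along z = (1, 0)
g2 = grad_phi(t * 0.0, t * 1.0)  # along z = (0, 1)
print(g1)  # (0.0, lam/2 - 2) = (0.0, -1.5) for lam = 1
print(g2)  # (lam/2 - 2, 0.0) = (-1.5, 0.0)
```

The two limits are distinct, which is exactly why the B-subdifferential at such a point contains more than one matrix row and why the direction vector *z* appears in the construction below.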

To solve this problem, we construct two mappings \tilde{\mathcal{H}} and \hat{\mathcal{H}} which approximate {\partial}_{B}{\mathrm{\Phi}}_{\lambda}. For a set *X*, we denote the power set of *X* by P(X).

Define the mapping \tilde{\mathcal{H}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n\times n}) as

where \tilde{\mathrm{\Omega}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{2n}) is given by

with

Here, {C}_{\lambda} denotes the constant 2-\frac{\lambda (4-\lambda )}{8}, and

In the following, we define \hat{\mathcal{H}} similarly to \tilde{\mathcal{H}}; for each *x*, \hat{\mathcal{H}}(x) is a subset of \tilde{\mathcal{H}}(x).

The mapping \hat{\mathcal{H}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n\times n}) is defined by

where \hat{\mathrm{\Omega}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{2n}) is defined by

Here Z(x)=\{z\in {\mathcal{R}}^{n}\mid {z}_{i}\ne 0\text{ if }i\in \beta \}, and *β* denotes the set \{i\mid {x}_{i}=0={F}_{i}(x)\}. The components of a vector g(x,z) are given by

and the components of a vector h(x,z) are given by

**Remark** From (2), we find that, for every x\in {\mathcal{R}}^{n}, each (\tilde{a},\tilde{b})\in \tilde{\mathrm{\Omega}}(x) satisfies -\sqrt{{C}_{\lambda}}-1\le {\tilde{a}}_{i},{\tilde{b}}_{i}\le 0 (see [[6], Proposition 2.6]), and {\tilde{a}}_{i}, {\tilde{b}}_{i} do not vanish simultaneously. The same holds for the pairs defining \hat{\mathcal{H}}.

The mappings \tilde{\mathcal{H}} and \hat{\mathcal{H}} have the following property, which will play an important role in our analysis.

**Theorem 6** *For an arbitrary* x\in {\mathcal{R}}^{n}, *we have* \hat{\mathcal{H}}(x)\subseteq {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x)\subseteq \tilde{\mathcal{H}}(x).

*Proof* {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x)\subseteq \tilde{\mathcal{H}}(x) was shown in [[6], Proposition 2.5]. Hence, we prove \hat{\mathcal{H}}(x)\subseteq {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x) in the following.

For an arbitrary \hat{H}\in \hat{\mathcal{H}}(x), we shall construct a sequence of points \{{y}^{k}\} such that {\mathrm{\Phi}}_{\lambda} is differentiable at every {y}^{k} and \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}{({y}^{k})}^{T} tends to \hat{H}; the claim then follows from the definition of the *B*-subdifferential.

Let {y}^{k}=x+{\epsilon}^{k}z, where z\in Z(x) and \{{\epsilon}^{k}\} is a sequence of positive numbers converging to 0. If i\notin \beta, then either {x}_{i}\ne 0 or {F}_{i}(x)\ne 0; and {z}_{i}\ne 0 for all i\in \beta.

We can see, by continuity, that if {\epsilon}^{k} is small enough, then for each *i*, either {y}_{i}^{k}\ne 0 or {F}_{i}({y}^{k})\ne 0, so {\mathrm{\Phi}}_{\lambda} is differentiable at {y}^{k}. If i\notin \beta, by continuity, the *i*th row of \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}{({y}^{k})}^{T} tends to the *i*th row of \hat{H}. So, we only need to consider the case i\in \beta.

From [[6], Proposition 2.5], we know that the *i* th row of \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}{({y}^{k})}^{T} is

where

By Taylor expansion, we have, for each i\in \beta,

Substituting (4) into (3) and passing to the limit, we conclude, by the continuity of ∇*F*, that the rows of \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}{({y}^{k})}^{T} tend to the corresponding rows of \hat{H} for i\in \beta. Hence, \mathrm{\nabla}{\mathrm{\Phi}}_{\lambda}{({y}^{k})}^{T} tends to \hat{H}. □

In this paper, we present two algorithms. The first, presented in Section 3, uses matrices obtained by perturbing \tilde{H}\in \tilde{\mathcal{H}}; we establish its global convergence. In Section 4, we present a second algorithm based on \hat{H}\in \hat{\mathcal{H}}, a restricted version of the first, which is shown to be locally superlinearly convergent.

## 3 Algorithm and global convergence

In a Newton-type method, the direction-finding problem is solved via \tilde{H}d=-{\mathrm{\Phi}}_{\lambda}({x}^{k}), where \tilde{H}\in \tilde{\mathcal{H}}({x}^{k}). However, \tilde{H} is not necessarily nonsingular. In this section, we perturb \tilde{H} to a nonsingular matrix \tilde{G}; a search direction can then be obtained by solving \tilde{G}d=-{\mathrm{\Phi}}_{\lambda}({x}^{k}). Now, let us construct \tilde{G} as follows.

First, the mappings {\mathrm{\Lambda}}_{i}:{\mathcal{R}}^{n+2}\to P({\mathcal{R}}^{2}), i=1,2,\dots ,n, are defined by

where \epsilon \in (0,1-\sqrt{{C}_{\lambda}/2}), and \sigma :{\mathcal{R}}^{+}\to {\mathcal{R}}^{+} is a nondecreasing continuous function such that \sigma (0)=0 and \sigma (t)>0 for all t>0.

Because \epsilon \in (0,1-\sqrt{{C}_{\lambda}/2}), it is obvious that for (a,b)\in \tilde{\mathrm{\Omega}}(x), the case {a}_{i}>-\epsilon and {b}_{i}>-\epsilon cannot occur.

In the following, we construct \tilde{G} as

where \tilde{p} and \tilde{q} are vectors such that

with (\tilde{a},\tilde{b})\in \tilde{\mathrm{\Omega}}(x), and ({\overline{a}}_{i},{\overline{b}}_{i})\in {\mathrm{\Lambda}}_{i}(x,{\tilde{a}}_{i},{\tilde{b}}_{i}), i=1,2,\dots ,n.

If {\theta}_{\lambda}(x)>0, the definition of {\mathrm{\Lambda}}_{i} and (5) imply that both {D}_{\tilde{p}} and {D}_{\tilde{q}} are negative definite matrices. Furthermore, we define \tilde{\mathcal{G}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n\times n}) as follows:

It is obvious that \tilde{G}={D}_{\tilde{p}}+{D}_{\tilde{q}}{F}^{\prime}(x) and \tilde{H}={D}_{\tilde{a}}+{D}_{\tilde{b}}{F}^{\prime}(x) are closely related. \tilde{G} is nonsingular under proper conditions.

**Theorem 7** *If* *x* *is not a solution of* NCP(F), *i*.*e*., {\theta}_{\lambda}(x)>0, *then every* \tilde{G}\in \tilde{\mathcal{G}}(x) *is nonsingular*.

*Proof* For every \tilde{G}\in \tilde{\mathcal{G}}(x), if {\theta}_{\lambda}(x)>0, then it follows from the definition of \tilde{G} that {D}_{\tilde{p}} and {D}_{\tilde{q}} are negative definite matrices.

Since *F* is a {P}_{0}-function, the Jacobian of *F* is a {P}_{0}-matrix. So, {F}^{\prime}(x) is a {P}_{0}-matrix. Hence, by Theorem 4, \tilde{G} is nonsingular. □

By the mapping \tilde{\mathcal{G}}, we define \tilde{\mathrm{\Delta}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n}) as

It is easy to see that \tilde{\mathrm{\Delta}}(x) is nonempty for every *x* such that {\theta}_{\lambda}(x)>0. Now we give the first algorithm.

**Algorithm 1** Step 1. Initialization: choose \lambda \in (0,4), {x}^{0}\in {\mathcal{R}}^{n}, \rho \in (0,0.5), \beta \in (0,1), and set k:=0.

Step 2. Termination criterion: if {\theta}_{\lambda}({x}^{k})=0, stop. Otherwise, go to Step 3.

Step 3. Search direction calculation: find a vector {d}^{k}\in \tilde{\mathrm{\Delta}}({x}^{k}).

Step 4. Line search: let *m* be the smallest nonnegative integer such that

{\theta}_{\lambda}({x}^{k}+{\beta}^{m}{d}^{k})\le {\theta}_{\lambda}({x}^{k})+\rho {\beta}^{m}\mathrm{\nabla}{\theta}_{\lambda}{({x}^{k})}^{T}{d}^{k}.

Step 5. Update: set {x}^{k+1}:={x}^{k}+{t}_{k}{d}^{k}, where {t}_{k}={\beta}^{m}, k:=k+1, and go to Step 2.
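The overall scheme of Algorithm 1 can be sketched as follows (in Python rather than the paper's MATLAB). The test map F, the simple diagonal regularization {G}_{k}={H}_{k}-\sigma ({\theta}_{\lambda}({x}^{k}))I, and the Armijo rule used here are illustrative assumptions, not the paper's exact construction; the iterates below stay at points where {\mathrm{\Phi}}_{\lambda} is differentiable, so the smooth Jacobian can be used:

```python
import math

lam, rho, beta = 1.0, 1e-4, 0.5

def F(x):
    # illustrative map with NCP solution x* = (1, 0), F(x*) = (0, 0.5)
    return [x[0] - 1.0, x[1] + 0.5]

def JF(x):
    return [[1.0, 0.0], [0.0, 1.0]]  # Jacobian of F (identity here)

def phi(a, b):
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b

def Phi(x):
    Fx = F(x)
    return [phi(x[i], Fx[i]) for i in range(2)]

def theta(x):
    return 0.5 * sum(v * v for v in Phi(x))

def H(x):
    # Jacobian of Phi_lam at a differentiable point: D_a + D_b * F'(x)
    Fx, J = F(x), JF(x)
    rows = []
    for i in range(2):
        a, b = x[i], Fx[i]
        r = math.sqrt((a - b) ** 2 + lam * a * b)
        da = (2 * (a - b) + lam * b) / (2 * r) - 1
        db = (-2 * (a - b) + lam * a) / (2 * r) - 1
        rows.append([da * (1.0 if j == i else 0.0) + db * J[i][j]
                     for j in range(2)])
    return rows

def solve2(A, rhs):
    # direct solve of a 2x2 linear system
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(rhs[0] * A[1][1] - rhs[1] * A[0][1]) / det,
            (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det]

x = [0.0, 0.0]
for _ in range(50):
    if theta(x) <= 1e-14:
        break                                   # Step 2: termination
    Hk, Px = H(x), Phi(x)
    sig = 0.1 * min(1.0, theta(x))              # sigma(t) = 0.1 min{1, t}
    Gk = [[Hk[i][j] - (sig if i == j else 0.0)  # simple diagonal perturbation
           for j in range(2)] for i in range(2)]
    d = solve2(Gk, [-v for v in Px])            # Step 3: G d = -Phi_lam(x^k)
    grad = [sum(Hk[i][j] * Px[i] for i in range(2)) for j in range(2)]
    t = 1.0                                     # Step 4: Armijo line search
    while theta([x[0] + t * d[0], x[1] + t * d[1]]) > \
            theta(x) + rho * t * (grad[0] * d[0] + grad[1] * d[1]):
        t *= beta
    x = [x[0] + t * d[0], x[1] + t * d[1]]      # Step 5: update

print(x)  # approaches (1.0, 0.0)
```

Here the gradient of the merit function is computed as \mathrm{\nabla}{\theta}_{\lambda}(x)={H}^{T}{\mathrm{\Phi}}_{\lambda}(x), which is valid since {\theta}_{\lambda} is continuously differentiable.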

It is obvious that if {\theta}_{\lambda}({x}^{k})=0, then {x}^{k} is a solution of NCP(F). Next, we will prove the global convergence of Algorithm 1. First, we show that every d\in \tilde{\mathrm{\Delta}}(x) is a descent direction of {\theta}_{\lambda} at *x*.

**Lemma 8** (see [[8], Lemma 3.2])

*If* *x* *is not a solution of* NCP(F), *i*.*e*., {\theta}_{\lambda}(x)>0, *then every* d\in \tilde{\mathrm{\Delta}}(x) *satisfies the descent condition for* {\theta}_{\lambda}, *i*.*e*., \mathrm{\nabla}{\theta}_{\lambda}{(x)}^{T}d<0.

**Theorem 9** *Every accumulation point of a sequence* \{{x}^{k}\} *generated by Algorithm* 1 *is a solution of* NCP(F).

*Proof* Owing to Step 4, \{{\theta}_{\lambda}({x}^{k})\} is decreasing monotonically and nonnegative. It must converge to some {\theta}_{\lambda}^{\ast}\ge 0. We assume {\theta}_{\lambda}^{\ast}>0. Let {x}^{\ast} be an accumulation point of \{{x}^{k}\} and {\{{x}^{k}\}}_{k\in \mathcal{K}} be a subsequence converging to {x}^{\ast}.

Since \tilde{\mathrm{\Delta}} is uniformly compact near {x}^{\ast} and closed at {x}^{\ast} (see [[8], Lemma 3.4]), we may assume, without loss of generality, that {lim}_{k\to \mathrm{\infty},k\in \mathcal{K}}{d}^{k}={d}^{\ast}\in \tilde{\mathrm{\Delta}}({x}^{\ast}). From Lemma 8, we obtain a contradiction if we can prove \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}=0. This can be shown by considering the following two cases:

• Suppose that inf\{{t}_{k}\}\ge t>0. Then, by the line search, {\theta}_{\lambda}({x}^{k+1})\le {\theta}_{\lambda}({x}^{k})+\rho {t}_{k}\mathrm{\nabla}{\theta}_{\lambda}{({x}^{k})}^{T}{d}^{k}. Since \{{\theta}_{\lambda}({x}^{k})\} converges and {t}_{k}\ge t>0, it follows that \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}=0.

• Suppose that inf\{{t}_{k}\}=0. In this case, we assume {lim}_{k\to \mathrm{\infty},k\in \mathcal{K}}{t}_{k}=0 without loss of generality. Then, for all large k\in \mathcal{K}, the step size {\beta}^{-1}{t}_{k} fails the line search test, i.e.,

{\theta}_{\lambda}({x}^{k}+{\beta}^{-1}{t}_{k}{d}^{k})>{\theta}_{\lambda}({x}^{k})+\rho {\beta}^{-1}{t}_{k}\mathrm{\nabla}{\theta}_{\lambda}{({x}^{k})}^{T}{d}^{k};

dividing by {\beta}^{-1}{t}_{k} and taking the limit yields \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}\ge \rho \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}. Since \rho \in (0,0.5), we have \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}\ge 0. Hence, \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}=0.

This contradicts Lemma 8, since {\theta}_{\lambda}({x}^{\ast})={\theta}_{\lambda}^{\ast}>0 would imply \mathrm{\nabla}{\theta}_{\lambda}{({x}^{\ast})}^{T}{d}^{\ast}<0. Hence {\theta}_{\lambda}^{\ast}=0, and every accumulation point of \{{x}^{k}\} is a solution of NCP(F). The proof is complete. □

## 4 Modified algorithm and fast convergence

In the above section, we established the global convergence of Algorithm 1, which determines a search direction based on \tilde{\mathcal{H}}, a set containing the generalized Jacobian {\partial}_{B}{\mathrm{\Phi}}_{\lambda}(x). However, it is difficult to show the superlinear convergence of Algorithm 1. In the following, we modify the search direction to accelerate the convergence of the algorithm. By the definition of \hat{\mathcal{H}}, we know that \hat{H}\in \hat{\mathcal{H}} is not necessarily nonsingular. Can we perturb \hat{H} in the same way as \tilde{H}? Next, we give a positive answer to this question.

Define \hat{G} as

where \hat{p} and \hat{q} are vectors such that

with (\hat{a},\hat{b})\in \hat{\mathrm{\Omega}}(x), and ({\overline{a}}_{i},{\overline{b}}_{i})\in {\mathrm{\Lambda}}_{i}(x,{\hat{a}}_{i},{\hat{b}}_{i}), i=1,2,\dots ,n.

If {\theta}_{\lambda}(x)>0, the definition of {\mathrm{\Lambda}}_{i} and (6) imply that both {D}_{\hat{p}} and {D}_{\hat{q}} are negative definite matrices.

The mapping \hat{\mathcal{G}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n\times n}) is defined by

From Theorem 6, \hat{\mathcal{H}}(x)\subseteq \tilde{\mathcal{H}}(x), so it is obvious that \hat{\mathcal{G}}(x)\subseteq \tilde{\mathcal{G}}(x). And from Theorem 7, every \hat{G}\in \hat{\mathcal{G}}(x) is nonsingular if {\theta}_{\lambda}(x)>0.

Define \hat{\mathrm{\Delta}}:{\mathcal{R}}^{n}\to P({\mathcal{R}}^{n}) as

For any *x*, since \hat{\mathcal{G}}(x)\subseteq \tilde{\mathcal{G}}(x), we obtain \hat{\mathrm{\Delta}}(x)\subseteq \tilde{\mathrm{\Delta}}(x). Hence, \hat{\mathrm{\Delta}}(x) is nonempty for every *x* such that {\theta}_{\lambda}(x)>0. Next, we give the second algorithm, whose search direction is chosen from \hat{\mathrm{\Delta}}({x}^{k}); this is the only difference from Algorithm 1.

**Algorithm 2** Step 1. Initialization: choose \lambda \in (0,4), {x}^{0}\in {\mathcal{R}}^{n}, \rho \in (0,0.5), \beta \in (0,1), and set k:=0.

Step 2. Termination criterion: if {\theta}_{\lambda}({x}^{k})=0, stop. Otherwise, go to Step 3.

Step 3. Search direction calculation: find a vector {d}^{k}\in \hat{\mathrm{\Delta}}({x}^{k}).

Step 4. Line search: let *m* be the smallest nonnegative integer such that

{\theta}_{\lambda}({x}^{k}+{\beta}^{m}{d}^{k})\le {\theta}_{\lambda}({x}^{k})+\rho {\beta}^{m}\mathrm{\nabla}{\theta}_{\lambda}{({x}^{k})}^{T}{d}^{k}.

Step 5. Update: set {x}^{k+1}:={x}^{k}+{t}_{k}{d}^{k}, where {t}_{k}={\beta}^{m}, k:=k+1, and go to Step 2.

Since \hat{\mathrm{\Delta}}({x}^{k})\subseteq \tilde{\mathrm{\Delta}}({x}^{k}) at each {x}^{k}, as mentioned above, the global convergence of Algorithm 2 follows directly from Theorem 9. We state the following theorem without proof.

**Theorem 10** *Every accumulation point of a sequence* \{{x}^{k}\} *generated by Algorithm* 2 *is a solution of* NCP(F).

In the following, we focus our attention on the superlinear convergence rate of Algorithm 2. To begin with, we assume that the sequence \{{x}^{k}\} generated by Algorithm 2 has a unique limit point {x}^{\ast}.

**Lemma 11** *We have* \parallel {x}^{k}+{d}^{k}-{x}^{\ast}\parallel =o(\parallel {x}^{k}-{x}^{\ast}\parallel ).

*Proof* For each *k*, we have

where {\hat{H}}_{k}\in \hat{\mathcal{H}}({x}^{k}) is the matrix corresponding to {\hat{G}}_{k}. Since {\mathrm{\Phi}}_{\lambda} is semismooth [[6], Lemma 2.2] and, by Theorem 6, \hat{\mathcal{H}}({x}^{k})\subseteq {\partial}_{B}{\mathrm{\Phi}}_{\lambda}({x}^{k}) for each *k*, we have

(see the proof of [[9], Theorem 3.1]). Moreover, by the definition of {\mathrm{\Lambda}}_{i} and (6), we have

Consequently, it follows that

Since \sigma ({\theta}_{\lambda}({x}^{k}))\to 0 and \{\parallel {\hat{G}}_{k}^{-1}\parallel \} is bounded (see the proof of [[8], Lemma 4.2]), we obtain the desired result. □

Now, we prove the superlinear convergence of Algorithm 2.

**Theorem 12** *Algorithm* 2 *has a superlinear rate of convergence*.

*Proof* We have {x}^{k+1}={x}^{k}+{d}^{k} for all *k* sufficiently large (see the proof of [[8], Lemma 4.6]). It then follows from Lemma 11 that \parallel {x}^{k+1}-{x}^{\ast}\parallel =o(\parallel {x}^{k}-{x}^{\ast}\parallel ), i.e., \{{x}^{k}\} converges to {x}^{\ast} superlinearly.

The proof is complete. □

## 5 Numerical results

In this section, we do some preliminary numerical experiments to test Algorithm 2 and compare its performance with that of the algorithms proposed in Chen and Pan [2] and Sun and Zeng [10].

First, we set \beta =0.5, \rho ={10}^{-4}, \epsilon =0.05 and \sigma (t)=0.1min\{1,t\}.

For z\in Z(x), we define

The stopping criterion for Algorithm 2 is {\theta}_{\lambda}({x}^{k})\le {10}^{-8}. The programs are coded in MATLAB and run on a personal computer with a 2.1 GHz processor.

The meanings of the columns in the tables are as follows:

- iter: the total number of iterations;
- resi: the final value of {\theta}_{\lambda}(x).

**Problem 1** Let F(x)=Ax+q, where

The corresponding complementarity problem has a unique solution. Table 1 lists the test results for Problem 1 with different *n*, *λ* and initial points a={(-5,\dots ,-5)}^{T}, b={(0,\dots ,0)}^{T}, c={(1,\dots ,1)}^{T}, d={(8,\dots ,8)}^{T}.

From Table 1, we see that the test results for \lambda \in (0,1) are better than those for the other cases; in particular, good numerical results are obtained when *λ* is close to 0. We then compare with the method of Chen and Pan [2], where we set p=2, \epsilon =1.0\mathrm{e}-08, \sigma =1.0\mathrm{e}-10, \beta =0.2 for convenience. Table 2 lists the test results for [2].

Tables 1 and 2 indicate that Algorithm 2 performed much better than Chen and Pan [2] did on Problem 1.

**Problem 2** Free boundary problems can also be solved by the presented method. The following problem arises from the discretization of a free boundary problem (see [10]). Let \mathrm{\Omega}=(0,1)\times (0,1) and let *g* be a function satisfying g(x,0)=x(1-x) and g(x,y)=0 on x=0,1 or y=1.

Consider the following problem: find *u* such that

where f(u,x,y) is a continuously differentiable {P}_{0}-function. We discretize the problem by the five-point difference scheme with mesh step *h*, which yields the following complementarity problem: find x\in {\mathcal{R}}^{n} such that

Set the initial point as {x}^{0}={(0,\dots ,0)}^{T}. Table 3 lists the test results with different functions *f*, *λ*, and *h*.

From Table 3, we have the following observations.

• Our test results become better as *λ* decreases; the results for \lambda =2 are comparatively poor. That is to say, Algorithm 2 with the NCP-function {\varphi}_{\lambda} (0<\lambda <2) performs better than the method discussed in [8], where the Fischer-Burmeister function was used.

• The test results are good whether f(u,x,y) is linear or nonlinear, and they are especially good when *λ* is close to 0.

We compare the test results with Sun and Zeng [10] where we set \beta =0.5, c={0.5}^{4}. Table 4 lists the test results for [10] with different functions *f* and *h*.

Tables 3 and 4 indicate that Algorithm 2 performed as well as Sun and Zeng [10] did on Problem 2.

**Problem 3** We implemented Algorithm 2 for some test problems with all available starting points in MCPLIB [11]. The results are reported in Table 5 with seconds for unit of time.

The above examples indicate that the results are better when *λ* is close to 0. A reasonable interpretation is that the values of {g}_{i}(x,z) and {h}_{i}(x,z) become smaller as *λ* increases, which causes some difficulty for Algorithm 2. This also implies that the performance of Algorithm 2 will deteriorate as *p* increases. As \lambda \to 0, the reformulation reduces to the equation min\{x,F(x)\}=0; but this is a nonsmooth equation, so the method cannot be applied at \lambda =0 itself.
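This limiting behavior can be checked numerically: as \lambda \to 0, {\varphi}_{\lambda}(a,b)\to |a-b|-a-b=-2min\{a,b\} (a small sketch; the sample point (0.3, 1.2) is arbitrary):

```python
import math

def phi_lam(a, b, lam):
    # Kanzow-Kleinmichel NCP-function, lam in (0,4)
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b

a, b = 0.3, 1.2
for lam in [1.0, 0.1, 0.001]:
    # the gap to -2*min(a, b) shrinks as lam -> 0
    print(lam, phi_lam(a, b, lam), -2 * min(a, b))
```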

## 6 Concluding remarks

In this paper, we have studied a class of one-parametric NCP-functions {\varphi}_{\lambda}(\cdot ,\cdot ), which includes the well-known Fischer-Burmeister function as a special case, and proposed modified Newton-type algorithms for solving {P}_{0} complementarity problems.

Numerical results for the test problems have shown that this method is promising when \lambda \in (0,4). Moreover, our numerical results indicated that the performance of the modified Newton-type method becomes better when *λ* decreases, which is a new and important numerical result. We believe that Algorithm 2 can effectively solve more practical problems if they can be reformulated as an NCP(F). We leave this as a future research topic.

## References

1. Chen J-S: **On some NCP-functions based on the generalized Fischer-Burmeister function.** *Asia-Pac. J. Oper. Res.* 2007, 24: 401–420. 10.1142/S0217595907001292
2. Chen J-S, Pan S: **A family of NCP functions and a descent method for the nonlinear complementarity problem.** *Comput. Optim. Appl.* 2008, 40: 389–404. 10.1007/s10589-007-9086-0
3. Fischer A: **A special Newton-type optimization method.** *Optimization* 1992, 24: 269–284. 10.1080/02331939208843795
4. Hu SL, Huang ZH, Chen J-S: **Properties of a family of generalized NCP-functions and a derivative free algorithm for complementarity problems.** *J. Comput. Appl. Math.* 2009, 230(1): 69–82. 10.1016/j.cam.2008.10.056
5. Hu SL, Huang ZH, Lu N: **Smoothness of a class of merit functions for the second-order cone complementarity problem.** *Pac. J. Optim.* 2010, 6(3): 551–571.
6. Kanzow C, Kleinmichel H: **A new class of semismooth Newton-type methods for nonlinear complementarity problems.** *Comput. Optim. Appl.* 1998, 11: 227–251. 10.1023/A:1026424918464
7. Lu LY, Huang ZH, Hu SL: **Properties of a family of merit functions and a derivative-free method for the NCP.** *Appl. Math. J. Chin. Univ. Ser. A* 2010, 25(4): 379–390. 10.1007/s11766-010-2179-z
8. Yamashita N, Fukushima M: **Modified Newton methods for solving a semismooth reformulation of monotone complementarity problems.** *Math. Program.* 1997, 76: 469–491.
9. Qi L: **Convergence analysis of some algorithms for solving nonsmooth equations.** *Math. Oper. Res.* 1993, 18: 227–244. 10.1287/moor.18.1.227
10. Sun Z, Zeng JP: **A monotone semismooth Newton type method for a class of complementarity problem.** *J. Comput. Appl. Math.* 2011, 235: 1261–1274. 10.1016/j.cam.2010.08.012
11. Billups SC, Dirkse SP, Soares MC: **A comparison of algorithms for large scale mixed complementarity problems.** *Comput. Optim. Appl.* 1997, 7: 3–25. 10.1023/A:1008632215341

## Acknowledgements

This work was supported by the NSFC (50975200).


## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

WX participated in the design of the algorithm. ZD performed the numerical experiment and statistical analysis. All authors read and approved the final manuscript.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Xie, W., Deng, Z. Modified Newton-type methods for the NCP by using a class of one-parametric NCP-functions. *J Inequal Appl* **2012**, 286 (2012). https://doi.org/10.1186/1029-242X-2012-286


DOI: https://doi.org/10.1186/1029-242X-2012-286

### Keywords

- nonlinear complementarity problem
- NCP-function
- generalized Newton method
- global convergence
- superlinear convergence