
A new filter QP-free method for the nonlinear inequality constrained optimization problem

Abstract

In this paper, a filter QP-free infeasible method with a nonmonotone line search is proposed for minimizing a smooth function subject to smooth inequality constraints. The method is based on the solution of nonsmooth equations, obtained from the Lagrangian multiplier method and a nonlinear complementarity problem (NCP) function applied to the Karush–Kuhn–Tucker optimality conditions. In particular, each iteration of the method can be viewed as a perturbation of a Newton or quasi-Newton iteration on both the primal and dual variables for the solution of the Karush–Kuhn–Tucker optimality conditions. Moreover, the NCP function is used in the filter, which allows the proposed algorithm to avoid the incompatibility that may arise in filter SQP methods. Global convergence of the proposed method is established, and under some mild conditions a superlinear convergence rate is obtained. Finally, some preliminary numerical results illustrate that the proposed filter QP-free infeasible method is quite promising.

1 Introduction

In this paper, we consider the nonlinear optimization problem (NLP) with inequality constraints, where the objective function and the constraint functions are Lipschitz continuously differentiable. We introduce the Lagrangian function associated with this problem, from which the Karush–Kuhn–Tucker (KKT) optimality conditions for the problem are obtained.

It is well known that the KKT optimality conditions form a mixed nonlinear complementarity problem (NCP). The NCP has attracted much attention due to its various applications [1–3], such as the economic equilibrium problem, the restructuring of electricity and gas markets, and so on. Many efficient methods exist for solving the NCP; see [4–7]. One popular approach is to construct a Newton method for the related nonlinear equations obtained by reformulating the KKT optimality conditions. Another approach is to use a filter method to solve the NLP with inequality constraints directly. Recently Pu, Li, and Xue [8] proposed a new quadratic programming (QP)-free infeasible method for minimizing a smooth function subject to inequality constraints. Their method is based on the solution of nonsmooth equations obtained from the multiplier and the Fischer–Burmeister NCP function for the KKT conditions, and it was shown to have a superlinear convergence rate under some mild conditions.

Fletcher and Leyffer [9] proposed a filter method for the NLP as an alternative to the traditional merit-function approach. A trial point generated by solving a sequence of trust-region QP subproblems is accepted provided there is a sufficient decrease in either the objective function or the constraint violation function. The computational results reported in [9, 10] are very encouraging. For related methods, one may refer to [11–16].

Stimulated by the progress in these two directions, in this paper we propose a nonmonotone filter QP-free infeasible method for minimizing a smooth function subject to smooth inequality constraints. The proposed iterative method is based on the solution of nonsmooth equations obtained from the multiplier and some NCP functions for the KKT first-order optimality conditions. Each iteration can be viewed as a perturbation of a Newton or quasi-Newton iteration on both the primal and dual variables for the solution of the KKT optimality conditions. Specifically, we use the filter within the line search, with a nonmonotone acceptance mechanism [17, 18]. Moreover, we also use the NCP function in the filter, so our algorithm avoids the incompatibility that may appear in filter SQP algorithms. We establish the global convergence and the superlinear convergence rate of the proposed method under some mild conditions. Finally, we report numerical tests illustrating the effectiveness of the proposed filter QP-free infeasible method.

The rest of this paper is organized as follows. In Sect. 2, we give some preliminaries and the formulation of the problem, and then propose an infeasible filter QP-free method. In Sect. 3, we show that the proposed method is well defined and establish its global convergence and superlinear convergence rate under some mild conditions. Numerical tests are reported in Sect. 4. Finally, we give some brief conclusions in Sect. 5.

2 Preliminaries and algorithm

In this section, we first introduce the formulation of the problem and give some preliminaries needed to construct the new filter QP-free method. We then present the structure of the proposed method in detail.

In this paper, we consider the nonlinear optimization problem (NLP) with inequality constraints, which can be formulated as

$$ \min f(x), \quad \text{s.t.}\quad x\in D=\bigl\{ x\in R^{n}| G(x)\le0\bigr\} , $$
(1)

where \(f:R^{n}\to R\) and \(G=(g_{1},g_{2}, \ldots, g_{m})^{T}:R^{n}\to R^{m}\) are Lipschitz continuously differentiable functions.

The Lagrangian function associated with problem (1) is

$$L( x, \mu)=f(x)+\mu^{T}G(x), $$

where \(\mu= (\mu_{1},\mu_{2},\ldots,\mu_{m})^{T}\in R^{m}\) is the multiplier vector. For simplicity, we use \((x,\mu)\) to denote the column vector \((x^{T},\mu^{T})^{T}\).

Then we can obtain a KKT point \((\bar{x}, \bar{\mu})\in R^{n}\times R^{m}\) for problem (1), which satisfies the necessary optimality conditions:

$$ \nabla_{x}L(\bar{x}, \bar{\mu})=0, \qquad G(\bar{x})\le0, \qquad \bar{\mu}\ge0,\qquad \bar{\mu}_{i}g_{i}(\bar{x})=0, $$
(2)

where \(1\le i\le m\). We also say that \(\bar{x}\in D\) is a KKT point of problem (1) if there exists \(\bar{\mu}\in R^{m}\) such that \((\bar{x}, \bar{\mu})\) satisfies (2). It is well known that the KKT optimality conditions form a mixed NCP, and (2) can be reformulated as the following nonlinear equation:

$$ \Phi(x, \mu)=0. $$

2.1 Preliminaries

In this subsection, we give the definition of the Fischer–Burmeister NCP function and its (generalized) Jacobian in the different cases. Both theoretical results and computational experience indicate that nonsmooth methods based on the Fischer–Burmeister NCP function are efficient. The Fischer–Burmeister function has a very simple structure; it is defined as

$$\psi(a, b)=\sqrt{a^{2}+b^{2}}-a-b. $$

It is clear that the function ψ is continuously differentiable everywhere except at the origin, where it is strongly semismooth. If \(a\neq0\) or \(b\neq0\), then ψ is continuously differentiable at \((a, b)\in R^{2}\), and

$$\nabla\psi(a,b)= \biggl(\frac{a}{\sqrt{a^{2}+b^{2}}}-1, \frac{b}{\sqrt {a^{2}+b^{2}}}-1 \biggr); $$

if \(a=0\) and \(b=0\), then the generalized Jacobian of ψ at \((0,0)\) is (see [14])

$$\partial\psi(0,0)=\bigl\{ (\xi-1,\eta-1)| \xi^{2}+\eta^{2}=1 \bigr\} . $$
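Since the whole method is built from ψ and its (generalized) gradient, a small numerical illustration may be helpful. The following Python sketch (not part of the original paper) evaluates ψ and returns its gradient in the smooth case, or one element of \(\partial\psi(0,0)\) in the degenerate case; the function names are ours.

```python
import numpy as np

def fischer_burmeister(a, b):
    """Fischer-Burmeister NCP function: psi(a, b) = sqrt(a^2 + b^2) - a - b."""
    return np.sqrt(a * a + b * b) - a - b

def fischer_burmeister_grad(a, b):
    """Gradient of psi away from the origin; at (0, 0) one element of the
    generalized Jacobian {(xi - 1, eta - 1) : xi^2 + eta^2 = 1} is returned
    (here the element with xi = eta = sqrt(2)/2)."""
    r = np.sqrt(a * a + b * b)
    if r > 0.0:
        return np.array([a / r - 1.0, b / r - 1.0])
    s = np.sqrt(2.0) / 2.0
    return np.array([s - 1.0, s - 1.0])
```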

Let \(\phi_{i}(x, \mu)=\psi(-g_{i}(x), \mu_{i})\), \(1\le i\le m\). Given the above formulation of problem (1), we can denote \(\Phi (x, \mu)=((\nabla_{x}L(x, \mu))^{T}, (\Phi_{1}(x, \mu))^{T})^{T}\), where \(\Phi_{1}(x, \mu)=( \phi_{1}(x, \mu), \ldots, \phi_{m}(x, \mu) )^{T}\).

Clearly, the KKT optimality conditions (2) can be equivalently reformulated as the nonsmooth equations \(\Phi(x, \mu)=0\).

If \((g_{i}(x), \mu_{i})\neq(0,0)\), then \(\phi_{i}\) is continuously differentiable at \((x, \mu)\in R^{n+m}\). In this case, we have

$$\nabla_{x}\phi_{i}= \biggl(\frac{-g_{i}(x)}{\sqrt{(g_{i}(x))^{2}+\mu_{i}^{2}}}+1 \biggr) \nabla g_{i}(x); \qquad \nabla_{\mu}\phi_{i}= \biggl(\frac{\mu_{i}}{\sqrt{(g_{i}(x))^{2}+\mu_{i}^{2}}}-1 \biggr)e_{i}, $$

where \(e_{i}=(0,\ldots, 0,1,0,\ldots, 0)^{T}\in R^{m}\) is the ith column of the identity matrix, whose ith element is 1 and all other elements are 0.

If \(g_{i}(x)=0\) and \(\mu_{i}=0\), \(1\le i\le m\), then \(\phi_{i}(x, \mu)\) is strongly semismooth and directionally differentiable at \((x, \mu)\). We have

$$\partial_{x}\phi_{i}(x, \mu)=\bigl\{ (\xi+1)\nabla g_{i}(x)|-1\le\xi\le1\bigr\} $$

and

$$\partial_{\mu_{i}}\phi_{i}(x, \mu)=\bigl\{ (\xi-1)|-1\le\xi\le1 \bigr\} . $$

We may reformulate the KKT conditions at the point \((\bar{x}, \bar{\mu})\) as a system of equations:

$$\Phi(\bar{x}, \bar{\mu})=\bigl(\nabla_{x}L(\bar{x}, \bar{\mu}), \Phi_{1}(\bar{x},\bar{\mu})\bigr)= 0, $$

where \(\mu=(\mu_{1}, \mu_{2},\ldots,\mu_{m})^{T}\in R^{m}\) is the multiplier vector, \(\phi_{j}(x,\mu_{j})=\psi(-g_{j}(x), \mu_{j})\), and \(\Phi_{1}(\bar{x},\bar{\mu})=(\phi_{1}(\bar{x},\bar{\mu}_{1}), \phi_{2}(\bar{x},\bar{\mu}_{2}),\ldots ,\phi_{m}(\bar{x},\bar{\mu}_{m}))^{T}\). To replace the constraint violation function \(p(G(x))\) in the filter F of the Fletcher and Leyffer method [9], we use the constraint violation function \(p(G(x),\mu)=\|\Phi_{1}(x,\mu)\|\).
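The violation measure \(p(G(x),\mu)=\|\Phi_{1}(x,\mu)\|\) is straightforward to evaluate; a minimal sketch follows, where the constraint-function handle G, the point x, and the multiplier vector mu are illustrative names rather than quantities fixed by the paper.

```python
import numpy as np

def fb(a, b):
    """Componentwise Fischer-Burmeister function psi(a, b)."""
    return np.sqrt(a * a + b * b) - a - b

def constraint_violation(G, x, mu):
    """Violation measure p(G(x), mu) = ||Phi_1(x, mu)||, where
    Phi_1(x, mu) = (psi(-g_1(x), mu_1), ..., psi(-g_m(x), mu_m))^T."""
    g = np.asarray(G(x), dtype=float)            # G(x) in R^m
    phi1 = fb(-g, np.asarray(mu, dtype=float))   # componentwise FB values
    return np.linalg.norm(phi1)
```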

2.2 Algorithm

In this subsection, we give the framework of the filter QP-free method for solving problem (1). We first give some closed-form quantities needed by the method.

If \((g_{j}^{k}, \mu^{k}_{j})\neq(0,0)\), let \(\xi_{j}^{k}=\xi_{j}(x^{k}, \mu^{k})=\frac{-g_{j}^{k}}{\sqrt{(g_{j}^{k})^{2}+(\mu^{k}_{j})^{2}}}+1\) and \(\eta_{j}^{k}=\eta_{j}(x^{k}, \mu^{k})=\frac{\mu^{k}_{j}}{\sqrt{(g_{j}^{k})^{2}+(\mu^{k}_{j})^{2}}}-1\); otherwise we set \(\xi_{j}^{k}=\xi_{j}(x^{k}, \mu^{k})=1+\sqrt{2}/2\) and \(\eta_{j}^{k}=\eta_{j}(x^{k}, \mu^{k})=-1+\sqrt{2}/2\). Then let

$$ V^{k}= \begin{pmatrix} V_{11}^{k} & V_{12}^{k} \\ V_{21}^{k} & V_{22}^{k} \end{pmatrix} = \begin{pmatrix} H^{k} & \nabla G^{k} \\ \operatorname{diag}(\xi^{k})(\nabla G^{k})^{T} & \operatorname{diag}(\eta^{k}-c^{k}) \end{pmatrix}, $$
(3)

where \(H^{k}\) is a positive definite matrix, which may be updated by the BFGS formula. Here \(\operatorname{diag}(\xi^{k})\) and \(\operatorname{diag}(\eta^{k}-c^{k})\) denote the diagonal matrices whose jth diagonal elements are \(\xi_{j}^{k}\) and \(\eta_{j}^{k}-c_{j}^{k}\), respectively, and

$$c_{j}^{k} = c \min\bigl\{ 1, \bigl\Vert \Phi^{k} \bigr\Vert ^{\nu}\bigr\} , $$

where \(c>0\) and \(\nu>1\) are given parameters.
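To make the construction of \(V^{k}\) in (3) concrete, here is a rough sketch that also solves the linear system with right-hand side \((-\nabla f^{k}, 0)\) appearing in (9) below for the first search direction. The componentwise treatment of the degenerate case and the scalar \(c_{j}^{k}\) follow the definitions above, but the function and argument names are ours, and the complete step computation of Algorithm 1 is not reproduced here.

```python
import numpy as np

def build_Vk(H, gradG, g, mu, Phi_norm, c=0.1, nu=2.0):
    """Assemble V^k from (3).  gradG is the n x m matrix whose jth column is the
    gradient of g_j at x^k; g and mu hold the constraint values and multipliers;
    Phi_norm is ||Phi^k||, used in c_j^k = c * min{1, ||Phi^k||^nu}."""
    g = np.asarray(g, dtype=float)
    mu = np.asarray(mu, dtype=float)
    gradG = np.asarray(gradG, dtype=float)
    m = len(g)
    r = np.sqrt(g**2 + mu**2)
    safe_r = np.where(r > 0, r, 1.0)                       # avoid division by zero
    xi = np.where(r > 0, -g / safe_r + 1.0, 1.0 + np.sqrt(2.0) / 2.0)
    eta = np.where(r > 0, mu / safe_r - 1.0, -1.0 + np.sqrt(2.0) / 2.0)
    ck = c * min(1.0, Phi_norm**nu) * np.ones(m)
    top = np.hstack([H, gradG])
    bottom = np.hstack([np.diag(xi) @ gradG.T, np.diag(eta - ck)])
    return np.vstack([top, bottom])

def first_direction(Vk, grad_f, m):
    """Solve V^k (d; lambda) = (-grad f^k; 0), cf. (9), for the first direction."""
    n = Vk.shape[0] - m
    rhs = np.concatenate([-np.asarray(grad_f, dtype=float), np.zeros(m)])
    sol = np.linalg.solve(Vk, rhs)
    return sol[:n], sol[n:]                                # d^{k0}, lambda^{k0}
```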

Secondly, we introduce the nonmonotone sequences used to structure our method. We assume that the stored values \(\hat{\Phi}^{kj}\) and \(\hat{F}^{kj}\) are sorted in decreasing order, that is, \(\hat{F}^{k1}\ge\hat{F}^{k2}\ge\hat{F}^{k3}\ge\cdots\ge\hat{F}^{kl}\) and \(\hat{\Phi}^{k1}\ge\hat{\Phi}^{k2}\ge\hat{\Phi}^{k3}\ge\cdots\ge\hat{\Phi}^{kl}\). Let

$$ \bar{\Phi}^{k}= \textstyle\begin{cases} \{ \Vert \Phi^{k} \Vert , \hat{\Phi}^{k2} , \hat{\Phi}^{k3},\ldots, \hat{\Phi}^{kl}\},&\mbox{if } \Vert \Phi^{k} \Vert < \hat{\Phi}^{k1} \mbox{ and } \Vert \Phi^{k} \Vert >0,\\ \{\hat{\Phi}^{k1}, \hat{\Phi}^{k2} , \hat{\Phi}^{k3},\ldots, \hat{\Phi}^{kl}\}, & \mbox{if } \Vert \Phi^{k} \Vert \ge\hat{\Phi}^{k1} \mbox{ or } \Vert \Phi^{k} \Vert =0; \end{cases} $$
(4)

and

$$ \bar{F}^{k}= \textstyle\begin{cases} \{F^{k}, \hat{F}^{k2} , \hat{F}^{k3},\ldots, \hat{F}^{kl}\}, & \mbox{if }F^{k}< \hat{F}^{k1},\\ \{\hat{F}^{k1}, \hat{F}^{k2} , \hat{F}^{k3},\ldots, \hat{F}^{kl}\}, &\mbox{if } F^{k}\ge\hat{F}^{k1}. \end{cases} $$
(5)

We denote the maximal elements in \(\bar{\Phi}^{k}\), \(\bar{F}^{k}\) by \(p^{k}_{\max}\), \(\bar{F}^{k}_{\max}\), respectively.
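A minimal sketch of the memory update behind (4) and (5); the list-based representation and the function name are ours.

```python
def update_memory(memory, new_value, require_positive=False):
    """Nonmonotone memory update in the spirit of (4)-(5): the stored values are
    kept in decreasing order, and the new value replaces the current largest
    entry only if it is strictly smaller (and, for the Phi-memory (4), positive);
    otherwise the memory is unchanged.  Returns the updated memory and its
    maximum, i.e. p^k_max for (4) or F^k_max for (5)."""
    mem = sorted(memory, reverse=True)
    if new_value < mem[0] and (not require_positive or new_value > 0.0):
        mem[0] = new_value
        mem.sort(reverse=True)
    return mem, mem[0]
```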

Based on the above given information, we now give the framework of the nonmonotone filter QP-free infeasible method (NFQPIM) for minimizing a smooth function subject to smooth inequality constraints as follows in Algorithm 1.

Algorithm 1 (NFQPIM)

Remark 1

Let \(\Phi(x, \mu)=((\nabla_{x}L)^{T}, (H(x))^{T},(\Phi_{1}(x, \mu))^{T})^{T}\). Then the above proposed NFQPIM can also be used to solve the following constrained NLP:

$$\begin{aligned} &\min\mbox{ }f(x) \\ &\text{s.t.}\quad G(x)\le0, \qquad H(x)=0, \quad x\in R^{n}, \end{aligned}$$

where \(f:R^{n}\to R\), \(G(x)=(g_{1}(x),g_{2}(x), \ldots, g_{m}(x))^{T}:R^{n}\to R^{m}\), and \(H(x)=(h_{1}(x),h_{2}(x), \ldots, h_{p}(x))^{T}:R^{n}\to R^{p}\) are Lipschitz continuously differentiable functions.

2.3 Implementation

In this subsection, we give the implementation of the proposed NFQPIM. Firstly, we suppose that the following assumptions A1–A3 hold.

A1.:

The level set \(\{x|f(x)\le f(x^{0})\}\) is bounded, and for sufficiently large k, \(\|\mu^{k}+\lambda^{k0}+\lambda^{k1}\|< \bar{\mu}\).

A2.:

f and \(g_{i}\) are Lipschitz continuously differentiable, and for all y, \(z\in R^{n+m}\),

$$\bigl\Vert \nabla L(y)-\nabla L(z) \bigr\Vert \le m_{0} \Vert y-z \Vert , \qquad \bigl\Vert \Phi(y)-\Phi(z) \bigr\Vert \le m_{0} \Vert y-z \Vert , $$

where \(m_{0}>0\) is the Lipschitz constant.

A3.:

\(H^{k}\) is positive definite and there exist positive numbers \(m_{1}\) and \(m_{2}\) such that \(m_{1}\|d\|^{2}\le d^{T}H^{k}d\le m_{2}\|d\|^{2}\) for all \(d\in R^{n}\) and all k.

Lemma 1

If \(\Phi^{k} \neq0\), then \(V^{k} \) is nonsingular.

Proof

Assume \(\Phi^{k} \neq0\). If \(V^{k}(u, v)=0\) for some \((u,v)\in R^{n+m}\), where \(u=(u_{1},\ldots, u_{n})^{T}\), \(v=(v_{1},\ldots, v_{m})^{T}\), then

$$ H^{k}u+\nabla G^{k}v=0 $$
(6)

and

$$ \operatorname{diag}\bigl(\xi^{k}\bigr) \bigl(\nabla G^{k} \bigr)^{T}u+\operatorname{diag}\bigl(\eta ^{k}-c^{k} \bigr)v=0. $$
(7)

From the definitions of \(\xi_{j}^{k}\) and \(\eta_{j}^{k}\), we know that \(\xi_{j}^{k}\ge0\) and \(\eta_{j}^{k} -c_{j}^{k}\neq0\) for all j. So \(\operatorname{diag}(\eta^{k}-c^{k})\) is nonsingular. We have

$$ v=-\bigl(\operatorname{diag}\bigl(\eta^{k}-c^{k}\bigr) \bigr)^{-1}\operatorname{diag}\bigl(\xi ^{k}\bigr) \bigl( \nabla G^{k}\bigr)^{T}u . $$
(8)

Multiplying (6) by \(u^{T}\) and substituting (8), we have

$$u^{T}\bigl(H^{k}u+\nabla G^{k}v\bigr) =u^{T}H^{k}u- u^{T}\nabla G^{k} \operatorname{diag}\bigl(\xi ^{k}\bigr) \bigl(\operatorname{diag}\bigl( \eta^{k}-c^{k}\bigr)\bigr)^{-1}\bigl(\nabla G^{k}\bigr)^{T}u=0. $$

The fact that \(-\nabla G^{k}\operatorname{diag}(\xi^{k})(\operatorname {diag}(\eta^{k}-c^{k}))^{-1}(\nabla G^{k})^{T}\) is positive semidefinite implies \(u=0\), and then \(v=0\) by (8). Hence \(V^{k}\) is nonsingular, and the lemma holds. □

Lemma 2

\(d^{k0}=0\) if and only if \(\nabla f^{k}=0\), and \(d^{k0}=0\) implies \(\bar{\lambda}^{k0}=0\) and \(\lambda^{k0}=0\).

If \((x^{*}, \mu^{*})\) is an accumulation point of \(\{( x^{k},\mu^{k})\}\), then \(d^{*0}=0\), and \((d^{*0},\lambda^{*0})^{T}\) is the solution of the following equations:

$$ V \begin{pmatrix} d \\ \lambda \end{pmatrix} = \begin{pmatrix} -\nabla f \\ 0 \end{pmatrix} $$
(9)

and \(\nabla L(x^{*},\mu^{*})=0\).

It is clear that the following lemma holds, with reference to [8].

Lemma 3

If \(d^{k0}\neq0\), then

$$\bigl(d^{k0}\bigr)^{T}H^{k}d^{k0}\le- \bigl(d^{k0}\bigr)^{T}\nabla f^{k}. $$

Proof

The linear system defining \((d^{k0},\lambda^{k0})\) (cf. (9)) implies

$$ H^{k}d^{k0}+\nabla G^{k}\lambda^{k0}=- \nabla f^{k}, $$
(10)

and

$$ \operatorname{diag}\bigl(\xi^{k}\bigr) \bigl(\nabla G^{k} \bigr)^{T}d^{k0}+\operatorname{diag}\bigl(\eta^{k}- c^{k}\bigr)\hat{\lambda}^{k0}=0. $$
(11)

We have

$$ \hat{\lambda}^{k0}=-\bigl(\operatorname{diag}\bigl(\eta^{k}- c^{k}\bigr)\bigr)^{-1}\operatorname{diag}\bigl( \xi^{k}\bigr) \bigl(\nabla G^{k}\bigr)^{T}d^{k0} . $$
(12)

Substituting (12) into (10), we have

$$\begin{aligned} &\bigl(d^{k0}\bigr)^{T}\bigl(H^{k}d^{k0}+ \nabla G^{k}\lambda^{k0}\bigr) \\ &\quad =\bigl(d^{k0}\bigr)^{T}H^{k}d^{k0}- \bigl(d^{k0}\bigr)^{T}\nabla G^{k} \operatorname{diag}\bigl(\xi ^{k}\bigr) \bigl(\operatorname{diag}\bigl( \eta^{k}- c^{k}\bigr)\bigr)^{-1}\bigl(\nabla G^{k}\bigr)^{T}d^{k0} \\ &\quad =-\bigl(d^{k0}\bigr)^{T}\nabla f^{k}. \end{aligned}$$
(13)

\((d^{k0})^{T}\nabla G(x^{k})\operatorname{diag}(\xi^{k})(\operatorname {diag}(\eta^{k}-c^{k}))^{-1}(\nabla G^{k})^{T}d^{k0}\le0\) implies

$$ \bigl(d^{k0}\bigr)^{T}H^{k}d^{k0}\le- \bigl(d^{k0}\bigr)^{T}\nabla f^{k}. $$
(14)

The lemma holds. □

Lemma 4

There exists \(m_{3}>0\) such that, for any \(0< t\le1\),

$$\bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k0}, \mu^{k}+t\lambda^{k0}\bigr) \bigr\Vert ^{2}- \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2} \le m_{3} t^{2}. $$

Proof

If \(\Phi_{1}^{k}=0\), then by A2, for any \(0< t\le1\),

$$\begin{aligned} \bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k0}, \mu^{k}+t\lambda^{k0}\bigr) \bigr\Vert ^{2} &= \bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k0}, \mu^{k}+t\lambda^{k0}\bigr)-\Phi_{1}^{k} \bigr\Vert ^{2} \le t^{2}m_{0}^{2} \bigl\Vert \bigl(d^{k0},\lambda^{k0}\bigr) \bigr\Vert ^{2}. \end{aligned}$$

Thus the lemma holds for \(\Phi_{1}^{k}=0\). Now consider the case \(\Phi_{1}^{k}\neq0\).

We define that if \((g_{i}^{k},\mu_{i}^{k})\neq(0,0)\), then \((\bar{\xi}_{i}^{k0},\bar{\eta}_{i}^{k0})=(\xi_{i}^{k},\eta_{i}^{k})\); otherwise \(\bar{\xi}_{i}^{k0}(\nabla g_{i}^{k})^{T}d^{k0}+\bar{\eta}_{i}^{k0} \lambda_{i}^{k0}= \phi_{i}'((x^{k},\mu^{k}), (d^{k0}, \lambda^{k0}))\), where \(\phi_{i}'((x^{k},\mu^{k}), (d^{k0}, \lambda^{k0}))\) is the directional derivative of \(\phi_{i}(x,\mu )\) at \((x^{k},\mu^{k})\) in the direction \((d^{k0}, \lambda^{k0})\).

Let \(\operatorname{diag}(\bar{\xi}^{k0})\) and \(\operatorname{diag}(\bar{\eta}^{k0})\) denote the diagonal matrices whose jth diagonal elements are \(\bar{\xi}_{j}^{k0}\) and \(\bar{\eta}_{j}^{k0}\), respectively. Then, since \(\phi_{i}^{k}=\psi(0,0)=0\) for the degenerate indices, we have

$$\bigl(\Phi_{1}^{k}\bigr)^{T}\bigl( \operatorname{diag}\bigl(\bar{\xi}^{k0}\bigr) \bigl(\nabla G^{k} \bigr)^{T}, \operatorname{diag}\bigl(\bar{\eta}^{k0}\bigr) \bigr)=\bigl(\Phi_{1}^{k}\bigr)^{T}\bigl( \operatorname{diag}\bigl(\xi^{k}\bigr) \bigl(\nabla G^{k} \bigr)^{T}, \operatorname{diag}\bigl(\eta^{k}\bigr)\bigr). $$

Then

$$\begin{aligned} & \bigl\Vert \Phi_{1}^{k}+t\bigl(\operatorname{diag} \bigl(\bar{\xi}^{k0}\bigr) \bigl(\nabla G^{k} \bigr)^{T}d^{k0}+ \operatorname{diag}\bigl(\bar{\eta}^{k0}\bigr)\lambda^{k0}\bigr) \bigr\Vert ^{2} \\ &\quad = \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}+t^{2} \bigl\Vert \operatorname{diag}\bigl(\bar{\xi}^{k0} \bigr) \bigl(\nabla G^{k}\bigr)^{T}d^{k0}+ \operatorname{diag}\bigl(\bar{\eta}^{k0}\bigr)\lambda^{k0} \bigr\Vert ^{2}. \end{aligned}$$
(15)

It is clear that

$$\bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k0}, \mu^{k}+t\lambda^{k0}\bigr) \bigr\Vert ^{2}= \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}+O \bigl(t^{2}\bigr). $$

This lemma holds. □

Lemma 5

If \(\Phi_{1}^{k}\neq0\), then given any \(\varepsilon>0\) there is \(\bar{t}>0\) such that, for any \(0< t\le\bar{t}\),

$$\bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}- \bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k1}, \mu^{k}+t\lambda^{k1}\bigr) \bigr\Vert ^{2} \ge(2- \varepsilon)t \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}. $$

Proof

If \(\Phi_{1}^{k}\neq0\), the linear system defining \((d^{k1},\lambda^{k1})\) in Algorithm 1 implies

$$ \operatorname{diag}\bigl(\xi^{k}\bigr) \bigl(\nabla G^{k} \bigr)^{T}d^{k1}+\operatorname{diag}\bigl( \eta^{k}-c^{k}\bigr)\lambda^{k1}=- \Phi_{1}^{k}. $$
(16)

We define that if \((g_{i}^{k},\mu_{i}^{k})\neq(0,0)\) then \((\bar{\xi}_{i}^{k1},\bar{\eta}_{i}^{k1})=(\xi_{i}^{k},\eta_{i}^{k})\); otherwise \(\bar{\xi}_{i}^{k1}(\nabla g_{i}^{k})^{T}d^{k1}+ \bar{\eta}_{i}^{k1}\lambda_{i}^{k1}= \phi_{i}'((x^{k},\mu^{k}), (d^{k1}, \lambda^{k1}))\), where \(\phi_{i}'((x^{k},\mu^{k}), (d^{k1}, \lambda^{k1}))\) is the directional derivative of \(\phi_{i}(x,\mu )\) at \((x^{k},\mu^{k})\) in the direction \((d^{k1}, \lambda^{k1})\). Let \(\operatorname{diag}(\bar{\xi}^{k1})\) and \(\operatorname{diag}(\bar{\eta}^{k1})\) denote the diagonal matrices whose ith diagonal elements are \(\bar{\xi}_{i}^{k1}\) and \(\bar{\eta}_{i}^{k1}\), respectively. Clearly, for all i,

$$ \phi_{i}\bigl(x^{k}+td^{k1},\mu^{k}+t \lambda^{k1}\bigr) -\phi_{i}^{k}\leq t\bigl(\bar{\xi}_{i}^{k1}\bigl(\nabla g_{i}^{k} \bigr)^{T}d^{k1}+\bar{\eta}_{i}^{k1}\lambda_{i}^{k1}\bigr). $$
(17)

Since \(c^{k}_{i}\neq0\), it follows by the definitions of \(c^{k}_{i}\) and \(\eta_{i}^{k}\) that \(\eta_{i}^{k}=0\), \(g_{i}^{k}=0\), \(\mu_{i}^{k}\ge0\), and \(\phi_{i}^{k}=0\). We have

$$\begin{aligned} & \bigl\Vert \Phi_{1}^{k}+t\bigl(\operatorname{diag} \bigl(\bar{\xi}^{k1}\bigr) \bigl(\nabla G^{k} \bigr)^{T}d^{k1}+ \operatorname{diag}\bigl(\bar{\eta}^{k1} \bigr)\lambda^{k1}\bigr) \bigr\Vert ^{2} \\ &\quad = (1-2t) \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}+ t^{2} \bigl\Vert \operatorname{diag}\bigl(\bar{\xi}^{k1} \bigr) \bigl(\nabla G^{k}\bigr)^{T}d^{k1} + \operatorname{diag}\bigl(\bar{\eta}^{k1}\bigr)\lambda^{k1} \bigr\Vert ^{2}. \end{aligned}$$
(18)

It follows from (17) and (18) that, given any \(\varepsilon>0\), there is \(\bar{t}>0\) such that, for any \(0< t\le\bar{t}\),

$$\bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}- \bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k1}, \mu^{k}+t \lambda^{k1}\bigr) \bigr\Vert ^{2}\ge(2- \varepsilon)t \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}. $$

Hence this lemma holds. □

Lemma 6

\(d^{k0}=0\) if and only if \(\nabla f^{k}=0\), and \(d^{k0}=0\) implies \(\bar{\lambda}^{k0}=0\) and \(\lambda^{k0}=0\).

Proof

If \(\nabla f^{k}=0\), then \((d^{k0},\bar{\lambda}^{k0})=(V^{k})^{-1}(0,0)=(0,0)\). If \(d^{k0}=0\), then the system defining \((d^{k0},\bar{\lambda}^{k0})\) (cf. (9)) implies

$$ \operatorname{diag}\bigl(\xi^{k}\bigr) \bigl(\nabla G^{k} \bigr)^{T}d^{k0}+\operatorname{diag}\bigl( \eta^{k}-c^{k}\bigr) \bar{\lambda}^{k0}= \operatorname{diag}\bigl(\eta^{k}-c^{k}\bigr)\bar{\lambda}^{k0}=0. $$
(19)

Clearly, \(\bar{\lambda}^{k0}=0\) and \(\lambda^{k0}=0\), and then the first block of the system gives \((-\nabla f^{k},0)=V^{k}(0,0)=(0,0)\), i.e., \(\nabla f^{k}=0\). □

From Lemmas 3–6, we know that, if \(\Phi_{1}^{k}\neq0\), then \((d^{k},\lambda^{k})\) is a descent direction for \(\|\Phi^{k}\|\); if \(d^{k0}\neq0\), then \(d^{k}\) is a descent direction for \(f^{k}\). If \(\Phi_{1}^{k}=0\) and \(d^{k0}=0\), then \((x^{k},\mu^{k})\) is a KKT point. We consider four cases for the line search.

Case 1. The \((k-1)\)th iteration has a Φ-step and \(\Phi_{1}^{k}=0\). In this case, \(p^{k}_{\max}=p^{k-1}_{\max}\) and \(\min\{p^{j}_{\max}\mid j\in F^{k}\}>0\). Clearly, we can find \(\alpha_{k}\) such that \(\hat{x}^{k+1}\) satisfies the Φ-step acceptance criterion of Algorithm 1.

Case 2. The \((k-1)\)th iteration has a Φ-step and \(\Phi_{1}^{k}\neq0\). In this case, it follows from Lemma 5 that, given any \(\varepsilon>0\), there is \(\bar{t}>0\) such that, for any \(0< t\le\bar{t}\),

$$\bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}- \bigl\Vert \Phi_{1}\bigl(x^{k}+td^{k1}, \mu^{k}+t\lambda^{k1}\bigr) \bigr\Vert ^{2} \ge(2- \varepsilon)t \bigl\Vert \Phi_{1}^{k} \bigr\Vert ^{2}. $$

Since \(p^{k}_{\max}>0\) is monotonically nonincreasing, we can find \(\alpha_{k}\) such that \(\hat{x}^{k+1}\) satisfies the Φ-step acceptance criterion of Algorithm 1.

Case 3. The \((k-1)\)th iteration has an f-step and \(d^{k0}\neq0\). In this case, it follows from Lemma 3 that

$$\bigl(d^{k0}\bigr)^{T}H^{k}d^{k0}\le- \bigl(d^{k0}\bigr)^{T}\nabla f^{k}, $$

and \(f^{k}_{\max}\) is monotonically nonincreasing. We can find \(\alpha_{k}\) such that \(\bar{x}^{k+1}\) satisfies the f-step acceptance criterion of Algorithm 1.

Case 4. The \((k-1)\)th iteration has an f-step and \(d^{k0}=0\). In this case, if \(\Phi_{1}^{k}=0\), then \((x^{k}, \mu^{k} )\) is a KKT point; otherwise \(x^{k}\) may be an infeasible stationary point.

If no such \(x^{k+1}\) and \(\mu^{k+1}\) exist, or \(\alpha_{k}\) becomes too small, we use a backtracking technique or the feasibility restoration phase to find \(x^{k+1}\) and \(\mu^{k+1}\) that are acceptable to the filter and for which the subproblem at \(x^{k+1}\) is compatible.
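The precise acceptance tests invoked in these four cases are part of the statement of Algorithm 1, which is given only as a figure. The sketch below assumes they take the same form as conditions (20) and (21) used later in Lemma 7, with the value \(\theta_{1}=0.8\) reported in Sect. 4; it is an illustration, not the authors' exact rule.

```python
def accept_phi_step(phi_new, phi_old, p_max, theta1=0.8):
    """Phi-step acceptance modeled on (20):
    ||Phi_1(x^{k+1}, mu^{k+1})|| <= theta1 * max{ ||Phi_1(x^k, mu^k)||, p^j_max }."""
    return phi_new <= theta1 * max(phi_old, p_max)

def accept_f_step(f_new, f_old, F_max, alpha_k, phi_new, theta1=0.8):
    """f-step acceptance modeled on (21):
    f(x^{k+1}) - max{ f^k, F^j_max } <= -alpha_k * theta1 * ||Phi_1^{k+1}||."""
    return f_new - max(f_old, F_max) <= -alpha_k * theta1 * phi_new
```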

3 Convergence

In this section, we discuss the global convergence and superlinear convergence rate of the proposed method. We introduce the following assumption A4 and suppose that assumptions A1–A4 hold throughout this section.

A4.:

There exists \(\alpha_{\min}>0\) such that \(\alpha_{k}>\alpha_{\min}>0\) for all k.

It follows from (4) and (5) that \(p^{k}_{\max}>0\) is monotonically nonincreasing and that, if \(\|\Phi_{1}(x^{k})\| \to0\), then \(p^{k}_{\max}\to0\).

Lemma 7

Consider sequences \(\{\|\Phi_{1}(x^{k})\|^{2}\}\) and \(\{f^{k}\}\) such that \(\|\Phi_{1}(x^{k})\|^{2}\ge0\) and \(\{f^{k}\}\) is monotonically decreasing and bounded below. Let a constant \(\theta_{1}>0\) satisfy, for all k and \(j\in F^{k}\), that

$$ \bigl\Vert \Phi_{1}\bigl(\hat{x}^{k+1}, \hat{\mu}^{k+1}\bigr) \bigr\Vert \le\theta_{1}\max\bigl\{ \bigl\Vert \Phi_{1}\bigl(x^{k},\mu^{k}\bigr) \bigr\Vert , p^{j}_{\max}\bigr\} $$
(20)

or

$$ f\bigl(\bar{x}^{k+1}\bigr)-\max\bigl\{ f^{k},\bar{F}^{j}_{\max}\bigr\} \le-\alpha_{k} \theta_{1} \bigl\Vert \Phi _{1}^{k+1} \bigr\Vert , $$
(21)

where \(\alpha_{k}\ge\alpha_{\min} >0\) is the step length and \(\theta_{1}\) is a given positive number. Then \(p^{k}_{\max}\to0\).

Proof

Suppose that the lemma is not true. Then \(\Phi_{1}(x^{k})\not\to0\), and there exist \(\varepsilon>0\) and an infinite index set K such that \(\|\Phi_{1}(x^{k+1},\mu^{k+1})\|\ge\varepsilon>0\), \(p^{k}_{\max}\ge \varepsilon>0\), and \(\|\Phi_{1}(x^{k+1},\mu^{k+1})\|\ge\theta_{1}\|\Phi_{1}(x^{k},\mu^{k})\|\) for any \(k\in K\). We have

$$ f\bigl(x^{k}\bigr)-f\bigl(x^{k+1}\bigr)\ge\alpha_{k} \theta_{1} \bigl\Vert \Phi_{1}\bigl(x^{k+1},\mu ^{k+1} \bigr) \bigr\Vert > \alpha_{\min}\theta_{1}\varepsilon. $$
(22)

Because \(\{f^{k}\}\) is monotonically decreasing, (22) implies \(f(x^{k})\to-\infty\) as \(k \to+\infty\), which contradicts the assumption that \(\{f^{k}\}\) is bounded below. This lemma holds. □

Lemma 8

Consider an infinite sequence of iterations on which \((f^{k},\|\Phi_{1}(x^{k})\|^{2})\) is entered into the filter, where \(\|\Phi_{1}(x^{k})\|>0\) and \(\{f^{k}\}\) is bounded below. Then \(\Phi_{1}(x^{k})\to0 \).

Theorem 1

If \((x^{*}, \mu^{*})\) is an accumulation point of \(\{( x^{k},\mu^{k})\}\), then \(x^{*}\) is a KKT point of problem (1).

Proof

It is obvious that Lemmas 8 and 2 imply that Theorem 1 holds. □

Next we consider the superlinear convergence of the method; we first give the additional assumptions we need.

A5.:

The Mangasarian–Fromovitz (M-F) constraint qualification is satisfied at \(x^{*}\), i.e., \(\{\nabla g_{i}(x^{*})\mid i\in I\}\) are linearly independent, where \(I=\{i\mid g_{i}(x^{*})=0\}\), and there exists a direction d such that \(d^{T}\nabla g_{i}(x^{*})<0\) for \(i\in I\). Here \(x^{*}\) is an accumulation point of \(\{x^{k}\}\) and a KKT point of problem (1).

A6.:

The sequence of \(\{H^{k}\}\) satisfies

$$\frac{ \Vert (H^{k}-\nabla_{x}^{2}L(x^{k},\mu^{k} ))d^{k1} \Vert }{ \Vert d^{k1} \Vert }\to0. $$
A7.:

The strict complementarity condition holds at each KKT point \((x^{*},\mu^{*} )\).

It follows that \(\phi_{i}\) is differentiable at each KKT point \((x^{*},\mu^{*} )\).

Hence assumption A7 implies that Φ is continuously differentiable at each KKT point \((x^{*},\mu^{*} )\). As in Lemma 1, we have that the following lemmas hold.

Lemma 9

Assume A1–A7 hold, then \(\{ \| (V^{k})^{-1}\| \}\) and \(\{ \|(\hat{V}^{k})^{-1}\| \}\) are bounded. Furthermore, if \(V^{*}\) is an accumulation matrix of \(\{ V^{k}\}\), then \(V^{*}\) is nonsingular.

Proof

By Theorem 1, \(\Phi^{*}=0\) and \(c^{k}\to0\). Without loss of generality, we may assume that \(( x^{k},\mu^{k})\to(x^{*}, \mu^{*})\), \(H^{k}\to H^{*}\), \(\operatorname{diag}(\xi^{k})\to\operatorname{diag}(\xi ^{*})\), and \(\operatorname{diag} (\eta^{k})\to \operatorname{diag}(\eta^{*})\). By the definitions of \(\xi_{i}^{k}\) and \(\eta _{i}^{k}\), we know that \((\xi_{i}^{*})^{2}+(\eta_{i}^{*})^{2}\neq0\). By A3, the limit \(H^{*}\) is positive definite.

If \(V^{*}(u, v)=0\), where \((u,v)\in R^{n+m}\), \(u=(u_{1},\ldots, u_{n})^{T}\), and \(v=(v_{1},\ldots, v_{m})^{T}\), then we have

$$ H^{*}u+\nabla G^{*}v=0 $$
(23)

and

$$ \operatorname{diag}\bigl(\xi^{*}\bigr) \bigl(\nabla G^{*}\bigr)^{T}u+ \operatorname{diag}\bigl(\eta ^{*}\bigr)v=0. $$
(24)

From (24) and the definitions of \(\xi_{j}^{*}\) and \(\eta_{j}^{*}\), we know that if \(\xi_{j}^{*}=0\) then \(\eta_{j}^{*}\neq0\). If \(\xi_{j}^{*}\neq0\), then

$$ u^{T}\nabla g_{j}^{*}=-\frac{\eta_{j}^{*}}{\xi_{j}^{*}}v_{j} . $$
(25)

Multiplying (23) by \(u^{T}\) and using (24) and (25), we have

$$\begin{aligned} &u^{T}\bigl( H^{*}u+\nabla G^{*}v\bigr) \\ &\quad =u^{T}H^{*}u-\sum_{j:\xi_{j}^{*}\neq0} \frac{\eta_{j}^{*}}{\xi_{j}^{*}}v_{j}^{2}=0. \end{aligned}$$
(26)

Since \(\eta_{j}^{*}/\xi_{j}^{*}\le0\), this implies \(u=0\), and if \(\eta_{j}^{*}\neq0\) then \(v_{j}=0\). Let \(I= \{j\mid g_{j}^{*}=0\}\); because \(g_{j}^{*}\neq0\) implies \(\eta_{j}^{*}\neq0\) and hence \(v_{j}=0\), we have

$$ \sum_{j\in I}\nabla g_{j}^{*}v_{j}=0, $$
(27)

and \(v_{j}=0\) (\(j\in I\)) by A5, i.e., \((u,v)=0\) and \(V^{*}\) is nonsingular.

On the other hand, suppose to the contrary that there exists a subsequence \(\{(x^{k(i)}, \lambda^{k(i)})\}\) such that \(\|( V^{k(i)})^{-1}\|\to\infty\) as \(k(i)\to\infty\) and \((x^{k(i)}, \lambda^{k(i)}) \to(x^{*}, \lambda^{*})\). We can choose \(k(i)\) properly such that \(V^{k(i)}\to V^{*}\), including \(\xi^{k(i)}\to\xi^{*}\) and \(\eta^{k(i)}\to\eta^{*}\). Clearly, \((\xi_{j}^{*})^{2}+ (\eta_{j}^{*})^{2}\ge3-2\sqrt{2}>0\) and \(V^{*}\in\partial \Phi^{*}\). But \(V^{*}\) is nonsingular by the above proof, which contradicts the assumption \(\|( V^{k(i)})^{-1}\|\to\infty\). Hence \(\{ \|( V^{k})^{-1}\| \}\) is bounded. Since \(\Phi^{k}\to0\) implies \(\lim_{k\to\infty}V^{k}=\lim_{k\to\infty}\hat{V}^{k}\), we also obtain that \(\{ \|( \hat{V}^{k})^{-1}\| \}\) is bounded. This lemma holds. □

Assumption A6 shows that \((d^{k1}, \lambda^{k1})\) is a Newton direction with a high-order perturbation. We obtain the following lemma.

Lemma 10

For sufficiently large k, \(x^{k+1}=x^{k}+d^{k1}\) and \(\mu^{k+1}=\mu^{k}+\lambda^{k1}\).

Furthermore, Lemma 10 implies that the following theorem holds.

Theorem 2

Assume A1–A7 hold. Let Algorithm 1 (NFQPIM) generate a sequence \(\{(x^{k},\mu^{k})\}\), and let \((x^{*}, \mu^{*})\) be an accumulation point of \(\{(x^{k},\mu^{k})\}\). Then \((x^{*}, \mu^{*})\) is a KKT point of problem (1), and \((x^{k},\mu^{k})\) converges to \((x^{*}, \mu^{*})\) superlinearly.

4 Numerical tests

We apply Algorithm 1 (NFQPIM) to constrained optimization problems (see [19]); \(H^{k}\) is updated by the BFGS method. The termination criterion is \(\|\Phi^{k}\|\le10^{-5}\). The parameters are chosen as follows: \(c=0.1\), \(\nu=2\), \(\tau=0.7\), \(\theta_{1}=0.8\), \(\theta=0.6\), \(\bar{\mu}=10\text{,}000\). In the "NIT/NF/NG" entries of the table below, NIT is the number of iterations, NF is the number of function evaluations, and NG is the number of gradient evaluations. The numerical results are given in Table 1. We tested the proposed NFQPIM on almost 100 optimization problems, and the results illustrate that the proposed method is efficient and promising.

Table 1 Numerical results on the NFQPIM for some constrained optimization problems
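The paper states only that \(H^{k}\) is updated by the BFGS method. A common way to keep the update positive definite, as assumption A3 requires, is Powell's damped BFGS formula; the sketch below shows this standard damped update, which is an assumption on our part and not necessarily the exact rule used in the experiments.

```python
import numpy as np

def damped_bfgs_update(H, s, y, damping=0.2):
    """Powell-damped BFGS update of H (a standard safeguard keeping H positive
    definite).  Here s = x^{k+1} - x^k and y = grad_x L^{k+1} - grad_x L^k."""
    Hs = H @ s
    sHs = s @ Hs
    sy = s @ y
    theta = 1.0 if sy >= damping * sHs else (1.0 - damping) * sHs / (sHs - sy)
    r = theta * y + (1.0 - theta) * Hs            # damped difference vector
    return H - np.outer(Hs, Hs) / sHs + np.outer(r, r) / (s @ r)
```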

5 Conclusions

In this paper, we developed a nonmonotone filter QP-free infeasible method for minimizing a smooth function subject to smooth inequality constraints. The proposed method is based on the solution of nonsmooth equations obtained from the multiplier and some NCP functions for the KKT first-order optimality conditions. Each iteration of the proposed method can be viewed as a perturbation of a Newton or quasi-Newton iteration on both the primal and dual variables for the solution of the KKT optimality conditions. Moreover, we used the filter within line searches with a nonmonotone acceptance mechanism. We showed that the proposed method has global convergence and a superlinear convergence rate, and the numerical results illustrated that it is efficient. How to apply this method to real-world optimization problems will be studied in the near future.

References

  1. Chen, X.: Smoothing methods for complementarity problems and their applications: a survey. J. Oper. Res. Soc. Jpn. 43, 32–47 (2000)

  2. Arrow, K.J., Debreu, G.: Existence of an equilibrium for a competitive economy. Econometrica 22, 265–290 (1954)

  3. Smeers, Y.: Computable equilibrium models and the restructuring of the European electricity and gas markets. Energy J. 0(4), 1–31 (1997)

  4. Tian, B., Yang, X.: Smoothing power penalty method for nonlinear complementarity problems. Pac. J. Optim. 12(2), 461–484 (2016)

  5. Xie, S.L., Xu, H.R., Zeng, J.P.: Two-step modulus-based matrix splitting iteration method for a class of nonlinear complementarity problems. Linear Algebra Appl. 494, 1–10 (2016)

  6. Otero, R.G., Iusem, A.: A proximal method with logarithmic barrier for nonlinear complementarity problems. J. Glob. Optim. 64(4), 663–678 (2016)

  7. Hao, Z., Wan, Z., Chi, X., Chen, J.: A power penalty method for second-order cone nonlinear complementarity problems. J. Comput. Appl. Math. 290, 136–149 (2015)

  8. Pu, D.G., Li, K.D., Xue, J.: Convergence of QP-free infeasible methods for nonlinear inequality constrained optimization problems. Tongji Daxue Xuebao Ziran Kexue Ban 33, 525–529 (2005)

  9. Fletcher, R., Leyffer, S., Toint, P.: On the global convergence of a filter-SQP algorithm. SIAM J. Optim. 13, 44–59 (2002)

  10. Ulbrich, M., Ulbrich, S.: Non-monotone trust region methods for nonlinear equality constrained optimization without a penalty function. Math. Program. 95, 103–135 (2003)

  11. Yang, Y., Shang, Y.: A new filled function method for unconstrained global optimization. Appl. Math. Comput. 173, 501–512 (2006)

  12. Qi, H.D., Qi, L.: A new QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization. SIAM J. Optim. 11, 113–132 (2000)

  13. Liu, F.T., Fan, Y.H., Yin, J.H.: The use of QP-free algorithm in the limit analysis of slope stability. J. Comput. Appl. Math. 235, 3889–3897 (2011)

  14. Pu, D.G., Zhou, Y., Zhang, H.Y.: A QP free feasible method. J. Comput. Math. 22(5), 651–660 (2004)

  15. Jian, J.B., Guo, C.H., Tang, C.M., Bai, Y.Q.: A new superlinearly convergent algorithm of combining QP subproblem with system of linear equations for nonlinear optimization. J. Comput. Appl. Math. 273, 88–102 (2015)

  16. Zhang, H., Li, G., Zhao, H.: A kind of QP-free feasible method. J. Comput. Appl. Math. 224, 230–241 (2009)

  17. Huang, M., Pu, D.: Line search SQP method with a flexible step acceptance procedure. Appl. Numer. Math. 92, 98–110 (2015)

  18. Huang, M., Pu, D.: A line search SQP method without a penalty or a filter. Comput. Appl. Math. 34, 741–753 (2015)

  19. Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)


Acknowledgements

We would like to thank the anonymous referees and the associate editor for their useful comments and suggestions which improved this paper greatly.

Funding

The work of Y. Shang is supported by the National Natural Science Foundation of China Grant No.11471102. The work of D. Pu is supported by the National Natural Science Foundation of China Grant No. 11371281. The work of Z.-F. Jin is supported by the National Natural Science Foundation of China Grant No. 61772174, Plan for Scientific Innovation Talent of Henan Province Grant No. 174200510011.

Author information


Contributions

All authors contributed equally to this work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Dingguo Pu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Shang, Y., Jin, ZF. & Pu, D. A new filter QP-free method for the nonlinear inequality constrained optimization problem. J Inequal Appl 2018, 278 (2018). https://doi.org/10.1186/s13660-018-1851-3

