 Research
 Open Access
Primal-dual interior point QP-free algorithm for nonlinear constrained optimization
 Jinbao Jian^{1},
 Hanjun Zeng^{2},
 Guodong Ma^{1} (corresponding author) and
 Zhibin Zhu^{3}
https://doi.org/10.1186/s13660-017-1500-2
© The Author(s) 2017
Received: 10 March 2017
Accepted: 6 September 2017
Published: 29 September 2017
Abstract
In this paper, a class of nonlinear constrained optimization problems with both inequality and equality constraints is discussed. Based on a simple and effective penalty parameter and the idea of primal-dual interior point methods, a QP-free algorithm for solving the discussed problems is presented. At each iteration, the algorithm needs to solve two or three reduced systems of linear equations with a common coefficient matrix, where a slightly new working set technique for judging the active set is used to construct the coefficient matrix, and the positive definiteness restriction on the Lagrangian Hessian estimate is relaxed. Under reasonable conditions, the proposed algorithm is globally and superlinearly convergent. During the numerical experiments, by modifying the technique in Section 5 of (SIAM J. Optim. 14(1): 173-199, 2003), we introduce a slightly new computation measure for the Lagrangian Hessian estimate based on second order derivative information, which can satisfy the associated assumptions. Then, the proposed algorithm is tested and compared on 59 typical test problems, which shows that the proposed algorithm is promising.
1 Introduction
It is known that the sequential quadratic programming (SQP) method is one of the efficient methods for constrained optimization due to its fast convergence, and it has been widely studied by many authors, see Refs. [8–17]. However, the quadratic program (QP) subproblems solved in SQP methods may be inconsistent, and the computational cost of the QPs is high. Therefore, motivated by the KKT conditions of the QPs and/or the quasi-Newton method, QP-free methods have been put forward, in which the QPs are replaced by suitable systems of linear equations (SLEs), see Refs. [18–26].
Now we briefly review the study of the primal-dual interior point (PDIP) QP-free algorithms associated with our work. First, for problem (P) with no equality constraints, i.e., \(I^{\ell }=\emptyset \), Panier et al. [22] presented in 1987 a QP-free algorithm, denoted by PTH. At iterate k, two SLEs are solved to yield a master search direction. Then a least squares problem (LSP) needs to be solved to avoid the so-called Maratos effect [27]. However, the SLEs solved in [22] may become ill-conditioned, and the PTH algorithm may be unstable. Furthermore, the initial point must lie in the strict interior of the feasible set, and an additional assumption that ‘the number of stationary points is finite’ is used to ensure the global convergence. Later, under the assumption that the multiplier approximation sequence remains bounded, the PTH algorithm was improved by Gao et al. [3] by solving an extra SLE. The PTH algorithm was also improved by Qi and Qi [23], Zhu [26] and Cai [28].
To improve the PTH algorithm [22], by using the idea of PDIP and choosing different barrier parameters for each constraint, Bakhtiari and Tits [18] proposed a new PDIP QP-free algorithm. The algorithm can start from a feasible point on the boundary of the feasible set, and it possesses global convergence without either the additional assumption of isolatedness of the stationary points or the positive definite restriction on matrix \(H_{k}\). Almost at the same time, Tits et al. [1] extended and improved the PTH algorithm to problem (P) with both inequality and equality constraints. The algorithm in [1] has two notable features. One is that a new and simple rule to update the penalty parameter ρ in \((\mathrm{P}_{\rho })\) is derived; the other is that, as in [18], the uniformly positive definite restriction on the Lagrangian Hessian estimate is relaxed.
More recently, for inequality constrained optimization, Jian et al. [21] proposed a strongly sub-feasible primal-dual quasi interior-point algorithm with superlinear convergence, where the initial point can be chosen arbitrarily, the number of feasible constraints is nondecreasing, and the iteration points all enter the interior of the feasible region after finitely many iterations; a new kind of working set was introduced, which further reduced the computational cost; the uniformly positive definite restriction on the sequence \(\{H_{k}\}\) was relaxed; at each iteration, only two or three SLEs with the same coefficient matrix needed to be solved.
However, there are still some problems worthy of research on the PDIP-type algorithms [1, 18, 22]. First, the coefficient matrix of the Karush-Kuhn-Tucker (KKT) system of the LSP is not the same as that of the two previous SLEs, and this further increases the computational cost. Second, the coefficient matrices of the SLEs include all the constraints and their gradients, and this leads to a large increase in the scale of the SLEs. Third, the global convergence of the two algorithms [1, 18] relies on an additional assumption that the stationary points are finite or isolated.
On the other hand, to design more effective algorithms with small computational cost for solving constrained optimization, Facchinei et al. [29] first introduced the active set identifying technique (also called the working set technique). This technique has since been popularized and applied in many works, e.g., [17, 24, 25, 30, 31]. In particular, the algorithm in [30] needs to solve four SLEs at each iteration.
(a) A slightly new identifying technique for the active set, different from [17, 25], is introduced. The multiplier yielded at the previous iteration is used to compute the working set, and no additional computational cost is needed, so the computational cost is expected to be reduced.
(b) At each iteration, to yield the search directions, only two or three SLEs with the same coefficient matrix need to be solved. Furthermore, the coefficient matrix has smaller scale than the ones in [1, 18, 22].
(c) For a strict interior point \(x^{k}\) of the feasible set of \((\mathrm{P}_{\rho })\), the iteration at \(x^{k}\) is well defined without any other constraint qualification (CQ).
(d) Under a suitable CQ and assumptions including a relaxed positive definite restriction on the Lagrangian Hessian estimate \(H_{k}\), but without the isolatedness of the stationary points, the proposed algorithm is globally and superlinearly convergent.
(e) A slightly new computation technique for \(H_{k}\) based on second order derivative information is introduced, which is a modification of the one in [1], Section 5.1, and satisfies the relaxed positive definite restriction.
Throughout this paper, for simplicity, denote vector \((x^{T},y^{T},z ^{T},\ldots )^{T}\) by \((x,y,z,\ldots )\) for column vectors \(x, y\) and z, and \(\Vert \cdot \Vert \) denotes the Euclidean norm.
2 Construction of algorithm
H1: The inner set \(\tilde{X_{0}}\) is nonempty, and the functions f and \(g_{j}\) (\(j\in I\)) are all continuously differentiable.
Remark 1
Note that if there exists a point belonging to the set X̃, namely, \(\hat{x}\in \tilde{X}\), and the active constraint gradient vectors \(\{ \nabla g_{j}(\hat{x}), j\in I(\hat{x})\}\) are linearly independent, then one can obtain a point \(x^{0}\in \tilde{X_{0}}\) by simple computation, e.g., by executing a line search on g starting from x̂ along the direction \(\hat{d}=-\hat{N}(\hat{N} ^{T}\hat{N})^{-1}e\), where \(\hat{N}=\nabla g_{I(\hat{x})}(\hat{x})\) and \(e=(1,\ldots,1)^{T}\).
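The construction in Remark 1 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the direction is taken as \(\hat{d}=-\hat{N}(\hat{N}^{T}\hat{N})^{-1}e\) (so that \(\hat{N}^{T}\hat{d}=-e\)), the line search is a plain halving rule, and the helper names `interior_direction` and `push_into_interior` as well as the toy constraints are hypothetical.

```python
import numpy as np

def interior_direction(N_hat):
    """Return d = -N (N^T N)^{-1} e for a full-column-rank matrix N."""
    e = np.ones(N_hat.shape[1])
    # Solve (N^T N) y = e instead of forming the inverse explicitly.
    y = np.linalg.solve(N_hat.T @ N_hat, e)
    return -N_hat @ y

def push_into_interior(g, x_hat, d, t0=1.0, shrink=0.5, max_halvings=50):
    """Halve the step along d until every constraint value is strictly negative."""
    t = t0
    for _ in range(max_halvings):
        x = x_hat + t * d
        if np.all(g(x) < 0.0):
            return x
        t *= shrink
    raise RuntimeError("no strictly feasible point found along d")

# Hypothetical toy constraints: g1 = x1^2 + x2^2 - 1 <= 0, g2 = -x1 <= 0.
def g(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0, -x[0]])

x_hat = np.array([0.0, 1.0])      # both constraints are active at this point
N_hat = np.array([[0.0, -1.0],    # columns: grad g1(x_hat), grad g2(x_hat)
                  [2.0, 0.0]])
d_hat = interior_direction(N_hat)
x0 = push_into_interior(g, x_hat, d_hat)
```

Because \(\hat{N}^{T}\hat{d}=-e\), every active constraint decreases to first order along \(\hat{d}\), which is why a sufficiently short step lands in the strict interior.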
Before proposing our algorithm, we give a proposition to show the equivalences between (P) and \((\mathrm{P}_{\rho })\).
Proposition 1
If \((x,\lambda )\) is a KKT pair for problem \((\mathrm{P}_{\rho})\) and \(g_{\ell }(x)=0\), then \((x,\lambda_{\rho })\) with multiplier \(\lambda_{\rho }=\lambda -\rho \hat{e}\) is a KKT pair for the original problem (P).
Based on Proposition 1, it is known that if one can construct an effective algorithm for problem \((\mathrm{P}_{\rho})\) and adjust parameter ρ to force the iterate to asymptotically satisfy \(g_{\ell }(x)=0\), then the solution to (P) can be yielded.
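The multiplier shift in Proposition 1 can be checked directly from the stationarity condition. The short derivation below is a sketch under assumptions that are not restated in this section: the penalized objective is taken to be \(f_{\rho }(x)=f(x)-\rho \sum_{j\in I^{\ell }}g_{j}(x)\), the constraints of \((\mathrm{P}_{\rho })\) are \(g_{j}(x)\leq 0\), \(j\in I\), and \(\hat{e}\) has ones in the positions \(j\in I^{\ell }\) and zeros elsewhere.

```latex
\begin{aligned}
0 &= \nabla f_{\rho}(x)+\sum_{j\in I}\lambda_{j}\nabla g_{j}(x) \\
  &= \nabla f(x)-\rho\sum_{j\in I^{\ell}}\nabla g_{j}(x)
     +\sum_{j\in I}\lambda_{j}\nabla g_{j}(x) \\
  &= \nabla f(x)+\sum_{j\in I}(\lambda-\rho\hat{e})_{j}\nabla g_{j}(x).
\end{aligned}
```

Under these assumptions the inequality multipliers are unchanged, so their nonnegativity and complementarity carry over, while the equality multipliers \(\lambda_{j}-\rho \), \(j\in I^{\ell }\), are free in sign, as they should be for (P) once \(g_{\ell }(x)=0\) holds.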
It is clear that \((x^{*},\lambda )\) is a KKT pair of (P) if and only if \(\delta (x^{*},\lambda )=0\). In particular, from [29] and/or [24], Definition 4.1, Theorems 4.1, 4.2 and 4.3, one can see that \(\{j\in I: g_{j}(x)+\delta (x,\lambda )\geq 0\}\) is an exact identification set for the active constraint set \(I(x^{*})\) if \((x,\lambda )\) converges to a KKT pair \((x^{*},\lambda ')\) of problem (P), and the Mangasarian-Fromovitz constraint qualification (MFCQ) and the second order sufficient conditions are satisfied at \((x^{*}, \lambda ')\).
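As a minimal illustration of the identification set \(\{j\in I: g_{j}(x)+\delta (x,\lambda )\geq 0\}\), the sketch below takes the constraint values and an already computed value of δ as inputs. The function name `identification_set` and the sample numbers are hypothetical, and the formula for δ itself is not reproduced here.

```python
import numpy as np

def identification_set(g_vals, delta):
    """Indices j with g_j(x) + delta >= 0, i.e. constraints estimated as active."""
    g_vals = np.asarray(g_vals, dtype=float)
    return np.flatnonzero(g_vals + delta >= 0.0)

# Hypothetical constraint values at a strictly or nearly feasible point.
g_vals = [-1e-4, -0.8, 0.0, -2.5]
working = identification_set(g_vals, delta=1e-2)
```

As δ shrinks to zero near a KKT pair, only constraints whose values are within δ of zero remain in the set, which is what makes the estimate exact in the limit.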
Subsequently, to make the coefficient matrix in SLE (8) possess nice properties and low computational cost, we consider its optimization and modification as follows. First, replace the Lagrangian Hessian by a suitable approximate symmetric matrix \(H_{k}\), and denote \(x-x^{k}\) by direction d. Second, replace the diagonal matrix \(\Lambda_{k}\) by the positive diagonal matrix \(Z_{k}=\operatorname{diag}(z^{k}_{I_{k}})\), where vector \(z^{k}_{I_{k}}\) is an approximation of \(\lambda^{k}\).
Subsequently, it is necessary to analyze the singularities of the coefficient matrix \(V_{k}\) above, i.e., the solvability of SLE (10).
Lemma 1
Proof
One knows that it is sufficient to show that SLE \(V_{k}u=0\) has a unique solution zero, and this is elementary and omitted here. □
Remark 2
Obviously, the positive definiteness request (11) on \(H_{k}\) is weaker than the positive definiteness of \(H_{k}\) itself on \(R^{n}\). But it is stronger than the positive definiteness of \(H_{k}\) on the null space of the gradients of approximate active constraints, i.e., on \(\Omega_{k}:=\{d\in R^{n}: \nabla g_{I_{k}}(x ^{k})^{T}d=0\}\). However, the latter cannot ensure the invertibility of \(V_{k}\).
Based on the above analysis and preparation, now we can describe the steps of our algorithm solving (P) as follows.
Algorithm A
Parameters: \(\alpha \in (0,\frac{1}{2}), \sigma,\beta, \theta, r\in (0,1), \xi \in (2,3)\), \(\nu >2\), \(\vartheta >1\), \(M, p>0\); suitably small positive parameters \(\gamma_{1}\), \(\gamma_{2}\) and \(\gamma_{3}\); sufficiently small lower bound \(\underline{\varepsilon}>0\) and sufficiently large upper bound \(\overline{\varepsilon}>0\); termination accuracy \(\epsilon >0\).
Data: \(x^{0} \in \tilde{X_{0}}, \rho_{0}>0\), vectors \(z^{0}\) with weights \(z^{0}_{j}\in [\underline{\varepsilon}, \overline{\varepsilon}], j\in I\). Set \(k:=0\).
Step 1 Compute working set. Compute \(\lambda^{k}\) by (5), \(\Phi (x^{k},\lambda^{k})\) and \(\delta (x^{k}, \lambda^{k})\) by (3)-(4). If \(\Phi (x^{k},\lambda ^{k})\leq \epsilon \) or another suitable termination rule is satisfied, then \((x^{k},\lambda^{k})\) is an approximate KKT pair of problem (P) and stop; otherwise, generate the working sets \(I^{\imath }_{k}\) and \(I_{k}\) by (6).
Step 2 Yield matrix \(H_{k}\). Yield matrix \(H_{k}\) such that it approximates to the Hessian of the Lagrangian associated with \((\mathrm{P}_{\rho_{k}})\) and satisfies request (11).
Step 3 Compute the main search directions.
(i) Compute \((\bar{d}^{k},\bar{\lambda }^{k}_{I_{k}})\) by solving \(\operatorname{SLE}(V_{k}; 1,0)\), see (10), then set \(\bar{\lambda } ^{k}=(\bar{\lambda }^{k}_{I_{k}},0_{{I\setminus I_{k}}})=(\bar{ \lambda }^{k}_{\ell }, \bar{\lambda }^{k}_{\imath })\) with \(\bar{ \lambda }^{k}_{\imath }=(\bar{\lambda }^{k}_{I^{\imath }_{k}}, 0_{ I ^{\imath }\setminus I^{\imath }_{k}})\).
(ii) Check conditions: (a) \(\Vert \bar{d}^{k} \Vert \leq \gamma_{1}\), (b) \(\bar{\lambda }^{k}\geq -\gamma_{2} e_{{I}}\), (c) \(\bar{\lambda } ^{k}_{\ell }\ngtr\gamma_{3}e_{{I^{\ell }}}\). If all three conditions above hold, then increase the penalty parameter ρ by \(\rho_{k+1}=\vartheta \rho_{{k}}\), set \(x^{k+1}=x^{k}, z^{k+1}=z ^{k}, H_{k+1}=H_{k}\), \(I^{\imath }_{k+1}=I^{\imath }_{k}, I_{k+1}=I _{k}\), \(k:=k+1\), and go back to Step 3(i). Otherwise, set \(\rho_{k+1}= \rho_{{k}}\) and proceed to Step 3(iii) as follows.
(iv) Compute \((d^{k},\lambda^{k}_{I_{k}})\) by solving \(\operatorname{SLE}(V _{k}; 1,\mu^{k})\), see (10), then set \(\lambda^{k}=(\lambda ^{k}_{I_{k}},0_{{I\setminus I_{k}}})=(\lambda^{k}_{\ell }, \lambda ^{k}_{\imath })\) with \(\lambda^{k}_{\imath }=(\lambda^{k}_{I^{\imath }_{k}}, 0_{ I^{\imath }\setminus I^{\imath }_{k}})\).
If \(\Vert \tilde{d}^{k} \Vert >\Vert d^{k} \Vert \), reset \(\tilde{d}^{k}=0\).
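The penalty test of Step 3(ii) can be sketched as a small helper. This is an illustration only: condition (b) is read here as \(\bar{\lambda }^{k}\geq -\gamma_{2}e_{I}\) (approximate dual feasibility, an assumption about the sign), condition (c) is read as "not every equality multiplier exceeds \(\gamma_{3}\)", and the name `update_penalty` and its argument layout are hypothetical.

```python
import numpy as np

def update_penalty(d_bar, lam_bar, eq_idx, rho, gamma1, gamma2, gamma3, vartheta):
    """Return (new_rho, increased) following the three-condition test of Step 3(ii)."""
    lam_bar = np.asarray(lam_bar, dtype=float)
    cond_a = np.linalg.norm(d_bar) <= gamma1           # (a) direction is tiny
    cond_b = np.all(lam_bar >= -gamma2)                # (b) multipliers nearly nonnegative
    cond_c = not np.all(lam_bar[eq_idx] > gamma3)      # (c) some equality multiplier too small
    if cond_a and cond_b and cond_c:
        return vartheta * rho, True                    # increase rho, iterate stays put
    return rho, False                                  # keep rho, go on to Step 3(iii)

# Hypothetical data: one equality constraint (index 0) and one inequality constraint.
rho_up, increased = update_penalty(
    d_bar=[1e-6, 0.0], lam_bar=[0.0, 0.5], eq_idx=[0],
    rho=1.0, gamma1=1e-3, gamma2=1e-3, gamma3=1e-3, vartheta=2.0)
rho_same, not_increased = update_penalty(
    d_bar=[1e-6, 0.0], lam_bar=[0.2, 0.5], eq_idx=[0],
    rho=1.0, gamma1=1e-3, gamma2=1e-3, gamma3=1e-3, vartheta=2.0)
```

In the first call the equality multiplier is not above \(\gamma_{3}\), so ρ is doubled; in the second it is, so ρ is kept.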
Lemma 2
Proof
Third, if \(\bar{d}^{k}=0\), then, from \(\operatorname{SLE}(V_{k}; 1,0)\) (10), \(g(x^{k})<0\) and (9), it follows that \(\bar{\lambda }^{k}_{I_{k}}=0\). So, by the structure of Step 3, the iterate k does not go into Step 3(iii), (iv). Thus, \(\bar{d}^{k} \neq0\) when the iterative process goes into Step 3(iii), (iv).
Finally, \(\xi_{k}< 0\) follows from (25), (23) and \(\bar{d}^{k}\neq 0\). The remaining claims in Lemma 2 are at hand by \(\xi_{k}<0\) and \(g(x^{k})<0\). □
As an end of this section, to help the readers understand our algorithm, we further analyze the steps/structure of Algorithm A with three remarks below.
Remark 3
Analysis for Step 3
(i) The role of solving \(\operatorname{SLE}(V_{k}; 1,0)\) with no perturbation in Step 3(i) is to check whether the current iterate \(x^{k}\) is an approximate KKT point of (P\(_{\rho_{{k}}}\)) and to yield an ‘improved’ direction \(\bar{d}^{k}\) to a certain extent.
(ii) If conditions (a) and (b) in Step 3(ii) are satisfied, and the parameters \(\gamma_{1}\) and \(\gamma_{2}\) are small enough, then \(\operatorname{SLE}(V_{k}; 1,0)\) implies that \(x^{k}\) is an approximate KKT point of (P\(_{\rho_{{k}}}\)). However, if condition (c) is also satisfied, one cannot estimate \(\Vert g_{\ell }(x^{k}) \Vert \). So, we increase the penalty parameter ρ. In practical computation, if conditions (a) and (b) are satisfied and \(\Vert g_{\ell }(x^{k}) \Vert \) is small enough, we can terminate the algorithm.
(iii) From result (23), one knows that \(\bar{d}^{k}\) is a descent direction of the merit function \(f_{\rho_{k}}(x)\) at \(x^{k}\) when \(\bar{d}^{k}\neq0\). However, since the primal feasibility and dual feasibility are relaxed to a large extent in \(\operatorname{SLE}(V_{k}; 1,0)\), \(\bar{d}^{k}\) cannot be used as an effective search direction. So, generally, the first direction \(\bar{d}^{k}\) should be corrected by another SLE. For this purpose, following [21], we construct and solve \(\operatorname{SLE}(V_{k}; 1,\mu^{k})\) in Step 3(iii), (iv). Lemma 2 and the global convergence analysis in the next section show that the algorithm with search direction \(d^{k}\) is well defined and globally convergent.
Remark 4
Explanation for Steps 4 and 5
Usually, the search direction \(d^{k}\) cannot avoid the Maratos effect, i.e., the unit step may not be accepted by the associated line search for all sufficiently large iterates k. So, to overcome the Maratos effect and obtain superlinear convergence, one needs to compute an additional high order correction direction. Here, we generate it by solving \(\operatorname{SLE}(V_{k}; 0,\tilde{\mu }^{k})\) in Step 5. Obviously, solving \(\operatorname{SLE}(V_{k}; 0,\tilde{\mu }^{k})\) adds some computational cost. On the other hand, numerical testing shows that \(d^{k}\) can still avoid the Maratos effect at some iterates. Therefore, to save computational cost as much as possible, the trial of the unit step in Step 4 is added.
Remark 5
With the help of the working set technique, the three SLEs solved in Algorithm A have a common coefficient matrix \(V_{k}\), which saves computational cost and differs from those in Refs. [18, 26], etc. Furthermore, due to its interior point nature and the technique used to construct \(V_{k}\), Algorithm A is well defined at each iterate without any CQ other than the nonemptiness of the strict interior, \(\tilde{X}_{0}\neq\emptyset \), see Lemmas 1 and 2. In many existing QP-free type algorithms, see Refs. [1, 3, 21–24], the linear independence constraint qualification (LICQ) is necessary to ensure that the iterate itself is well defined. Of course, as we see in Assumption H3, to obtain the global and superlinear convergence of Algorithm A, a suitable CQ on the boundary of X̃ is still necessary.
3 Analysis of global convergence
H2: Suppose that both sequences \(\{x^{k}\}\) and \(\{H_{k}\}\) yielded by Algorithm A are bounded, and assume that there exists a positive constant a such that
$$ d^{T}H_{k}d \geq a\Vert d \Vert ^{2} - \sum_{j\in I_{k}}\frac{z^{k}_{j}}{\vert g_{j}^{k} \vert } \bigl\Vert \nabla g_{j}^{k^{T}}d \bigr\Vert ^{2}, \quad \forall k, \forall d\in R^{n}, $$(27)
i.e., \(d^{T}Q_{k}d\geq a\Vert d \Vert ^{2}, \forall k, \forall d \in R^{n}\).
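Request (27) can be tested numerically through a smallest-eigenvalue computation. The sketch below assumes that \(Q_{k}=H_{k}+\sum_{j\in I_{k}}\frac{z^{k}_{j}}{\vert g^{k}_{j}\vert }\nabla g^{k}_{j}\nabla g_{j}^{k^{T}}\), which is the matrix for which (27) reads \(d^{T}Q_{k}d\geq a\Vert d\Vert ^{2}\); this form is inferred from (27) rather than quoted, and the function name and sample data are hypothetical.

```python
import numpy as np

def satisfies_request_27(H, grads, z, g_vals, a):
    """Check d^T H d >= a||d||^2 - sum_j z_j/|g_j| (grad g_j^T d)^2 for all d.

    grads is an n-by-m array whose columns are grad g_j(x^k), j in the working set.
    """
    weights = np.asarray(z, dtype=float) / np.abs(np.asarray(g_vals, dtype=float))
    Q = H + (grads * weights) @ grads.T      # H + sum_j w_j * grad_j grad_j^T
    Q = 0.5 * (Q + Q.T)                      # symmetrise against round-off
    return bool(np.linalg.eigvalsh(Q)[0] >= a)

# Hypothetical data: H is indefinite, but the constraint term compensates.
H = np.diag([1.0, -1.0])
grads = np.array([[0.0], [1.0]])             # single working constraint, gradient (0, 1)
near_boundary = satisfies_request_27(H, grads, z=[1.0], g_vals=[-0.25], a=0.5)
far_inside = satisfies_request_27(H, grads, z=[1.0], g_vals=[-2.0], a=0.5)
```

The example shows why (27) is weaker than uniform positive definiteness of \(H_{k}\): an indefinite \(H_{k}\) passes when the working constraint is close to active (large weight \(z_{j}/\vert g_{j}\vert \)) and fails when the same constraint is far from active.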
H3: For each \(x\in \tilde{X}\), suppose that
(i) the gradient vectors \(\{\nabla g_{j}(x), j\in I(x)\}\) are linearly independent; and
(ii) if \(x\notin X\), i.e., \(g_{\ell }(x)\neq 0\), then there exist no scalars \(\lambda_{j}\geq 0, j\in I(x)\) such that \(\sum_{j\in I^{\ell }}\nabla g_{j}(x)=\sum_{j\in I(x)} \lambda_{j}\nabla g_{j}(x)\).
Remark 6
Analysis for H2
The uniform ‘positive-definiteness’ request (27) on \(\{H_{k}\}\) is weaker than the usual uniform positive-definiteness of \(\{H_{k}\}\) itself on \(R^{n}\), namely, \(d^{T}H_{k}d\geq a\Vert d \Vert ^{2}, \forall k, \forall d\in R^{n}\). However, it is stronger than the uniform positive-definiteness of \(H_{k}\) on the null space \(\Omega_{k}\). It is encouraging that, based on the Lagrangian Hessian, we can design an alternative computational technique for \(H_{k}\) such that \(\{H_{k}\}\) is bounded and satisfies request (27), which implies (11) whenever \(\{x^{k}\}\) is bounded, see formulas (52), (54) and (55) as well as Theorem 5 in Section 5.
Remark 7
Analysis for H3
(i) Hypothesis H3 was introduced by Tits et al. in [1], Assumption 3. In our work, it plays two roles in the convergence analysis of Algorithm A. One is to ensure that the correction of the penalty parameter ρ finishes in a finite number of iterations; the other is to ensure that the sequence \(\{V_{k}\}\) of coefficient matrices is uniformly invertible, see Lemmas 3 and 4. Furthermore, H3 is considerably milder than the linear independence of the gradients \(\{\nabla g_{i}(x), i\in I^{ \ell }; \nabla g_{j}(x), j\in I^{\imath }(x)\}\); a detailed analysis of this assumption can be found in [1, 32].
(ii) First, H3 automatically holds at each interior point \(x\in \tilde{X}_{0}\). Second, H3 can be reduced to each accumulation point \(x^{*}\) of the iterate sequence \(\{x^{k}\}\) that satisfies \(x^{*}\notin \tilde{X}_{0}\). However, the latter is difficult to verify.
Lemma 3
Suppose that H1, H2 and H3 hold. Then the penalty parameter \(\rho_{k}\) in Algorithm A is increased at most finite times.
The proof of Lemma 3 is similar to the one of [1], Lemma 4.1, and omitted here. In what follows, ρ̄ denotes the final value of \(\rho_{k}\), i.e., \(\rho_{k}\equiv \bar{\rho }\) when k is sufficiently large.
Lemma 4
(i) the sequence \(\{V_{k}\}\) of coefficient matrices is uniformly invertible, i.e., there exists a positive constant M̄ such that \(\Vert V_{k}^{-1} \Vert \leq \bar{M}, \forall k\geq 0\), and
(ii) both sequences \(\{(\bar{d}^{k},\bar{\lambda }^{k})\}\) and \(\{(d^{k},\lambda^{k})\}\) are bounded.
Proof
(ii) First, the boundedness of \(\{(\bar{d}^{k},\bar{\lambda }^{k})\}\) follows from \(\operatorname{SLE}(V_{k}; 1,0)\) and conclusion (i) as well as \(\rho_{k}\equiv \bar{\rho }\). Second, the boundedness of \(\{\mu^{k}\}\) follows from formulas (12)-(16) and the boundedness of \(\{(\bar{d}^{k},\bar{\lambda }^{k})\}\) as well as the positive lower bound of \(\{z^{k}\}\). Therefore, the boundedness of \(\{(d^{k},\lambda ^{k})\}\) is also at hand by \(\operatorname{SLE}(V_{k}; 1,\mu^{k})\). □
Lemma 5
Suppose that H1, H2 and H3 hold. Let \(x^{*}\) be an accumulation point of the sequence \(\{x^{k}\}\) generated by Algorithm A, and suppose that \(\{x^{k}\}_{K}\rightarrow x^{*}\) for some infinite index set K. If \(\{\xi_{k}\}_{K}\rightarrow 0\), then \(x^{*}\) is a KKT point of problem \((\mathrm{P}_{\bar{\rho }})\), and both \(\{\bar{\lambda }^{k}\}_{K}\) and \(\{\lambda^{k}\}_{K}\) converge to the unique multiplier vector \(\lambda^{*}\) associated with \(x^{*}\).
Proof
Next, divert our attention to showing that \(\bar{\lambda }^{*}\geq 0\). It is obvious that \(\bar{\lambda }^{*}_{j}=0\) follows from \(\bar{ \lambda }_{j}^{*}g_{j}(x^{*})=0\) for \(j\in \hat{I}\setminus I(x^{*})\). Moreover, from the definition of \(\xi_{k}\), i.e., (13), and \((\xi_{k},\bar{d}^{k})\stackrel{K'}{\rightarrow }(0,0)\), we can deduce that \(\sum_{j\in \hat{I}}\frac{\bar{\lambda }^{k}_{j}\phi^{k}_{j}}{z ^{k}_{j}}\rightarrow 0, k\in K'\). Further, in view of (25), we know that each term \(\frac{\bar{\lambda }^{k}_{j}\phi^{k}_{j}}{z ^{k}_{j}}\leq 0\), which together with (29) implies that \(\bar{\lambda }^{k}_{j}\phi^{k}_{j}\stackrel{K'}{\rightarrow }0\). This, plus (12), shows that \(\bar{\lambda }^{*}_{j}{\min}\{0,( \operatorname{max}\{\bar{\lambda }^{*}_{j},0\})^{p}Mg_{j}(x^{*})\}=0\) for \(j\in \hat{I}\), and this includes \(\bar{\lambda }^{*}_{j}\geq 0\) for \(j\in \hat{I}\cap I(x^{*})\). Therefore, \(\bar{\lambda }^{*}_{\hat{I}} \geq 0\). Obviously, \(\bar{\lambda }^{*}_{I\setminus \hat{I}}=0\). So \(\bar{\lambda }^{*}\geq 0\) is at hand.
Hence, taking into account \(x^{*}\in \tilde{X}\), we can conclude from (30) that \((x^{*},\bar{\lambda }^{*})\) is a KKT pair and \(x^{*}\) is a KKT point for \((\mathrm{P}_{\bar{\rho }})\). Furthermore, the analysis above further shows that the sequence \(\{\bar{\lambda }^{k} \}_{K}\) possesses a unique limit point, i.e., the unique KKT multiplier vector \(\lambda^{*}\). So \(\lim_{k\in K}\bar{\lambda }^{k}=\lambda^{*}\).
Theorem 1
Suppose that H1, H2 and H3 hold. Then each accumulation point \(x^{*}\) of the sequence \(\{x^{k}\}\) generated by Algorithm A is a KKT point of the original problem (P), i.e. problem (1).
Proof
First, there exists an infinite index set \(K'\) such that \(x^{k}\rightarrow x^{*}, k\in K'\), and relation (29) holds. By contradiction, suppose that \(x^{*}\) is not a KKT point of (P). Then, from Lemma 4, without loss of generality, one can suppose that \(\lambda^{k}=\bar{\lambda }^{k-1}-\bar{ \rho }\hat{e} \rightarrow \bar{\lambda }', k\in K'\). Therefore, it follows that \((x^{*},\bar{\lambda }')\) is not a KKT pair of (P), which further implies that \(\delta (x^{*},\bar{\lambda }')>0\) and \(I^{\imath }(x^{*})\subseteq I^{\imath }_{k}\) for \(k\in K'\) large enough. There are two cases to be considered.
The remaining proof is divided into two steps.
Step A: Show that there exists a constant \(\bar{t}>0\) such that the steplength \(t_{k}\geq \bar{t}\) holds for all \(k\in K''\).
(A2): Analyze inequality (19). From Taylor expansion and (24), one gets
$$\begin{aligned} f_{\bar{\rho }} \bigl(x^{k}+td^{k}+t^{2} \tilde{d}^{k} \bigr)-f_{\bar{\rho }} \bigl(x ^{k} \bigr) -\alpha t\nabla f_{\bar{\rho }} \bigl(x^{k} \bigr)^{T}d^{k} & =(1-\alpha )t \nabla f_{\bar{\rho }} \bigl(x^{k} \bigr)^{T}d^{k}+o(t) \\ & \leq (1-\alpha )t\theta \xi_{k}+o(t) \\ & \leq (1-\alpha )t\theta \bar{\xi }/2+o(t) \\ & \leq 0. \end{aligned}$$
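The step-length argument above concerns a backtracking search along the arc \(x^{k}+td^{k}+t^{2}\tilde{d}^{k}\). The sketch below is a generic version of such a search, assuming that inequality (19) is the sufficient decrease test \(f_{\bar{\rho }}(x^{k}+td^{k}+t^{2}\tilde{d}^{k})\leq f_{\bar{\rho }}(x^{k})+\alpha t\nabla f_{\bar{\rho }}(x^{k})^{T}d^{k}\) and that (20) requires strict feasibility of the trial point; the function name, the halving factor and the toy problem are hypothetical.

```python
import numpy as np

def arc_search(f, grad_f, g, x, d, d_tilde, alpha=0.25, beta=0.5, max_iter=60):
    """Return the first t in {1, beta, beta^2, ...} passing decrease and feasibility tests."""
    slope = grad_f(x) @ d                    # expected to be negative (descent direction)
    fx = f(x)
    t = 1.0
    for _ in range(max_iter):
        trial = x + t * d + t ** 2 * d_tilde
        decrease_ok = f(trial) <= fx + alpha * t * slope
        feasible_ok = np.all(g(trial) < 0.0)
        if decrease_ok and feasible_ok:
            return t, trial
        t *= beta
    raise RuntimeError("arc search failed to find an acceptable step")

# Hypothetical toy problem: minimise x1^2 + x2^2 subject to x1 + x2 - 3 < 0.
f = lambda x: float(x @ x)
grad_f = lambda x: 2.0 * x
g = lambda x: np.array([x[0] + x[1] - 3.0])

x = np.array([1.0, 1.0])
t, x_new = arc_search(f, grad_f, g, x, d=-grad_f(x), d_tilde=np.zeros(2))
```

In this toy run the unit step overshoots the minimiser and fails the decrease test, so the search settles on t = 0.5.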
4 Analysis of strong and superlinear convergence
H4:
(i) The functions \(f(x)\) and \(g(x)\) are all twice continuously differentiable over X̃; and
(ii) there exists an accumulation point \(x^{*}\) of the sequence \(\{x^{k}\}\) of iterative points with (unique) KKT multiplier \(\lambda '\) associated with (P) such that the second order sufficiency conditions (SOSC) and the strict complementarity hold, i.e., the KKT pair \((x^{*}, \lambda ')\) of (P) satisfies \(\lambda '_{{I^{\imath }(x^{*})}}>0\) and
$$\begin{aligned} d^{T}\nabla_{xx}^{2} L \bigl(x^{*}, \lambda ' \bigr)d>0,\quad \forall d\in \bigl\{ d\in R^{n}: d \neq 0, \nabla g_{{I(x^{*})}} \bigl(x^{*} \bigr)^{T}d=0 \bigr\} . \end{aligned}$$
Remark 8
Denote the Lagrangian function of problem \((\mathrm{P}_{\bar{\rho }})\) by \(L_{\bar{\rho }}(x,\lambda )=f_{\bar{ \rho }}(x)+\sum_{j\in I}\lambda_{j} g_{j}(x)\). Then, with relation \(\lambda_{\bar{\rho }}=\lambda -\bar{\rho }\hat{e}\), we have \(L(x,\lambda_{\bar{\rho }})=L_{\bar{\rho }}(x,\lambda )\). Therefore, taking into account Lemma 6(iv), it is readily checked that the SOSC with the strict complementarity for \((\mathrm{P}_{\bar{\rho }})\) is identical with that for (P).
Lemma 6
(i) \(I^{\imath }(x^{*})\subseteq I^{\imath }_{k}\) for \(k\in K'\) sufficiently large;
(ii) \(x^{*}\) is a KKT point of problem \((\mathrm{P}_{\bar{\rho}})\);
(iii) \(\{(\bar{d}^{k},\bar{\lambda }^{k})\}_{K'}\rightarrow (0, \lambda^{*})\) and \(\{(d^{k},\lambda^{k})\}_{K'}\rightarrow (0,\lambda ^{*})\), where \(\lambda^{*}\) together with \(x^{*}\) is a KKT pair of problem \((\mathrm{P}_{\bar{\rho }})\); and
(iv) the KKT multiplier \(\lambda '\) of (P) and \(\lambda ^{*}\) of \((\mathrm{P}_{\bar{\rho }})\) associated with the KKT point \(x^{*}\) satisfy \(\lambda '=\lambda^{*}-\bar{\rho }\hat{e}, \lambda ^{*}_{{I(x^{*})}}>0\).
Proof
(ii) By contradiction, suppose that \(x^{*}\) is not a KKT point of \((\mathrm{P}_{\bar{\rho }})\). Then, taking into account conclusion \(I^{\imath }(x^{*})\subseteq I^{\imath }_{k}\) (\(k\in K'\) large enough), by Case II of the proof of Theorem 1, we can bring a contradiction.
Finally, conclusion \(\{(d^{k},\lambda^{k})\}_{K'}\rightarrow (0,\lambda ^{*})\) follows from \(\{(\bar{d}^{k},\bar{\lambda }^{k})\}_{K'} \rightarrow (0,\lambda^{*})\) and (31).
(iv) By Proposition 1 and \(g_{\ell }(x^{*})=0\), we have \(\lambda '=\lambda^{*}-\bar{\rho }\hat{e}\), and \(\lambda^{*}_{{I^{ \imath }(x^{*})}}=\lambda '_{{I^{\imath }(x^{*})}}>0\) by H4(ii). Further, in view of \(\bar{d}^{k}\rightarrow 0, \bar{\lambda }^{k} \rightarrow \lambda^{*}\geq 0, k\in K'\), one knows that conditions (a) and (b) in Step 3(ii) hold for k large enough. Therefore, taking into account \(\rho_{k}\equiv \bar{\rho }\) for k large enough, condition (c) must fail, i.e., \(\bar{\lambda }^{k}_{\ell }> \gamma_{3} e_{{I^{\ell }}}\) by Step 3(ii), so \(\lambda^{*}_{\ell }\geq \gamma_{3} e_{{I^{\ell }}}>0\). Therefore \(\lambda^{*}_{{I(x^{*})}}>0\) holds. □
Remark 9
In view of \(I^{\ell }(x^{*})=I^{\ell }\), from H3, H4 and Lemma 6(ii), (iv), the following conclusion holds: The LICQ, SOSC and strict complementarity of problem (P) and problem (P\(_{\bar{\rho }}\)) are satisfied at their KKT pair \((x^{*},\lambda ')\) and \((x^{*},\lambda^{*})\), respectively.
In view of Remark 9, similarly to the proof of [21], Theorem 4.1, Lemma 4.2, we have the following result.
Theorem 2
(i) \(x^{k}\rightarrow x^{*}\), i.e., Algorithm A is strongly convergent;
(ii) \((\bar{d}^{k},\bar{\lambda }^{k})\rightarrow (0,\lambda ^{*}), (d^{k},\lambda^{k})\rightarrow (0,\lambda^{*}), z^{k}\rightarrow \min \{\max \{\underline{\varepsilon} e_{I},\lambda^{*}\},\overline{ \varepsilon}e_{I}\}\), and
(iii) \(\phi^{k}=0, \mu^{k}=\varphi_{k}\Vert \bar{d}^{k} \Vert ^{ \nu }z^{k}_{I_{k}}\), \(I^{\imath }_{k}\equiv I^{\imath }(x^{*})\) and \(I_{k}\equiv I_{*}:=I(x^{*})\) if k is sufficiently large.
Lemma 7
Proof
First, from the given conditions and Theorem 2(iv), relation \(z^{k}_{I_{*}}\rightarrow \lambda^{*} _{I_{*}}\) is at hand. Further, this, together with Theorem 2(ii), shows that \(z^{k}_{j}/\lambda^{k}_{j}\rightarrow 1\) for \(j\in I_{*}\). So, it follows that \(\omega_{k}=o(\Vert d^{k} \Vert ^{2})\) from (18).
H5: Assume that the relation \(\Vert P_{k}(\nabla_{xx}^{2}L_{\bar{\rho }}(x^{k},\lambda^{k})-H_{k})P_{k}\bar{d}^{k} \Vert =o( \Vert \bar{d}^{k} \Vert )\) holds, where the projection matrix \(P_{k}\) is defined by \(P_{k}=E_{n}-N_{k}(N_{k}^{T}N_{k})^{-1}N_{k}^{T}\) with \(N_{k}= \nabla g_{{I_{*}}}(x^{k})\) and \(E_{n}\) the identity matrix of order n.
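A small numerical sketch may help to see what H5 measures. It builds the projector \(P_{k}=E_{n}-N_{k}(N_{k}^{T}N_{k})^{-1}N_{k}^{T}\) and evaluates the two-sided residual of H5 next to its one-sided counterpart. The function names and the sample matrices are hypothetical, and the Hessian error matrix is supplied directly rather than computed from a Lagrangian.

```python
import numpy as np

def null_space_projector(N):
    """P = E_n - N (N^T N)^{-1} N^T for a full-column-rank N."""
    n = N.shape[0]
    # Solve (N^T N) Y = N^T rather than inverting N^T N explicitly.
    Y = np.linalg.solve(N.T @ N, N.T)
    return np.eye(n) - N @ Y

def two_sided_residual(P, hess_err, d_bar):
    """|| P (hess_L - H) P d_bar ||, the quantity appearing in H5."""
    return np.linalg.norm(P @ hess_err @ P @ d_bar)

def one_sided_residual(P, hess_err, d_bar):
    """|| P (hess_L - H) d_bar ||, the one-sided variant."""
    return np.linalg.norm(P @ hess_err @ d_bar)

# Hypothetical data: one active gradient along the first coordinate axis.
N = np.array([[1.0], [0.0], [0.0]])
P = null_space_projector(N)
# Error in the Hessian estimate that couples the range of N with its null space.
hess_err = np.array([[0.0, 5.0, 0.0],
                     [5.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])
d_bar = np.array([1.0, 1.0, 0.0])
r2 = two_sided_residual(P, hess_err, d_bar)
r1 = one_sided_residual(P, hess_err, d_bar)
```

Here the two-sided residual vanishes while the one-sided one does not, which illustrates why the two-sided condition is the milder of the two.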
Remark 10
About H5
(i) Due to \(I_{*}=I(x^{*})=I ^{\ell }\cup I^{\imath }(x^{*})\), one knows from H3(i) that matrix \(N_{k}\rightarrow \nabla g_{{I_{*}}}(x^{*})\), which has full column rank, and matrix \(P_{k}\) is well defined when k is large enough.
(ii) The two-sided projection second order approximation H5 above, also used in [1, 8, 18, 22], is milder than the one-sided projection second order approximation:
\(\mathbf{H5^{+}}\): \(\Vert P_{k}(\nabla_{xx}^{2}L_{\bar{\rho }}(x^{k},\lambda^{k})-H_{k})\bar{d}^{k} \Vert =o( \Vert \bar{d}^{k} \Vert )\).
(iii) In view of relation (38), assumptions H5 and H5^{+} are equivalent to \(\Vert P_{k}(\nabla_{xx}^{2}L_{\bar{\rho }}(x^{k},\lambda^{k})-H_{k})P_{k}d^{k} \Vert =o( \Vert d^{k} \Vert )\) and \(\Vert P_{k}(\nabla_{xx}^{2}L_{\bar{\rho }}(x^{k},\lambda^{k})-H_{k})d^{k} \Vert =o( \Vert d^{k} \Vert )\), respectively.
Theorem 3
Suppose that \(\tilde{X}\neq\emptyset \) and hypotheses H2-H5 hold, and assume that the boundary parameters \(\underline{\varepsilon}\) and ε̅ satisfy (34). Then the step size \(t_{k}\) of Algorithm A always equals one, i.e., \(t_{k} \equiv 1\) for k large enough.
Proof
(i) Discuss (20). For \(j\notin I_{*}=I(x ^{*})\), \(g_{j}(x^{*})<0\), using the continuity of \(g_{j}\) and \((x^{k},d^{k},\tilde{d}^{k})\rightarrow (x^{*},0,0), k\rightarrow \infty \), we know that (20) holds for \(t=1\) and k large enough.
Finally, based on Theorem 3, by an analysis similar to that in [1, 18, 22] (for two-step superlinear convergence) and [21], Appendix A (for one-step superlinear convergence), we can prove the following rate of superlinear convergence.
Theorem 4
Suppose that \(\tilde{X}\neq\emptyset \) and the hypotheses H2-H5 hold. If the boundary parameters \(\underline{\varepsilon}\) and ε̅ satisfy (34), then the proposed Algorithm A is two-step superlinearly convergent, i.e., \(\Vert x^{k+2}-x^{*} \Vert =o(\Vert x^{k}-x^{*} \Vert )\). Moreover, if H5 is strengthened as H5^{+}, then Algorithm A is one-step superlinearly convergent, i.e., \(\Vert x^{k+1}-x^{*} \Vert =o(\Vert x^{k}-x^{*} \Vert )\).
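In practice, the one-step and two-step rates in Theorem 4 can be monitored on a finished run by looking at error ratios. The helper below is a minimal sketch, assuming the limit point \(x^{*}\) is known (or replaced by the final iterate); the name `step_ratios` and the synthetic error sequence are hypothetical and are not taken from the reported experiments.

```python
import numpy as np

def step_ratios(iterates, x_star, lag=1):
    """Ratios ||x^{k+lag} - x*|| / ||x^k - x*||; lag=1 is one-step, lag=2 two-step."""
    errs = [np.linalg.norm(np.asarray(x, dtype=float) - x_star) for x in iterates]
    return [errs[k + lag] / errs[k] for k in range(len(errs) - lag) if errs[k] > 0.0]

# Synthetic sequence whose error obeys e_{k+1} = e_k^{1.5} (superlinear by construction).
x_star = np.array([1.0, 2.0])
direction = np.array([1.0, 0.0])
errors = [0.5]
for _ in range(4):
    errors.append(errors[-1] ** 1.5)
iterates = [x_star + e * direction for e in errors]

one_step = step_ratios(iterates, x_star, lag=1)
two_step = step_ratios(iterates, x_star, lag=2)
```

For a superlinearly convergent sequence the ratios tend to zero; ratios that level off at a positive constant indicate a merely linear rate.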
5 Numerical experiments
In this section, to show the practical effectiveness of Algorithm A, we test 59 typical problems from [33]. The numerical experiments are implemented in MATLAB R2013a on a PC with an Intel(R) Core(TM) i5-4590 3.30 GHz CPU and 4.00 GB RAM. The details of the implementation are described as follows.
5.1 Computing matrix \(H_{k}\)
The sequence \(\{H_{k}\}\) of matrices defined above possesses nice properties as follows.
Theorem 5
(i) The sequence \(\{H_{k}\}\) is bounded and satisfies the positive definite restriction (27) with constant \(a=\underline{\varepsilon}\), so H2 holds.
(ii) In addition, assume that H4(ii) and (34) are satisfied. Then, for k large enough, matrix \(\mathcal{M}_{k}\) is positive definite, \(\vartheta_{\min }^{k}>0\) and \(\theta_{k}<\underline{ \varepsilon}\). Therefore, \(H_{k}\) is always yielded by the first case in (55), i.e.,
$$\begin{aligned} H_{k}=\nabla_{xx}^{2}L_{\rho_{k}} \bigl(x^{k},\hat{z}^{k} \bigr)+\theta_{k} E_{n}, \quad \textit{when k is large enough}. \end{aligned}$$(56)
Proof
Based on Theorem 5, comparing with [1], the following remark is given.
Remark 11
The technique (52)-(55) yielding matrix \(H_{k}\) is a modification of the one in [1], Section 5.1, and the two differ in two respects. First, the former, introduced in this work, can ensure the boundedness of \(\{H_{k}\}\) (see Theorem 5(i)), which plays a key role in the analysis of global and superlinear convergence, especially in ensuring that the penalty parameter \(\rho_{k}\) is increased at most finitely many times. However, the latter cannot ensure the boundedness of the sequence \(\{W_{k}\}\) yielded by [1], Section 5.1 (corresponding to \(\{H_{k}\}\) in this paper), since this strictly relies on the boundedness of \(\{(\rho_{k},\theta_{k})\}\), and one of the necessary conditions for the boundedness of \(\{(\rho_{k},\theta_{k}) \}\) is just the boundedness of \(\{W_{k}\}\) (see the proof of [1], Lemma 4.1). Second, by introducing \(\hat{z}^{k}\) in the computation technique (52)-(55) rather than \(z^{k}\) (corresponding to the one used in [1], Section 5.1), the assumption H5^{+} is almost satisfied (see Theorem 5(iii)). If one still uses \(z^{k}\) rather than \(\hat{z}^{k}\) in (52)-(55), then the second order approximation condition H5^{+}, or even H5, would be difficult to satisfy, since \(z^{k}_{{I^{\imath }\setminus I^{\imath }(x ^{*})}}\rightarrow \underline{\varepsilon}e_{{I^{\imath }\setminus I^{\imath }(x^{*})}}>0=\lambda^{*}_{{I^{\imath }\setminus I^{\imath }(x^{*})}}\) (by Theorem 2(iv)). Of course, in view of \(\lim_{k\rightarrow \infty }\Vert z^{k}-\hat{z}^{k} \Vert =\underline{\varepsilon}\), which is small enough, one can expect no distinct difference between the numerical performances with \(z^{k}\) and \(\hat{z}^{k}\).
5.2 Choices of parameters
Remark 12
Analysis for lower bound \(\underline{\varepsilon}\) and upper bound ε̅
First, by Theorems 1 and 2, the global and strong convergence of Algorithm A imposes no additional requirement on the lower bound \(\underline{\varepsilon}\) and the upper bound ε̅, i.e., any two positive constants are suitable. Second, regarding the rate of convergence of Algorithm A, Theorem 4 requires \(\underline{\varepsilon}\) to be sufficiently small and ε̅ to be sufficiently large. However, if the initial value of \(\underline{\varepsilon}\) is chosen too small and/or that of ε̅ too large, the numerical performance may become unstable. An ideal approach is to decrease \(\underline{\varepsilon}\) and increase ε̅ based on the values \(\min \{z^{k}_{i}, i\in I\}\) and \(\max \{z^{k}_{i}, i\in I\}\), respectively.
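One possible reading of this adaptive strategy is sketched below. The text does not specify an update formula, so the function `update_bounds` and the factors `shrink` and `grow` are purely illustrative assumptions; the sketch only enforces that the lower bound never increases and the upper bound never decreases.

```python
def update_bounds(eps_lo, eps_hi, z, shrink=0.5, grow=2.0):
    """Hypothetical update of the bounds from the current multiplier vector z.

    eps_lo is lowered to shrink * min(z) when that value is smaller,
    eps_hi is raised to grow * max(z) when that value is larger.
    The factors shrink and grow are illustrative choices, not values from the paper."""
    z_min, z_max = min(z), max(z)
    new_lo = min(eps_lo, shrink * z_min) if z_min > 0 else eps_lo
    new_hi = max(eps_hi, grow * z_max)
    return new_lo, new_hi


# Multipliers that reach outside the current bounds move both of them.
lo1, hi1 = update_bounds(1e-2, 1e2, [0.004, 3.0, 80.0])

# Multipliers well inside the bounds leave them unchanged.
lo2, hi2 = update_bounds(1e-2, 1e2, [0.5, 1.0])
```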
5.3 Termination rules
During the process of iteration, the implementation is terminated successfully if one of the following two conditions is satisfied:
(i) \(\Vert \Phi (x^{k},\lambda^{k}) \Vert <10^{-5}\); (ii) \(\Vert \bar{d}^{k} \Vert <10^{-5}\) and \(\max \{\bar{\lambda }^{k}_{j},j\in I^{ \imath }\}< 10^{-5}\).
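The two stopping rules can be sketched as a small predicate. This is only an illustration: it assumes the tolerance is \(10^{-5}\), and it takes the three scalars (the norm of \(\Phi\), the norm of \(\bar{d}^{k}\), and the maximum appearing in rule (ii)) as already computed inputs, since the norm used is not specified here. The function name `termination_rule` is hypothetical.

```python
def termination_rule(phi_norm, d_norm, lam_max, tol=1e-5):
    """Return "i" or "ii" for the stopping rule that fires, or None to continue.

    phi_norm : norm of Phi(x^k, lambda^k)
    d_norm   : norm of the direction d-bar^k
    lam_max  : the maximum over j in I^i taken in rule (ii)
    tol      : assumed tolerance of 1e-5"""
    if phi_norm < tol:
        return "i"
    if d_norm < tol and lam_max < tol:
        return "ii"
    return None


rule_a = termination_rule(3e-6, 1.0, 1.0)     # rule (i) fires
rule_b = termination_rule(1e-2, 2e-6, 4e-7)   # rule (ii) fires
rule_c = termination_rule(1e-2, 2e-6, 0.3)    # neither rule fires
```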
5.4 Numerical reports
Table 1 Feasible initial interior points for testing problems
Prob.  \(x^{0}\)  Prob.  \(x^{0}\)  Prob.  \(x^{0}\)
HS6  (2,2)  HS32  (0.1,0.7,0.1)  HS63  (1,1,1)
HS7  (0,1)  HS39  (−1,−1,0,0)  HS73  (1,1,1,1)
HS8  (4,2)  HS40  (2,−1,0,1)  HS78  (−2,1.5,1,−1,−1)
HS25  (50,25,1.5)  HS42  (1,1,1,1)  HS79  (0,0,0,0,0)
HS26  (0.2,0.2,0.2)  HS52  (1,−0.5,−1,0,1)  HS80  (−2,−1.5,1,1,1)
HS27  (0,0,0)  HS53  (−6,2,2,2,2)  HS81  (−1.7,1,1.5,−0.8,−0.8)
HS28  (0,0,0)  HS60  (0,0,0)  HS107  (0.8,0.8,10,10,1,1,1,1,1)
HS111  (−1,−1,−1,−1,−1,−1,−1,−1,−1,−1)  HS114  (1745,12000,110,3048,1974,89.2,92.8,8,3.6,145)
Prob.: the problem number given in [33];
Itr: the number of iterations;
Nf: the number of function evaluations for f;
N: the total number of function evaluations for \(g_{j}\);
ρ̄: the final value of \(\rho_{k}\);
Tcpu: the CPU time (seconds);
\(f_{\mathrm{final}}\): the objective function value at the final iterate.
Table 2 Numerical experiment compared reports
(Columns Itr, Nf, N, \(\bar{\rho}\), \(f_{\mathrm{final}}\), Tcpu: Algorithm A in this paper; columns marked [1]: algorithm from [1])
Prob.  n  \(m_{e}\)  \(m_{i}\)  Itr  Nf  N  \(\bar{\rho}\)  \(f_{\mathrm{final}}\)  Tcpu  Itr [1]  \(\bar{\rho}\) [1]  \(f_{\mathrm{final}}\) [1]
HS1  2  0  1  28  73  70  1  1.7825e−18  0.02  24  1  6.5782e−27
HS3  2  0  1  6  7  8  1  2.3501e−06  0.01  4  1  8.5023e−09
HS4  2  0  2  7  13  29  1  2.6667e+00  0.01  4  1  2.6667e+00
HS5  2  0  4  5  13  47  1  −1.9132e+00  0.01  6  1  −1.9132e+00
HS6  2  1  0  9  364  718  1  2.4199e−07  0.03  7  2  0.0000e+00
HS7  2  1  0  8  15  28  32  −1.7320e+00  0.01  9  2  −1.7321e+00
HS8  2  2  0  9  16  59  8,192  −1.0000e+00  0.01  14  1  −1.0000e+00
HS9  2  1  0  18  34  66  8,192  −4.9985e−01  0.02  10  1  −5.0000e+01
HS12  2  0  1  9  19  39  1  −3.0000e+01  0.01  5  1  −3.0000e+01
HS24  2  0  5  16  29  179  1  −1.0000e+00  0.02  14  1  −1.0000e+00
HS25  3  0  6  1  1  6  1  9.4934e−31  0.01  62  1  1.8185e−16
HS26  3  1  0  16  76  142  2  1.6085e−04  0.02  19  2  2.8430e−12
HS27  3  1  0  28  484  939  4  3.9958e−02  0.05  14  32  4.0000e−02
HS28  3  1  0  11  38  71  1,024  7.5674e−08  0.01  6  1  0.0000e+00
HS29  3  0  1  11  24  53  1  −2.2627e+01  0.01  8  1  −2.2627e+01
HS30  3  0  7  7  10  63  1  1.0000e+00  0.02  7  1  1.0000e+00
HS32  3  1  4  19  33  166  128  9.8818e−01  0.02  24  4  1.0000e+00
HS33  3  0  6  15  20  189  1  −4.5178e+00  0.02  29  1  −4.5858e+00
HS34  3  0  8  10  15  104  1  −8.3403e−01  0.02  30  1  −0.8340e+00
HS36  3  0  7  10  15  144  1  −3.3000e+03  0.02  10  1  −3.3000e+03
HS37  3  0  8  12  19  200  1  −3.4560e+03  0.02  7  1  −3.4560e+03
HS38  4  0  8  73  153  1,218  1  1.9761e−11  0.06  37  1  3.1594e−24
HS39  4  2  0  11  19  63  1  2.5328e−04  0.02  19  4  −1.0000e+00
HS40  4  3  0  49  108  726  2  −2.5000e−01  0.05  4  2  −2.500e+00
HS42  4  2  0  36  70  290  1,024  1.3883e+01  0.03  6  4  1.3858e+01
HS43  4  0  3  12  29  73  1  −4.4000e+01  0.02  9  1  −4.4000e+01
HS46  5  2  0  101  234  735  1  1.3088e−04  0.05  25  2  6.6616e−12
HS47  5  3  0  21  54  276  1  2.0468e−04  0.04  25  16  8.0322e−14
HS48  5  2  0  21  55  202  2,048  3.1361e−09  0.02  6  4  0.0000e+00
HS49  5  2  0  51  87  276  64  1.1761e−02  0.03  69  64  3.5161e−12
HS50  5  3  0  50  200  1,065  128  9.3190e−05  0.04  11  512  4.0725e−17
HS51  5  3  0  29  132  722  256  2.2808e−05  0.03  8  4  0.0000e+00
HS52  5  3  0  31  45  225  256  5.2930e+00  0.03  4  8  5.3266e+00
HS53  5  3  10  36  69  1,694  256  4.0734e+00  0.06  5  8  4.0930e+00
HS56  7  4  0  21  43  2,482  4  −2.6183e+00  0.06  12  4  −3.4560e+00
HS57  2  0  3  34  53  141  1  2.8461e−02  0.03  15  18  2.8460e−02
HS60  3  1  6  18  43  574  1  3.2650e−02  0.04  7  1  3.2568e−02
HS61  3  2  0  16  255  986  256  −1.7195e+02  0.03  44  128  −1.4365e+02
HS62  3  1  6  8  19  153  1  −2.6273e+04  0.02  5  1  −2.6273e+04
HS63  3  2  3  15  27  200  1  9.6232e+02  0.02  5  2  9.6172e+02
HS66  3  0  8  15  42  249  1  5.1816e−01  0.02  1,000^{+}  1  5.1817e−01
HS70  4  0  9  16  22  214  1  1.0085e−02  0.03  22  1  1.7981e−01
HS73  4  1  6  17  35  213  1  2.9896e+01  0.03  16  1  2.9894e+01
HS77  5  2  0  21  141  587  1  4.5981e−01  0.06  13  1  2.4151e−01
HS78  5  3  0  23  66  329  1  −2.9197e+00  0.03  4  4  −2.9197e+00
HS79  5  3  0  16  26  123  128  7.8681e−02  0.02  7  2  7.8777e−02
HS80  5  3  10  66  196  3,975  4  6.0149e−02  0.14  6  2  5.3950e−02
HS81  5  3  10  19  37  708  8  6.4109e−02  0.05  9  8  5.3950e−02
HS84  5  0  16  30  57  1,252  1  −5.2803e+06  0.06  30  1  −5.2803e+06
HS93  6  0  8  21  43  1,387  1  1.3629e+02  0.04  12  1  1.3508e+02
HS99  7  2  14  18  31  57  1  −8.3108e+08  0.02  8  4  0.0000e+00
HS100  7  0  4  8  22  86  1  6.8063e+02  0.02  9  1  6.8063e+02
HS107  9  6  8  41  67  1,086  1  1.3748e−08  0.06  1,000^{+}  8,192  5.0545e+38
HS110  10  0  20  11  510  10,146  1  −4.3134e+01  0.13  6  1  −4.5778e+01
HS111  10  3  20  26  264  6,542  1,024  −5.8531e+01  0.14  1,000^{+}  1  −4.7760e+01
HS112  10  3  10  6  11  199  2  −5.3197e+01  0.02  11  1  −4.7761e+01
HS113  10  0  8  21  48  519  1  2.4306e+01  0.03  10  1  2.4306e+01
HS114  10  3  28  11  136  4,474  16  −1.3407e+03  0.13  39  256  −1.7688e+03
HS118  15  0  59  34  51  2,554  1  6.6482e+02  0.12  –  –  –
As in [1], the loop between Step 3(i) and Step 3(ii) is not counted in the total number of iterations Itr, because each pass changes only the right-hand side vector of SLE (10) slightly and therefore has low computational cost.
From Table 2 it is clear that, for almost all test problems, the two algorithms (Algorithm A and the one in [1]) reach the same optimal objective value. The table also suggests that Algorithm A is promising in terms of the CPU time, the number of objective function evaluations Nf and the total number of constraint function evaluations N.
In particular, the following four observations are worth mentioning. First, for HS66, HS107 and HS111, the algorithm in [1] yields the associated \(f_{\mathrm{final}}\) only after 1000 iterations for each problem, while Algorithm A needs only 15, 41 and 26 iterations, respectively. Second, for HS107, the two algorithms yield widely different final objective function values \(f_{\mathrm{final}}\), namely, 1.3748e−08 and 5.0545e+38. Third, for HS118, which has the same dimension as HS117, Algorithm A performs well, whereas no result for this problem is reported in [1]. Fourth, for HS54, HS75, HS85 and HS117, Algorithm A fails to produce an invertible coefficient matrix after some iterations and therefore cannot obtain the optimal objective value, so these problems are not listed in Table 2.
Table 3 Output of Algorithm A for problem HS8
k  \(\boldsymbol{\rho _{k}}\)  \(\boldsymbol{x^{k}}\)  \(\boldsymbol{f(x^{k})}\)  \(\boldsymbol{\Vert \bar{d}^{k} \Vert } \)  \(\boldsymbol{\Vert g^{k}_{\ell } \Vert } \) 

0  1  (4, 2)  −1.00000e+00  5.09902e+00  
1  1  (4.56235, 1.91528)  −1.00000e+00  5.70399e−01  5.09902e+00 
2  2  (4.60502, 1.91721)  −1.00000e+00  1.02139e−01  5.79231e−01 
3  8  (4.60222, 1.94248)  −1.00000e+00  1.07010e−01  2.08008e−01 
4  128  (4.60158, 1.95367)  −1.00000e+00  1.83034e−01  7.60755e−02 
5  8,192  (4.60159, 1.95516)  −1.00000e+00  1.93245e−01  1.32456e−02 
6  8,192  (4.60159, 1.95575)  −1.00000e+00  5.86992e−04  4.14795e−03 
7  8,192  (4.60159, 1.95581)  −1.00000e+00  1.24840e−04  5.82513e−04 
8  8,192  (4.60159, 1.95583)  −1.00000e+00  2.60839e−05  2.36077e−04 
9  8,192  (4.60159, 1.95584)  −1.00000e+00  1.21999e−05  7.55776e−05 
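As a sanity check on the HS8 output above, the equality-constraint violation can be recomputed directly. The sketch below assumes the usual statement of HS8 in the Hock and Schittkowski collection [33], which is not restated in this paper: a constant objective \(f(x)=-1\) with the two equality constraints \(x_{1}^{2}+x_{2}^{2}-25=0\) and \(x_{1}x_{2}-9=0\). The function names are hypothetical and the Euclidean norm is an assumption.

```python
import math


def hs8_constraints(x):
    """Equality constraints of HS8 as commonly stated (assumed, see lead-in)."""
    x1, x2 = x
    return (x1 * x1 + x2 * x2 - 25.0, x1 * x2 - 9.0)


def violation_norm(x):
    """Euclidean norm of the two constraint residuals at x."""
    return math.hypot(*hs8_constraints(x))


# Starting point (4, 2) and last iterate (4.60159, 1.95584) from the table.
v_start = violation_norm((4.0, 2.0))
v_last = violation_norm((4.60159, 1.95584))
```

At the starting point the residuals are \(-5\) and \(-1\), so the norm is \(\sqrt{26}\approx 5.09902\), which agrees with the value 5.09902e+00 appearing in the table; at the last iterate the violation is small, as expected near a feasible point.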
6 Conclusions
In this paper, based on a simple and effective penalty parameter update rule and the idea of primal-dual interior point methods, a primal-dual interior point QP-free algorithm for nonlinear constrained optimization is proposed and analyzed. A 'working set' technique for estimating the active set is used, so that only two or three reduced systems of linear equations with the same coefficient matrix need to be solved at each iteration. Under a suitable CQ and assumptions that include a relaxed positive definiteness restriction on the Lagrangian Hessian estimate \(H_{k}\), but not the isolatedness of the stationary points, the proposed algorithm is globally and superlinearly convergent. Moreover, a slightly new computation technique for \(H_{k}\) based on second order derivative information is introduced such that the associated assumptions, i.e., the boundedness of \(\{H_{k}\}\), the relaxed positive definiteness and the 1-sided projection second order approximation H5^{+}, are all (almost) satisfied. The numerical experiments based on the proposed computation technique for \(H_{k}\) show that the proposed algorithm is promising.
7 Results and discussion
In this work, a new primal-dual interior point QP-free algorithm for nonlinear optimization with equality and inequality constraints is proposed. Its global and superlinear convergence are analyzed, and encouraging numerical results are reported. Several interesting problems remain for further work. First, following [21], the algorithm could be improved so that it can start from an arbitrary initial point. Second, one could try to remove the strict complementarity condition. Third, the ideas in this paper could be applied to minimax optimization problems, engineering problems and so on.
Declarations
Acknowledgements
Project supported by the Natural Science Foundation of Guangxi Province (Nos. 2016GXNSFDA380019 and 2014GXNSFFA118001) and the Natural Science Foundation of China (Nos. 11771383 and 11561005).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1. Tits, AL, Wächter, A, Bakhtiari, S, Urban, TJ, Lawrence, CT: A primal-dual interior-point method for nonlinear programming with strong global and local convergence properties. SIAM J. Optim. 14(1), 173–199 (2003). doi:10.1137/S1052623401392123
2. Mayne, DQ, Polak, E: Feasible direction algorithms for optimization problems with equality and inequality constraints. Math. Program. 11, 67–80 (1976)
3. Gao, ZY, He, GP, Wu, F: Sequential systems of linear equations algorithm for nonlinear optimization problems with general constraints. J. Optim. Theory Appl. 95, 371–397 (1997)
4. Lian, SJ, Duan, YQ: Smoothing of the lower-order exact penalty function for inequality constrained optimization. J. Inequal. Appl. 2016, 185 (2016)
5. Jian, JB, Xu, QJ, Han, DL: A strongly convergent norm-relaxed method of strongly sub-feasible direction for optimization with nonlinear equality and inequality constraints. Appl. Math. Comput. 182, 854–870 (2006)
6. Jian, JB, Tang, CM, Hu, QJ, Zheng, HY: A feasible descent SQP algorithm for general constrained optimization without strict complementarity. J. Comput. Appl. Math. 180(2), 391–412 (2006)
7. Herskovits, J: Feasible directions interior-point technique for nonlinear optimization. J. Optim. Theory Appl. 99, 121–146 (1998)
8. Boggs, PT, Tolle, JW, Wang, P: On the local convergence of quasi-Newton methods for constrained optimization. SIAM J. Control Optim. 20, 161–171 (1982)
9. Boggs, PT, Tolle, JW: Sequential Quadratic Programming, vol. 4, pp. 1–51. Cambridge University Press, Cambridge (1995)
10. Gill, PE, Murray, W, Saunders, MA: SNOPT: an SQP algorithm for large-scale constrained optimization. SIAM Rev. 47, 99–131 (2005)
11. Jian, JB, Tang, CM: An SQP feasible descent algorithm for nonlinear inequality constrained optimization without strict complementarity. Comput. Math. Appl. 49, 223–238 (2005)
12. Jian, JB, Zheng, HY, Hu, QJ, Tang, CM: A new norm-relaxed method of strongly sub-feasible direction for inequality constrained optimization. Appl. Math. Comput. 168, 1–28 (2005)
13. Jian, JB, Zheng, HY, Tang, CM, Hu, QJ: A new superlinearly convergent norm-relaxed method of strongly sub-feasible direction for inequality constrained optimization. Appl. Math. Comput. 182, 955–976 (2006)
14. Lawrence, CT, Tits, AL: A computationally efficient feasible sequential quadratic programming algorithm. SIAM J. Optim. 11, 1092–1118 (2001)
15. Panier, ER, Tits, AL: A superlinearly convergent feasible method for the solution of inequality constrained optimization problems. SIAM J. Control Optim. 25, 934–950 (1987)
16. Panier, ER, Tits, AL: On combining feasibility, descent and superlinear convergence in inequality constrained optimization. Math. Program. 59, 261–276 (1993)
17. Spellucci, P: An SQP method for general nonlinear programs using only equality constrained subproblems. Math. Program. 82, 413–448 (1998)
18. Bakhtiari, S, Tits, AL: A simple primal-dual feasible interior-point method for nonlinear programming with monotone descent. Comput. Optim. Appl. 25, 17–38 (2003)
19. El-Bakry, AS, Tapia, RA, Tsuchiya, T, Zhang, Y: On the formulation and theory of the Newton interior-point method for nonlinear programming. J. Optim. Theory Appl. 89, 507–541 (1996)
20. Forsgren, A, Gill, PE, Wright, MH: Interior methods for nonlinear optimization. SIAM Rev. 44, 525–597 (2002)
21. Jian, JB, Pan, HQ, Tang, CM, Li, JL: A strongly sub-feasible primal-dual quasi interior-point algorithm for nonlinear inequality constrained optimization. Appl. Math. Comput. 266, 560–578 (2015)
22. Panier, ER, Tits, AL, Herskovits, JN: A QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization. SIAM J. Control Optim. 26, 788–811 (1988)
23. Qi, HD, Qi, LQ: A new QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization. SIAM J. Optim. 11, 113–132 (2000)
24. Wang, YL, Chen, L, He, GP: Sequential systems of linear equations method for general constrained optimization problems without strict complementarity. J. Comput. Appl. Math. 182, 447–471 (2005)
25. Yang, YF, Li, DH, Qi, LQ: A feasible sequential linear equation method for inequality constrained optimization. SIAM J. Optim. 13, 1222–1244 (2003)
26. Zhu, ZB: An interior point type QP-free algorithm with superlinear convergence for inequality constrained optimization. Appl. Math. Model. 31, 1201–1212 (2007)
27. Maratos, N: Exact penalty function algorithm for finite dimensional and control optimization problems. Dissertation, Imperial College Science, Technology, University of London (1978)
28. Cai, XZ, Wu, L, Yue, YJ, Li, MM, Wang, GQ: Kernel-function-based primal-dual interior-point methods for convex quadratic optimization over symmetric cone. J. Inequal. Appl. 2014, 308 (2014)
29. Facchinei, F, Fischer, A, Kanzow, C: On the accurate identification of active constraints. SIAM J. Optim. 9, 14–32 (1998)
30. Chen, L, Wang, YL, He, GP: A feasible active set QP-free method for nonlinear programming. SIAM J. Optim. 17, 401–429 (2006)
31. Liu, Y, Jian, JB, Zhu, ZB: New active set identification for general constrained optimization and minimax problems. J. Math. Anal. Appl. 421, 1405–1416 (2015)
32. Wächter, A, Biegler, LT: Failure of global convergence for a class of interior point methods for nonlinear programming. Math. Program. 88, 565–574 (2000)
33. Hock, W, Schittkowski, K: Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems, vol. 187. Springer, Heidelberg (1981)