 Research
 Open Access
A class of derivative-free trust-region methods with interior backtracking technique for nonlinear optimization problems subject to linear inequality constraints
 Jing Gao^{1} and
 Jian Cao^{2}
https://doi.org/10.1186/s13660-018-1698-7
© The Author(s) 2018
 Received: 17 January 2018
 Accepted: 26 April 2018
 Published: 9 May 2018
Abstract
This paper focuses on a class of nonlinear optimization problems subject to linear inequality constraints whose objective functions have unavailable derivatives. We propose a derivative-free trust-region method with an interior backtracking technique for this optimization. The proposed algorithm has four properties. Firstly, a derivative-free strategy is applied to reduce the algorithm’s requirement for first- or second-order derivative information. Secondly, an interior backtracking technique not only reduces the number of iterations for solving the trust-region subproblem but also ensures global convergence to standard stationary points. Thirdly, the local convergence rate is analyzed under some reasonable assumptions. Finally, numerical experiments demonstrate that the new algorithm is effective.
Keywords
 Affine scaling
 Trust-region method
 Inequality constraints
 Derivative-free optimization
 Interior backtracking technique
MSC
 49M37
 65K05
 90C30
 90C51
1 Introduction
1.1 Affine-scaling matrix for inequality constraints
Motivation
The above discussion illustrates that the affine-scaling interior-point trust-region method is an effective way to solve nonlinear optimization problems with inequality constraints. The trust-region framework guarantees stable numerical performance. However, in Eqs. (4)–(6) the first- and second-order derivatives play important roles during the computational process, so these methods may fail on problems like (1), where derivatives are unavailable. If both the feasibility and the stability of the algorithm are to be guaranteed, we should consider derivative-free trust-region methods.
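Eqs. (4)–(6) are not reproduced in this excerpt, so the precise affine-scaling matrix used by the authors is not shown here. As a hedged sketch, one common Coleman–Li-style choice for linear constraints \(Ax\geqslant b\) builds a diagonal matrix from the constraint residuals; the function name and the specific choice \(D(x)=\operatorname{diag}(Ax-b)\) below are illustrative assumptions, not the paper’s exact definition:

```python
import numpy as np

def affine_scaling_matrix(A, b, x):
    """Diagonal affine-scaling matrix D(x) = diag(Ax - b) for constraints Ax >= b.

    Entries shrink to zero as x approaches the corresponding constraint
    boundary, which damps steps toward active constraints.
    """
    residuals = A @ x - b            # strictly positive for interior x
    if np.any(residuals <= 0):
        raise ValueError("x must be strictly interior: Ax > b")
    return np.diag(residuals)

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # constraints x1 >= 0, x2 >= 0
b = np.zeros(2)
x = np.array([0.5, 2.0])
D = affine_scaling_matrix(A, b, x)
```

Because the diagonal entries of D vanish as a constraint becomes active, scaled steps are automatically damped near the boundary, which is the mechanism affine-scaling methods exploit.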
1.2 Derivative-free technique for trust-region subproblem
Assumption
 (A1)
Suppose that a level set \(\mathcal{L}(x_{0})\) and a maximal radius \(\Delta_{\max}\) are given. Assume that f is twice continuously differentiable with Lipschitz continuous Hessian in an appropriate open domain containing the \(\Delta_{\max}\) neighborhood \(\bigcup_{x\in\mathcal{L}(x_{0})}B(x, \Delta_{\max})\) of the set \(\mathcal {L}(x_{0})\).
Definition 1
 1. the error between the Hessian of the model \(m(x +\alpha p)\) and the Hessian of the function \(f(x +\alpha p)\) satisfies $$ \bigl\Vert \nabla^{2}f(x+\alpha p)-\nabla^{2}m(x+\alpha p) \bigr\Vert \leqslant\kappa_{eh} \alpha \Delta, \quad \forall p \in B(0,\Delta); $$ (10)
 2. the error between the gradient of the model \(m(x +\alpha p)\) and the gradient of the function \(f(x +\alpha p)\) satisfies $$ \bigl\Vert \nabla f(x+\alpha p)-\nabla m(x +\alpha p) \bigr\Vert \leqslant\kappa_{eg} \alpha^{2} \Delta^{2}, \quad \forall p \in B(0,\Delta); $$ (11)
 3. the error between the model \(m(x +\alpha p)\) and the function \(f(x +\alpha p)\) satisfies $$ \bigl\vert f(x +\alpha p) - m(x +\alpha p) \bigr\vert \leqslant\kappa_{ef} \alpha^{3} \Delta^{3}, \quad \forall p \in B(0, \Delta). $$ (12)

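When derivatives are available, the second-order Taylor model is the canonical example of a model satisfying bounds of the type (10)–(12) (under the Lipschitz-Hessian assumption (A1)). The following sketch, with an arbitrarily chosen test function and point, checks numerically that the function-value error of such a model decays cubically with the radius, as in (12) with \(\alpha=1\):

```python
import numpy as np

def f(x):
    return np.exp(x[0]) + x[0] * x[1] ** 2

def grad(x):
    return np.array([np.exp(x[0]) + x[1] ** 2, 2 * x[0] * x[1]])

def hess(x):
    return np.array([[np.exp(x[0]), 2 * x[1]],
                     [2 * x[1], 2 * x[0]]])

x = np.array([0.3, 0.7])

def max_model_error(delta, samples=200):
    """Worst observed |f(x+p) - m(x+p)| over random p in B(0, delta)."""
    rng = np.random.default_rng(0)
    g, H = grad(x), hess(x)
    worst = 0.0
    for _ in range(samples):
        p = rng.normal(size=2)
        p *= delta * rng.random() / np.linalg.norm(p)   # random point in the ball
        m = f(x) + g @ p + 0.5 * p @ H @ p              # second-order Taylor model
        worst = max(worst, abs(f(x + p) - m))
    return worst

# Cubic decay: halving delta should reduce the error by roughly 2**3 = 8.
e1, e2 = max_model_error(0.1), max_model_error(0.05)
ratio = e1 / e2
```

The observed ratio sits near 8, the cubic-decay prediction; a fully quadratic model built by interpolation (rather than from true derivatives) must satisfy the same kind of bounds.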
We use the derivatives of the approximation function \(m(x_{k}+\alpha p)\) in place of the derivatives of the objective function \(f(x_{k}+\alpha p)\), which reduces the algorithm’s requirement for the gradient and Hessian at the iteration points. In each iteration, we solve an affine-scaling trust-region subproblem to find a feasible search direction.

At the kth iteration, a feasible search direction p is obtained from an affine-scaling trust-region subproblem. Meanwhile, an interior backtracking technique is applied both to determine the stepsize α and to guarantee the feasibility of the iteration point.

We will show that the iteration points generated by the proposed algorithm could converge to the optimal points of (1).
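Subproblem (7) is not reproduced in this excerpt; as a generic illustration of a trust-region subproblem \(\min_{\Vert p \Vert \leqslant\Delta} g^{T}p+\frac{1}{2}p^{T}Bp\), the sketch below computes the Cauchy point, a cheap surrogate for the exact solution (the paper’s subproblem additionally involves the affine-scaling matrix, which is omitted here):

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Cauchy point for min g.p + 0.5 p.B.p subject to ||p|| <= delta.

    Minimizes the quadratic model along the steepest-descent direction -g,
    clipped to the trust-region boundary.
    """
    gn = np.linalg.norm(g)
    if gn == 0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    if gBg <= 0:
        tau = 1.0                                  # model decreases to the boundary
    else:
        tau = min(1.0, gn ** 3 / (delta * gBg))    # interior minimizer if tau < 1
    return -tau * (delta / gn) * g

g = np.array([4.0, 0.0])
B = np.eye(2)
p = cauchy_point(g, B, 1.0)                        # boundary step of length delta
```

The Cauchy point already yields the fraction-of-optimal-decrease bound that trust-region convergence proofs rely on, which is why it is a standard stand-in when the exact subproblem solver is not spelled out.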

Local convergence will be given under some reasonable assumptions.
This paper is organized as follows. We describe a class of derivative-free trust-region methods in Sect. 2. The main results, including the global convergence property and the local convergence rate, are discussed in Sect. 3. The numerical results are illustrated in Sect. 4. Finally, we give some conclusions.
Notation
In this paper, \(\Vert \cdot \Vert \) is the 2-norm for a vector and the induced 2-norm for a matrix. \(B\subset\Re^{n}\) is a closed ball, and \(B(x,\Delta)\) is the closed ball centered at x with radius \(\Delta>0\). Y is a sample set, and \(\mathcal{L}(x_{0})= \{ x\in\Re^{n} : f(x)\leqslant f(x_{0}), Ax\geqslant b \} \) is the level set of the objective function f. We use the subscripts \(f_{k}\) and \(m_{k}\) to distinguish the relevant information of the original function and the approximate function. For example, \(H_{f_{k}}\) is the Hessian of f at the kth iteration and \(H_{m_{k}}\) is the Hessian of \(m_{k}\) at the kth iteration.
2 A derivative-free trust-region method with interior backtracking technique
Remark 1
We add a backtracking interior line-search technique to the algorithm, which helps to reduce the number of iterations. Equation (13a) guarantees the descent property of \(f(x)\), and (13b) ensures the feasibility of \(x_{k}+\alpha_{k} p_{k}\).
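Conditions (13a) and (13b) themselves are not reproduced in this excerpt. A hedged sketch of the backtracking interior line search, assuming an Armijo-type decrease test as a stand-in for (13a) and strict feasibility \(A(x+\alpha p)>b\) for (13b):

```python
import numpy as np

def interior_backtracking(f, A, b, x, p, alpha0=1.0, beta=0.25, sigma=1e-4,
                          max_iter=50):
    """Shrink alpha until both conditions hold:
    (a) f(x + alpha*p) <= f(x) - sigma * alpha**2 * ||p||**2   (descent, cf. (13a))
    (b) A (x + alpha*p) > b                                    (strict feasibility, cf. (13b))

    The exact forms of (13a)/(13b) in the paper may differ; this is an
    illustrative Armijo-like stand-in, not the authors' precise criterion.
    """
    fx = f(x)
    alpha = alpha0
    for _ in range(max_iter):
        xt = x + alpha * p
        if np.all(A @ xt > b) and f(xt) <= fx - sigma * alpha ** 2 * (p @ p):
            return alpha
        alpha *= beta                      # backtrack
    raise RuntimeError("line search failed")

# Minimize f(x) = ||x||^2 subject to x >= 0.1 (written as I x >= 0.1).
f = lambda x: x @ x
A, b = np.eye(2), np.full(2, 0.1)
x = np.array([1.0, 1.0])
p = -x                                     # descent direction; x + p = 0 is infeasible
alpha = interior_backtracking(f, A, b, x, p)
```

Here the full step α = 1 would land exactly on the boundary, so the search backtracks once to α = 0.25, which is both strictly feasible and sufficiently descending.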
Remark 2
Remark 3
3 Main results and discussion
In this section, we mainly discuss some properties of the proposed algorithm, including the error bounds, the sufficient descent property, and the global and local convergence properties. First of all, we make some necessary assumptions as follows.
Assumptions
 (A2)
The level set \(\mathcal{L}(x_{0})\) is bounded.
 (A3)
There exist positive constants \(\kappa_{g_{f}}\) and \(\kappa_{g_{m}}\) such that \(\Vert \nabla{f_{k}} \Vert \leqslant\kappa_{g_{f}}\) and \(\Vert g_{k} \Vert \leqslant \kappa_{g_{m}}\), respectively, for all \(x_{k}\in\mathcal{L}(x_{0})\).
 (A4)
There exist positive constants \(\kappa_{H_{f}}\) and \(\kappa_{H_{m}}\) such that \(\Vert H_{f_{k}} \Vert \leqslant \kappa_{H_{f}}\) and \(\Vert H_{m_{k}} \Vert \leqslant\kappa _{H_{m}}\), respectively, for all \(x_{k}\in\mathcal{L}(x_{0})\).
 (A5)
\([ \begin{matrix} A & D_{k}^{\frac{1}{2}} \end{matrix} ] \) is full row rank for all \(x_{k}\in\mathcal{L}(x_{0})\).
3.1 Error bounds
Observe first that some error bounds hold immediately.
Lemma 1
Proof
Lemma 2
Proof
Lemma 3
Suppose that (A1)–(A5), the error bounds (10)–(12) and the fact that \(\Delta_{k}\leqslant\Delta_{\max}\) hold. If \(\Vert \nabla f_{k}^{T}h_{f_{k}} \Vert \neq0\), then step 3 of Algorithm 1 will stop in a finite number of improvement steps.
Proof
We now prove that \(\Vert \nabla f_{k}^{T}h_{f_{k}} \Vert \) must be zero if the loop of Algorithm 2 is infinite.
In fact, there are two cases that could cause Algorithm 2 to be invoked: either \(m_{k}\) is not fully quadratic, or the radius satisfies \(\Delta_{k}>\iota \Vert g_{k}^{T}h_{m_{k}} \Vert \). Then set \(m_{k}^{(0)}=m_{k}\) and improve the model to be fully quadratic on \(B(x_{k}, \Delta_{k})\); the result is denoted by \(m^{(1)}_{k}\). If \((g_{k}^{T}h_{m_{k}})^{(1)}\) of \(m_{k}^{(1)}\) satisfies the inequality \(\iota \Vert (g_{k}^{T}h_{m_{k}})^{(1)} \Vert \geqslant\Delta_{k}\), Algorithm 2 stops with \(\widetilde{\Delta}_{k}=\Delta_{k} \leqslant\iota \Vert (g_{k}^{T}h_{m_{k}})^{(1)} \Vert \).
Otherwise, \(\iota \Vert (g_{k}^{T}h_{m_{k}})^{(1)} \Vert <\Delta _{k} \) holds. Algorithm 2 then improves the model on \(B(x_{k},\omega\Delta_{k})\), and the resulting model is denoted by \(m^{(2)}_{k}\). If \(m^{(2)}_{k}\) satisfies \(\iota \Vert (g_{k}^{T}h_{m_{k}})^{(2)} \Vert \geqslant\omega\Delta_{k}\), the procedure stops. If not, the radius is multiplied by ω again, Algorithm 2 improves the model on \(B(x_{k},\omega^{2} \Delta_{k})\), and so on.
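The improvement loop just described can be sketched as follows; `improve_model` is a hypothetical callback standing in for the model-improvement subroutine (it rebuilds the model to be fully quadratic on \(B(x_{k},\cdot)\) and returns the new criticality measure \(\Vert g_{k}^{T}h_{m_{k}} \Vert \)):

```python
def criticality_loop(improve_model, delta0, iota=0.5, omega=0.3, max_iter=100):
    """Sketch of the loop described above (Algorithm 2 is not reproduced in
    this excerpt, so the structure below is inferred from the prose).

    At radius omega**i * delta0 the model is improved to be fully quadratic;
    the loop stops once iota * measure >= current radius and returns that radius.
    """
    delta = delta0
    for _ in range(max_iter):
        measure = improve_model(delta)   # rebuild model on B(x_k, delta)
        if iota * measure >= delta:
            return delta                 # stopping radius, cf. Delta-tilde_k
        delta *= omega                   # shrink and improve again
    raise RuntimeError("criticality loop did not terminate")

# Mock improver whose criticality measure is fixed at 0.2:
delta = criticality_loop(lambda d: 0.2, delta0=1.0)
```

With ι = 0.5 and ω = 0.3 the loop shrinks the radius from 1.0 to 0.3 to 0.09, at which point 0.5 · 0.2 = 0.1 ⩾ 0.09 and it stops, mirroring the finite termination argued for in Lemma 3 when the measure is bounded away from zero.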
3.2 Sufficient descent property
Lemma 4
Suppose that (A1)–(A5) and the error bounds (10)–(12) hold, and \(p_{k}\) is the solution of the trust-region subproblem (7). Then there must exist an appropriate \(\alpha_{k}>0\) which satisfies inequality (13a).
Proof
We therefore see that it is reasonable to design the line-search criterion in step 5, which provides us with a nonincreasing sequence \(\{f(x_{k})\}\).
Lemma 5
Proof
3.3 Global convergence
Every iteration point at the \((k+1)\)th iteration is chosen from the region \(B(x_{k}, \alpha_{k} \Delta_{k})\). The following lemma first shows that the current iteration must be successful if \(\alpha_{k} \Delta_{k}\) is small enough.
Lemma 6
Proof
Lemma 7
Proof
We assume that the number of model-improving iterations before \(m_{k}\) becomes fully quadratic is bounded by a constant N. Suppose that the current iteration comes after a successful one. Then an infinite number of the subsequent iterations are merely acceptable or unsuccessful, and in both cases \(\Delta_{k}\) shrinks. Furthermore, \(\Delta_{k}\) is reduced by a factor ζ at least once every N iterations, which implies \(\Delta_{k}\rightarrow0\).
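The paper’s exact radius-update rule is not reproduced in this excerpt; the following is the standard ratio-based update, sketched with the parameter values \(\eta_{0}=0.25\), \(\eta_{1}=0.75\), \(\zeta=0.5\), \(\varsigma=1.5\) listed in Sect. 4, where the shrink-by-ζ branch is the one driving \(\Delta_{k}\rightarrow0\):

```python
def update_radius(rho, delta, delta_max, eta0=0.25, eta1=0.75,
                  zeta=0.5, varsigma=1.5):
    """Standard trust-region radius update (an illustrative stand-in for the
    paper's rule, using the parameter values from Sect. 4).

    rho is the ratio of actual to predicted reduction.
    """
    if rho < eta0:                           # poor agreement: shrink by zeta
        return zeta * delta
    if rho >= eta1:                          # very good agreement: expand, capped
        return min(varsigma * delta, delta_max)
    return delta                             # acceptable: keep the radius

d = update_radius(0.9, delta=2.0, delta_max=4.0)   # successful step: expand
```

Capping the expansion at \(\Delta_{\max}\) is what makes the three values 4, 6, 8 tested in Sect. 4 meaningful.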
Lemma 8
Proof
Lemma 9
Proof
Then we obtain the global convergence derived from Lemmas 8 and 9.
Theorem 1
The above theorem shows that there exists a limit point that is first-order critical. In fact, we are able to prove that all limit points of the sequence of iterates are first-order critical.
Theorem 2
Proof
3.4 Local convergence
Having proved the global convergence, we now focus on the speed of the local convergence. To this end, some additional reasonable assumptions are given as follows.
Assumptions
 (A6) \(x_{*}\) is the solution of problem (1), which satisfies the strong second-order sufficient condition; that is, letting the columns of \(Z_{*}\) denote an orthogonal basis for the null space of \([ \begin{matrix} A & D^{\frac{1}{2}}_{*} \end{matrix} ] \), there exists \(\varpi>0\) such that $$ d^{T} \bigl(Z_{*}^{T}M_{f_{*}}Z_{*} \bigr)d \geqslant\varpi \Vert d \Vert ^{2}, \quad \forall d. $$ (33)
 (A7) Let $$ \lim_{k \rightarrow\infty}\frac{ \Vert (M_{m_{k}}-M_{f_{k}})Z_{k} p_{k} \Vert }{ \Vert p_{k} \Vert }=0. $$ (34) This means that, for large k, $$\begin{aligned} p_{k}^{T} \bigl(Z_{k}^{T} M_{m_{k}}Z_{k} \bigr)p_{k} =&p_{k}^{T} \bigl(Z_{k}^{T}M_{f_{k}}Z_{k} \bigr)p_{k}+o \bigl( \Vert p_{k} \Vert ^{2} \bigr). \end{aligned}$$
Theorem 3
Suppose that (A1)–(A7), the error bounds (10)–(12) and (23) hold. \(\{x_{k}\}\) is a sequence generated by Algorithm 1. Suppose furthermore that the strict complementarity of the problem (1) holds. Then, for sufficiently large k, the stepsize \(\alpha_{k}\equiv1\) and there exists \(\hat{\Delta}>0\) such that \(\Delta_{k}\geqslant\Delta_{K'}\geqslant\hat{\Delta}\), \(\forall k\geqslant K'\), where \(K'\) is a large enough index.
Proof
If \(\Vert p_{k} \Vert <\Delta_{k}\), then \(v_{m_{k}}=0\). Since the strict complementarity of problem (1) holds at every limit point of \(\{x_{k}\}\), i.e., \(\vert \lambda _{m_{k+1}}^{j} \vert + \vert a_{j}^{T}x_{k}-b_{j} \vert >0\) for all large k, we have \(\lambda_{m_{k+1}}=\lambda_{m_{k+1}}^{N}>0\) when \(v_{m_{k}}=0\). So \(\lambda_{m_{k+1}}^{j}=(\lambda_{m_{k+1}}^{N})^{j}>0\). From (35), it is clear that \(\lim_{k \rightarrow\infty}\alpha_{k}=1\).
From the above, we have found that if \(\Vert g_{k}^{T}h_{m_{k}} \Vert \geqslant\varepsilon^{2}\) holds and \(\Delta_{k}\rightarrow 0\), we conclude that \(\lim_{k\rightarrow\infty}\alpha_{k}=+\infty\), and \(\lim_{k\rightarrow\infty}\theta_{k}=1\).
Further, by the condition on the strictly feasible stepsize \(\theta_{k}-1=O( \Vert p_{k} \Vert )\), and \(\lim_{k\rightarrow \infty}p_{k}=0\), we have \(\lim_{k\rightarrow\infty}\theta_{k}=1\).
For a similar proof, we can obtain \(p_{k}\rightarrow0\). Combining (37) with (38), one has the fact that \(\rho_{k}\rightarrow1\). Hence there exists \(\hat{\Delta}>0\) such that when \(\Vert p_{k} \Vert \leqslant\hat{\Delta}\), \(\hat{\rho}_{k}\geqslant\rho_{k} \geqslant\eta_{2}\), and \(\Delta _{k+1}\geqslant\Delta_{k}\). As \(p_{k}\rightarrow0\), there exists an index \(K'\) such that \(\Vert p_{k} \Vert \leqslant\hat{\Delta}\) whenever \(k\geqslant K'\). Thus, the conclusion holds. □
Theorem 3 implies that the local convergence rate of Algorithm 1 depends on the Hessian at \(x_{*}\) and on the local convergence rate of \(p_{k}\). In particular, if \(p_{k}\) is a quasi-Newton step, then for sufficiently large k the sequence \(\{x_{k}\}\) converges superlinearly to the optimal point \(x_{*}\).
4 Numerical experiments
We now demonstrate the numerical performance of the proposed derivative-free trust-region method.
Environment: The algorithms are written in Matlab R2009a and run on a PC with a 2.66 GHz Intel(R) Core(TM)2 Quad CPU and 4 GB of DDR2 RAM.
Initialization: The values \(\Delta_{0}=2\), \(\eta_{0}=0.25\), \(\eta_{1}=0.75\), \(\zeta=0.5\), \(\varsigma=1.5\), \(\iota=0.5\), \(\beta=0.25\), \(\alpha=0.2\), \(\varepsilon= 10^{-8}\) and \(\omega=0.3\) are used. \(\Delta_{\max}\) is equal to 4, 6, 8, respectively.
Termination criteria: \(\Vert g_{k} ^{T}h_{m_{k}} \Vert \leqslant\varepsilon\).
Test problems
No.  Problem  Dim  \(x_{0}\)

1  HS21  2  [−1,−1]
2  HS24  2  [1,0.5]
3  HS25  3  [100,12.5,3]
4  HS35  3  [0.5,0.5,0.5]
5  HS36  3  [10,10,10]
6  HS37  3  [10,10,10]
7  HS44  4  [0,0,0,0]
8  HS45  5  [2,2,2,2,2]
9  HS76  4  [0.5,0.5,0.5,0.5]
10  HS224  2  [0.1,0.1]
11  HS231  2  [−1.2,1]
12  HS232  2  [2,0.5]
13  HS224  2  [0.1,0.1]
14  HS232  2  [2,0.5]
15  HS250  3  [10,10,10]
16  HS251  3  [10,10,10]
17  HS253  3  [0,2,0]
18  HS268  5  [1,1,…,1]
19  HS331  2  [0.5,0.1]
20  HS340  3  [1,1,1]
Of course, the level set is used to bound \(\Vert \nabla f(x) \Vert \) during program execution, and the bound actually attained is much smaller than this value. Even if the boundedness of the gradient and of the Hessian of the objective function cannot be guaranteed everywhere, it can at least be guaranteed within the level set.
Experiment results on linear inequality constrained optimization problems
Problem name  n  \(\Delta_{\max}=4\)  \(\Delta_{\max}=6\)  \(\Delta_{\max}=8\)
                 nf  CPUt            nf  CPUt            nf  CPUt

HS224  2  26  5.187  23  3.35  23  3.35
HS231  2  16  2.025  18  4.018  F  F
HS232  2  8  2.387  23  2.455  F  F
HS250  3  12  55  17  73  16  61
HS251  3  35  3.036  32  2.022  37  2.332

(F denotes a run that failed to satisfy the termination criterion.)
5 Conclusions
 (1)
This algorithm is mainly designed to solve optimization problems arising in engineering whose derivatives are unavailable. The proposed algorithm adopts an interior backtracking technique and possesses the trust-region property.
 (2)
The global convergence is proved by using the definition of a fully quadratic model. It shows that the iteration points generated by the proposed algorithm converge to the optimal points of (1). Meanwhile, we obtain the result that the local convergence rate of the proposed algorithm depends on \(p_{k}\): if \(p_{k}\) becomes the quasi-Newton step, then the sequence \(\{x_{k}\}\) generated by the algorithm converges to \(x_{*}\) superlinearly.
 (3)
The preliminary numerical experiments verify that the new algorithm is feasible and effective for solving linear inequality constrained optimization problems with unavailable derivatives.
Declarations
Acknowledgements
This work is supported by the National Science Foundation of China under Grant No. 11626037, the 13th Five-Year Science and Technology Project of the Education Department of Jilin Province under Grant No. JJKH20170036KJ, the PhD Start-up Fund of the Natural Science Foundation of Beihua University, and the Youth Training Project Foundation of Beihua University.
Authors’ contributions
All authors contributed equally and significantly in writing this article. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 Kanzow, C., Klug, A.: An interior-point affine-scaling trust-region method for semismooth equations with box constraints. Comput. Optim. Appl. 37(3), 329–353 (2007)
 Kanzow, C., Klug, A.: On affine-scaling interior-point Newton methods for nonlinear minimization with bound constraints. Comput. Optim. Appl. 35(2), 177–197 (2006)
 Heinkenschloss, M., Ulbrich, M., Ulbrich, S.: Superlinear and quadratic convergence of affine-scaling interior-point Newton methods for problems with simple bounds without strict complementarity assumption. Math. Program. 86(3), 615–635 (1999)
 Liuzzi, G., Lucidi, S., Sciandrone, M.: Sequential penalty derivative-free methods for nonlinear constrained optimization. SIAM J. Optim. 20(5), 2614–2635 (2010)
 Coleman, T.F., Li, Y.: A trust region and affine scaling interior point method for nonconvex minimization with linear inequality constraints. Math. Program. 88(1), 1–31 (1997)
 Zhu, D.: A new affine scaling interior point algorithm for nonlinear optimization subject to linear equality and inequality constraints. J. Comput. Appl. Math. 161(1), 1–25 (2003)
 Sahu, D.R., Yao, J.C.: A generalized hybrid steepest descent method and applications. J. Nonlinear Var. Anal. 1, 111–126 (2017)
 Gibali, A.: Two simple relaxed perturbed extragradient methods for solving variational inequalities in Euclidean spaces. J. Nonlinear Var. Anal. 2, 49–61 (2018)
 Zhang, H., Conn, A.R., Scheinberg, K.: A derivative-free algorithm for least-squares minimization. SIAM J. Optim. 20(6), 3555–3576 (2010)
 Zhang, H., Conn, A.R.: On the local convergence of a derivative-free algorithm for least-squares minimization. Comput. Optim. Appl. 51(2), 481–507 (2012)
 Liuzzi, G., Lucidi, S., Rinaldi, F.: A derivative-free approach to constrained multiobjective nonsmooth optimization. SIAM J. Optim. 26(4), 2744–2774 (2016)
 Tung, L.T.: Higher-order contingent derivative of perturbation maps in multiobjective optimization. J. Nonlinear Funct. Anal. 2015, 19 (2015)
 Conn, A.R., Scheinberg, K., Vicente, L.N.: Global convergence of general derivative-free trust-region algorithms to first- and second-order critical points. SIAM J. Optim. 20(1), 387–415 (2006)
 Jing, G., Zhu, D.: An affine scaling derivative-free trust region method with interior backtracking technique for bounded-constrained nonlinear programming. J. Syst. Sci. Complex. 27(3), 537–564 (2014)
 Hock, W., Schittkowski, K.: Test Examples for Nonlinear Programming Codes. Springer, Bayreuth (1987)
 Schittkowski, K.: More test examples for nonlinear programming codes (1987)
 Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)