A trust region spectral method for large-scale systems of nonlinear equations
Journal of Inequalities and Applications volume 2016, Article number: 174 (2016)
Abstract
The spectral gradient method is one of the most effective methods for solving large-scale systems of nonlinear equations. In this paper, we propose a new trust region spectral method that does not use gradient information. The trust region technique serves as the globalization strategy in our method. The global convergence of the proposed algorithm is proved. The numerical results show that our new method is more competitive than the spectral method of La Cruz et al. (Math. Comput. 75(255):1429–1448, 2006) for large-scale nonlinear equations.
1 Introduction
In this paper we introduce a trust region spectral method for solving large-scale systems of nonlinear equations
$$ F(x)=0, $$ (1)
where \(F: R^{n}\rightarrow R^{n}\) is continuously differentiable, its Jacobian matrix \(J(x)\in R^{n\times n}\) is sparse, and n is large. Large-scale systems of nonlinear equations arise in many applications, such as network flow problems and discrete boundary value problems.
Many algorithms have been presented for solving the large-scale problem (1). Bouaricha et al. [2] proposed tensor methods. Bergamaschi et al. [3] proposed inexact quasi-Newton methods. These methods need to calculate the Jacobian matrix or an approximation of it at each iteration. La Cruz and Raydan [4] introduced the spectral method for (1). The method uses the residual \(\pm F(x_{k})\) as a search direction and the trial point at each iteration is \(x_{k}-\lambda_{k}F(x_{k})\), where \(\lambda _{k}\) is a spectral coefficient. \(\lambda_{k}\) satisfies the Grippo-Lampariello-Lucidi (GLL) line search condition
$$ f(x_{k}+\lambda_{k}d_{k})\leq\max_{0\leq j\leq\min\{k,M\}}f(x_{k-j})+\alpha\lambda_{k}\nabla f(x_{k})^{T}d_{k}, $$ (2)
where \(f(x)= \frac{1}{2}\Vert F(x)\Vert ^{2}\), M is a nonnegative integer, \(\alpha\) is a small positive number and \(d_{k}=\pm F(x_{k})\). This method also requires computing a directional derivative, or a very good approximation of it, at every iteration. Later La Cruz et al. [1] proposed a spectral method without gradient information, which uses the nonmonotone line search globalization strategy
$$ f(x_{k}+\lambda_{k}d_{k})\leq\max_{0\leq j\leq\min\{k,M\}}f(x_{k-j})+\eta_{k}-\alpha\lambda_{k}^{2}f(x_{k}), $$ (3)
where \(\sum_{k} \eta_{k} \leq\eta<\infty\). Meanwhile, conjugate gradient techniques have been developed for solving large-scale nonlinear equations (see [5–7]). In fact, spectral gradient, BFGS quasi-Newton, and conjugate gradient methods can solve large-scale optimization problems and systems of nonlinear equations (see [8–13]). The advantage of spectral methods is that the storage of certain matrices associated with the Hessian of objective functions can be avoided.
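The nonmonotone acceptance test used by the derivative-free spectral method can be sketched in a few lines. This is only an illustration: it assumes the test has the commonly quoted form \(f(x_{k}+\lambda d_{k})\leq\max_{0\leq j\leq\min\{k,M\}}f(x_{k-j})+\eta_{k}-\alpha\lambda^{2}f(x_{k})\), it takes \(\eta_{k}=1/(k+1)^{2}\) as in the experiments of Section 4, and the function name and argument layout are hypothetical.

```python
def nonmonotone_accept(f_trial, f_history, step, M=10, alpha=0.5):
    """Return True if a trial value f_trial passes the nonmonotone test.

    f_history holds f(x_0), ..., f(x_k); step is the trial step length.
    Assumed form of the test (an assumption of this sketch):
        f_trial <= max of the last min(k, M) + 1 values + eta_k
                   - alpha * step**2 * f(x_k)
    """
    k = len(f_history) - 1
    eta_k = 1.0 / (k + 1) ** 2            # summable sequence, as in Section 4
    f_ref = max(f_history[-(M + 1):])     # reference value over recent iterates
    return f_trial <= f_ref + eta_k - alpha * step ** 2 * f_history[-1]
```

Because the reference value is a maximum over recent iterates plus \(\eta_{k}\), a trial point may be accepted even when it increases \(f\) relative to the current iterate.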
The purpose of this paper is to extend the spectral method for solving large-scale systems of nonlinear equations by using the trust region technique. In traditional trust region methods [14], at each iterate \(x_{k}\), the trial step \(d_{k}\) is obtained by solving the following trust region subproblem:
$$ \min_{d\in R^{n}} q_{k}(d) \quad\text{s.t. } \Vert d\Vert \leq\Delta_{k}, $$ (4)
where \(q_{k}(d)=\frac{1}{2}\Vert F(x_{k})+J(x_{k})d\Vert ^{2}\).
The above trust region methods are particularly effective for small to medium-sized systems of nonlinear equations; however, the computation and storage costs grow rapidly with the dimension.
For large-scale systems of nonlinear equations, we use \(\gamma_{k} I\) as an approximation of \(J(x_{k})\). At each iterate \(x_{k}\) in our method, the trial step \(d_{k}\) is obtained by solving the following subproblem:
$$ \min_{d\in R^{n}} \frac{1}{2}\Vert F_{k}+\gamma_{k}d\Vert ^{2} \quad\text{s.t. } \Vert d\Vert \leq\Delta_{k}, $$ (5)
where \(\gamma_{k}\) is the spectral coefficient and \(F_{k}=F(x_{k})\). The classic quasi-Newton equation is
$$ B_{k+1}d_{k}=y_{k}. $$ (6)
Left-multiplying (6) by \(y_{k}^{T}\) and setting \(B_{k+1}=\gamma _{k+1}I\), we obtain
$$ \gamma_{k+1}=\frac{y_{k}^{T}y_{k}}{y_{k}^{T}d_{k}}, $$ (7)
where \(d_{k}=x_{k+1}-x_{k}\) and \(y_{k}=F_{k+1}-F_{k}\).
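As a small illustration, the spectral coefficient obtained from the quasi-Newton equation with \(B_{k+1}=\gamma_{k+1}I\), namely \(\gamma_{k+1}=y_{k}^{T}y_{k}/y_{k}^{T}d_{k}\), can be computed as follows. The function name, the plain-list vectors, and the fallback used when \(y_{k}^{T}d_{k}\) is close to zero are assumptions of this sketch; the paper does not specify a safeguard.

```python
def spectral_coefficient(x_new, x_old, F_new, F_old, fallback=1.0, tiny=1e-12):
    """Compute gamma = (y^T y) / (y^T d) with d = x_new - x_old, y = F_new - F_old.

    The fallback value returned for a near-zero denominator is an assumption
    of this sketch, not part of the method as published.
    """
    d = [a - b for a, b in zip(x_new, x_old)]
    y = [a - b for a, b in zip(F_new, F_old)]
    yd = sum(a * b for a, b in zip(y, d))
    if abs(yd) < tiny:
        return fallback
    return sum(a * a for a in y) / yd


# For the linear map F(x) = 3x the coefficient recovers the Jacobian scale.
x_old, x_new = [1.0, 2.0], [0.5, 1.0]
gamma = spectral_coefficient(x_new, x_old,
                             [3.0 * t for t in x_new],
                             [3.0 * t for t in x_old])
print(gamma)  # 3.0
```

Only two inner products are needed, so the cost and storage are linear in n, which is the point of replacing \(J(x_{k})\) by a scalar multiple of the identity.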
The paper is organized as follows. Section 2 introduces the new algorithm. The convergence theory is presented in Section 3. Section 4 reports preliminary numerical results on test problems.
2 New algorithm
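In this section, we give a trust region spectral method for solving large-scale systems of nonlinear equations. Let \(d_{k}\) be the solution of the trust region subproblem (5). We define the actual reduction as
$$ \mathit{Ared}_{k}(d_{k})=f(x_{k})-f(x_{k}+d_{k}), $$ (8)
and the predicted reduction as
$$ \mathit{Pred}_{k}(d_{k})=\frac{1}{2}\Vert F_{k}\Vert ^{2}-\frac{1}{2}\Vert F_{k}+\gamma_{k}d_{k}\Vert ^{2}. $$ (9)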
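The two reductions can be evaluated with a few vector operations. The sketch below assumes the usual trust region definitions applied to the spectral model, that is, the actual reduction \(f(x_{k})-f(x_{k}+d_{k})\) with \(f=\frac{1}{2}\Vert F\Vert ^{2}\) and the predicted reduction \(\frac{1}{2}\Vert F_{k}\Vert ^{2}-\frac{1}{2}\Vert F_{k}+\gamma_{k}d_{k}\Vert ^{2}\); the function names are hypothetical.

```python
def half_sq_norm(v):
    """Return 0.5 * ||v||^2 for a vector given as a list of floats."""
    return 0.5 * sum(t * t for t in v)


def reductions(F, x, d, gamma):
    """Return (ared, pred) for the trial step d at the point x.

    ared compares the merit function f = 0.5*||F||^2 at x and at x + d;
    pred compares the model 0.5*||F(x) + gamma*d||^2 at 0 and at d.
    """
    Fx = F(x)
    x_trial = [xi + di for xi, di in zip(x, d)]
    ared = half_sq_norm(Fx) - half_sq_norm(F(x_trial))
    pred = half_sq_norm(Fx) - half_sq_norm(
        [fi + gamma * di for fi, di in zip(Fx, d)])
    return ared, pred


# For F(x) = 2x and gamma = 2 the model is exact, so the two values agree.
ared, pred = reductions(lambda x: [2.0 * t for t in x],
                        [1.0, 1.0], [-0.5, -0.5], 2.0)
print(ared, pred)  # 3.0 3.0
```

The ratio of the two values measures how well the scalar model \(\gamma_{k}I\) predicts the true decrease of \(f\) along the trial step.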
Now we present our algorithm for solving (1). The algorithm is given as follows.
Algorithm 1
Step 0.
Choose \(0<\eta_{1} <\eta_{2} < 1\), \(0<\beta_{1}<1 <\beta_{2}\), \(\epsilon>0\). Initialize \(x_{0}\), \(0<\Delta_{0} <\bar{\Delta}\). Set \(k:=0\).
Step 1.
Evaluate \(F_{k}\). If \(\Vert F_{k}\Vert \leq\epsilon\), then terminate.
Step 2.
Solve the trust region subproblem (5) to obtain \(d_{k}\).
Step 3.
Compute
$$ r_{k} = \frac{\mathit{Ared}_{k}(d_{k})}{\mathit{Pred}_{k}(d_{k})}. $$ (10)
If \(r_{k} < \eta_{1}\), then set \(\Delta_{k}:= \beta_{1} \Delta_{k}\) and go to Step 2. Otherwise, go to Step 4.
Step 4.
Set \(x_{k+1}=x_{k} + d_{k}\) and
$$\Delta_{k + 1} = \textstyle\begin{cases} \min\{\beta_{2} \Delta_{k}, \bar{\Delta}\}, & \text{if } r_{k} \geq\eta_{2},\\ \Delta_{k}, & \text{otherwise}. \end{cases} $$
Compute \(\gamma_{k+1}\) by (7). Set \(k:=k+1\) and go to Step 1.
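The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' MATLAB implementation. It assumes the spectral model \(\frac{1}{2}\Vert F_{k}+\gamma_{k}d\Vert ^{2}\) and solves the subproblem in closed form (the model minimizer \(-F_{k}/\gamma_{k}\), scaled back to the trust region boundary when it is too long) instead of calling a dogleg routine. The initial value `gamma0`, the guard on \(y_{k}^{T}d_{k}\), and the guard on a collapsing radius are assumptions of the sketch. The default parameters are those reported in Section 4, with tolerance \(10^{-5}\).

```python
import math


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def solve_subproblem(Fk, gamma, delta):
    """Minimize 0.5*||Fk + gamma*d||^2 subject to ||d|| <= delta (closed form)."""
    nF = norm(Fk)
    if nF == 0.0:
        return [0.0] * len(Fk)
    t = min(nF / abs(gamma), delta)        # step length along -sign(gamma)*Fk
    s = -math.copysign(t, gamma) / nF
    return [s * fi for fi in Fk]


def trust_region_spectral(F, x0, delta0=1.0, delta_max=10.0, eps=1e-5,
                          eta1=0.001, eta2=0.75, beta1=0.5, beta2=2.0,
                          gamma0=1.0, max_iter=5000):
    """Return (x, k): the final iterate and the number of iterations used."""
    x, delta, gamma = list(x0), delta0, gamma0   # gamma0 is an assumption
    for k in range(max_iter):
        Fk = F(x)                                # Step 1
        if norm(Fk) <= eps:
            return x, k
        fk = 0.5 * sum(t * t for t in Fk)
        while True:                              # Steps 2 and 3
            d = solve_subproblem(Fk, gamma, delta)
            x_new = [xi + di for xi, di in zip(x, d)]
            F_new = F(x_new)
            ared = fk - 0.5 * sum(t * t for t in F_new)
            pred = fk - 0.5 * sum((fi + gamma * di) ** 2
                                  for fi, di in zip(Fk, d))
            r = ared / pred
            if r >= eta1:
                break
            delta *= beta1                       # shrink and re-solve
            if delta < 1e-14:                    # guard added for this sketch
                raise RuntimeError("trust region radius collapsed")
        if r >= eta2:                            # Step 4
            delta = min(beta2 * delta, delta_max)
        y = [a - b for a, b in zip(F_new, Fk)]
        yd = sum(a * b for a, b in zip(y, d))
        if abs(yd) > 1e-12:                      # guard added for this sketch
            gamma = sum(a * a for a in y) / yd
        x = x_new
    return x, max_iter
```

A call such as `trust_region_spectral(lambda x: [2 * t + math.sin(t) for t in x], [1.0, -0.5, 2.0])` exercises the loop on a small separable system; the test function here is a made-up example, not one of the paper's test problems.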
3 Convergence analysis
In this section, we prove the global convergence of Algorithm 1, which requires the following assumptions.
Assumption A

(1)
The level set \(\Omega=\{x\in R^{n} \vert f(x)\leq f(x_{0}) \} \) is bounded.

(2)
The following relation holds:
$$\bigl\Vert [J_{k}-\gamma_{k}I]^{T} F_{k}\bigr\Vert =O\bigl(\Vert d_{k}\Vert \bigr). $$
Then we get the following lemmas.
Lemma 3.1
\(\vert \mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k})\vert =O(\Vert d_{k}\Vert ^{2})\).
Proof
This completes the proof. □
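One way to verify the estimate of Lemma 3.1 is sketched here. The sketch relies on assumptions that are not stated in Assumption A: that \(J\) is Lipschitz continuous on a bounded set containing the iterates and trial points, so that \(F(x_{k}+d_{k})=F_{k}+J_{k}d_{k}+O(\Vert d_{k}\Vert ^{2})\) with \(F\) and \(J\) bounded there, and that \(\{\gamma_{k}\}\) is bounded. It also takes \(\mathit{Ared}_{k}(d_{k})=f(x_{k})-f(x_{k}+d_{k})\) and \(\mathit{Pred}_{k}(d_{k})=\frac{1}{2}\Vert F_{k}\Vert ^{2}-\frac{1}{2}\Vert F_{k}+\gamma_{k}d_{k}\Vert ^{2}\). Then

$$ \begin{aligned} \mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k}) &= \frac{1}{2}\Vert F_{k}+\gamma_{k}d_{k}\Vert ^{2}-\frac{1}{2}\bigl\Vert F(x_{k}+d_{k})\bigr\Vert ^{2} \\ &= \gamma_{k}F_{k}^{T}d_{k}+\frac{1}{2}\gamma_{k}^{2}\Vert d_{k}\Vert ^{2}-F_{k}^{T}J_{k}d_{k}+O\bigl(\Vert d_{k}\Vert ^{2}\bigr) \\ &= -\bigl([J_{k}-\gamma_{k}I]^{T}F_{k}\bigr)^{T}d_{k}+O\bigl(\Vert d_{k}\Vert ^{2}\bigr). \end{aligned} $$

By the Cauchy-Schwarz inequality and Assumption A(2), \(\vert ([J_{k}-\gamma_{k}I]^{T}F_{k})^{T}d_{k}\vert \leq\Vert [J_{k}-\gamma_{k}I]^{T}F_{k}\Vert \Vert d_{k}\Vert =O(\Vert d_{k}\Vert ^{2})\), which gives the stated bound.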
Similar to Zhang and Wang [15], or Yuan et al. [16], we obtain the following result.
Lemma 3.2
If \(d_{k}\) is a solution of (5), then
Proof
Since \(d_{k}\) is a solution of (5), for any \(\alpha\in[0,1]\), it follows that
Then we have
The proof is complete. □
Lemma 3.3
Algorithm 1 does not cycle between Step 2 and Step 3 infinitely.
Proof
If Algorithm 1 cycles between Step 2 and Step 3 infinitely, then for all \(i=1,2,\ldots\), we have \(x_{k+i}=x_{k}\) and \(\Vert F_{k}\Vert > \epsilon\), which implies that \(r_{k} < \eta_{1}\) and \(\Delta_{k}\to0\).
By Lemmas 3.1 and 3.2, we have
Therefore, for k sufficiently large
this contradicts the fact that \(r_{k}<\eta_{1}\). □
Lemma 3.4
Let Assumption A hold and let \(\{x_{k}\}\) be generated by Algorithm 1. Then \(\{x_{k}\}\subset\Omega\). Moreover, \(\{ f(x_{k})\}\) converges.
Proof
By the definition of Algorithm 1, we have
This implies
Therefore, \(\{x_{k}\}\subset\Omega\). Since \(\{f(x_{k})\}\) is nonincreasing and \(f(x_{k})\ge0\), we know that \(\{f(x_{k})\}\) converges. □
The following theorem shows that Algorithm 1 is globally convergent under Assumption A.
Theorem 3.5
Let Assumption A hold and let \(\{x_{k}\}\) be generated by Algorithm 1. Then the algorithm either stops finitely or generates an infinite sequence \(\{x_{k}\}\) such that
$$ \lim_{k\rightarrow\infty} \Vert F_{k}\Vert =0. $$ (17)
Proof
Assume that Algorithm 1 does not stop after finitely many steps. Suppose that (17) does not hold. Then there exist a constant \(\varepsilon> 0\) and a subsequence \(\{k_{j}\}\) satisfying
$$ \Vert F_{k_{j}}\Vert \geq\varepsilon. $$ (18)
Let \(K=\{k\vert \Vert F_{k}\Vert \ge\varepsilon\}\).
Let \(S_{0} =\{k \vert r_{k} \geq\eta_{2} \}\). Using Algorithm 1 and Lemma 3.2, we have
By Lemma 3.4, we know that \(\{f(x_{k})\}\) is convergent, then
Thus, we have
From Steps 3–4 of Algorithm 1 it follows that
for all \(k\notin S_{0}\), thus (19) means
Therefore there exists \(x^{*}\) such that
By (21), we have \(\Delta_{k}\rightarrow0\), which implies
for all sufficiently large k. The fact that \(\vert \mathit{Ared}_{k}(d_{k}) - \mathit{Pred}_{k}(d_{k})\vert = O(\Vert d_{k}\Vert ^{2})\) indicates that
which shows that, for sufficiently large k and \(k\in K\),
The above inequality contradicts (20). Thus, the conclusion follows. □
4 Numerical experiments
In this section, the recent spectral method in [1] is called Algorithm 2. We report numerical results for Algorithms 1 and 2. We choose 14 test functions as follows (see [4, 6, 17]).
Function 1
The trigonometric function
Initial guess: \(x_{0}=(\frac{1}{n},\ldots,\frac{1}{n})^{T}\).
Function 2
The discretized two-point boundary value problem
where A is the \(n\times n\) tridiagonal matrix given by
and \(\Phi=(\Phi_{1}(x),\Phi_{2}(x),\ldots,\Phi_{n}(x))^{T}\) with \(\Phi _{i}(x)=\sin x_{i}-1\), \(i=1,2,\ldots,n\).
Initial guess: \(x_{0}=(50,0,\ldots,50,0)\).
Function 3
The Broyden tridiagonal function
Initial guess: \(x_{0}=(-1,\ldots,-1)^{T}\).
Function 4
The Broyden banded function
Initial guess: \(x_{0}=(-1,\ldots,-1)^{T}\).
Function 5
The variable dimensioned function
Initial guess: \(x_{0}=(1-\frac{1}{n},1-\frac{2}{n},\ldots,0)^{T}\).
Function 6
The discrete boundary value function
Initial guess: \(x_{0}=(h(h-1),h(2h-1),\ldots,h(nh-1))^{T}\).
Function 7
The logarithmic function
Initial guess: \(x_{0}=(1,1,\ldots,1)^{T}\).
Function 8
The strictly convex function
Initial guess: \(x_{0}=(\frac{1}{n},\frac{2}{n},\ldots,1)^{T}\).
Function 9
The exponential function
Initial guess: \(x_{0}=(\frac{n}{n-1},\frac{n}{n-1},\ldots,\frac{n}{n-1})^{T}\).
Function 10
The extended Rosenbrock function (n is even). For \(i=1,2,\ldots,n/2\),
Initial guess: \(x_{0}=(-1.2,1,\ldots,-1.2,1)^{T}\).
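For readers who want to experiment, the extended Rosenbrock system can be coded as below. The sketch assumes the standard form from the Moré-Garbow-Hillstrom test collection, \(F_{2i-1}(x)=10(x_{2i}-x_{2i-1}^{2})\) and \(F_{2i}(x)=1-x_{2i-1}\), together with the standard starting point \((-1.2,1,\ldots,-1.2,1)^{T}\); the scaling used in the paper's experiments may differ, and the function names are hypothetical.

```python
def extended_rosenbrock(x):
    """Evaluate the extended Rosenbrock system (standard form, assumed here)."""
    if len(x) % 2 != 0:
        raise ValueError("the dimension n must be even")
    F = []
    for i in range(0, len(x), 2):
        F.append(10.0 * (x[i + 1] - x[i] ** 2))   # odd-numbered component
        F.append(1.0 - x[i])                      # even-numbered component
    return F


def rosenbrock_start(n):
    """Return the standard starting point (-1.2, 1, ..., -1.2, 1)."""
    return [-1.2, 1.0] * (n // 2)
```

The system has the root \(x=(1,\ldots,1)^{T}\), which makes it easy to check a solver's output.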
Function 11
The singular function
Initial guess: \(x_{0}=(1,1,\ldots,1)^{T}\).
Function 12
The trigexp function
Initial guess: \(x_{0}=(0,0,\ldots,0)^{T}\).
Function 13
The extended Freudenstein and Roth function (n is even). For \(i=1,2,\ldots,n/2\),
Initial guess: \(x_{0}=(6,3,\ldots,6,3)^{T}\).
Function 14
The Troesch problem
Initial guess: \(x_{0}=(0,0,\ldots,0)^{T}\).
In the experiments, the parameters are chosen as \(\Delta_{0}=1\), \(\bar {\Delta}=10\), \(\epsilon=10^{-5}\), \(\eta_{1}=0.001\), \(\eta_{2}=0.75\), \(\beta_{1} =0.5\), \(\beta_{2} =2.0\), \(M=10\), \(\eta_{k}=1/(k+1)^{2}\), \(\alpha=0.5\), where \(\epsilon\) is the stopping tolerance. The program is also stopped if the number of iterations exceeds 5,000. We obtain \(d_{k}\) by solving (5) with the dogleg method in [18]. The program is coded in MATLAB 2009a.
To compare the performance of the two algorithms, we use the performance profile proposed by Dolan and Moré [19]. The dimensions of the 14 test functions are 100, 1,000, and 10,000. According to the numerical results, we plot performance profiles based on the total number of iterations and the CPU time, respectively.
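A performance profile in the sense of Dolan and Moré can be computed as sketched below: for each problem, the cost of each solver is divided by the best cost on that problem, and the profile of a solver at \(\tau\) is the fraction of problems whose ratio is at most \(\tau\). The function name and data layout are hypothetical, costs are assumed to be positive, and the numbers in the example are made up for illustration; they are not the paper's results.

```python
import math


def performance_profile(costs, taus):
    """Return {solver: [rho(tau) for tau in taus]}.

    costs maps a solver name to a list of positive costs, one per problem
    (iterations or CPU time); math.inf marks a failed run.
    """
    solvers = list(costs)
    n_prob = len(costs[solvers[0]])
    best = [min(costs[s][p] for s in solvers) for p in range(n_prob)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_prob)]
        profile[s] = [sum(r <= tau for r in ratios) / n_prob for tau in taus]
    return profile


# Illustrative data only: four problems, two solvers, one failure.
example = {"Algorithm 1": [10.0, 20.0, 30.0, math.inf],
           "Algorithm 2": [20.0, 10.0, 30.0, 40.0]}
print(performance_profile(example, [1.0, 2.0]))
# {'Algorithm 1': [0.5, 0.75], 'Algorithm 2': [0.75, 1.0]}
```

The value at \(\tau=1\) is the fraction of problems on which a solver is the fastest (ties included), and the limit for large \(\tau\) is the fraction of problems it solves.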
Figure 1 shows that our algorithm is slightly better than Algorithm 2 in terms of the total number of iterations for \(n=100\). Figures 2 and 3 indicate that the two algorithms do not differ much in the total number of iterations for \(n=1{,}000\) and \(n=10{,}000\). From Figures 4–6, it is easy to see that our algorithm performs better than Algorithm 2 in CPU time on the 14 test problems. These preliminary numerical results show that the performance of our algorithm is notable.
References
1. La Cruz, W, Martínez, JM, Raydan, M: Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Math. Comput. 75(255), 1429–1448 (2006)
2. Bouaricha, A, Schnabel, RB: Tensor methods for large sparse systems of nonlinear equations. Math. Program. 82(3), 377–400 (1998)
3. Bergamaschi, L, Moret, I, Zilli, G: Inexact quasi-Newton methods for sparse systems of nonlinear equations. Future Gener. Comput. Syst. 18(1), 41–53 (2001)
4. La Cruz, W, Raydan, M: Nonmonotone spectral methods for large-scale nonlinear systems. Optim. Methods Softw. 18(5), 583–599 (2003)
5. Li, Q, Li, D: A class of derivative-free methods for large-scale nonlinear monotone equations. IMA J. Numer. Anal. 31, 1625–1635 (2011)
6. Yuan, GL, Zhang, MJ: A three-terms Polak-Ribière-Polyak conjugate gradient algorithm for large-scale nonlinear equations. J. Comput. Appl. Math. 286, 186–195 (2015)
7. Yuan, GL, Meng, ZH, Li, Y: A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 168, 129–152 (2016)
8. Barzilai, J, Borwein, JM: Two-point step size gradient methods. IMA J. Numer. Anal. 8(1), 141–148 (1988)
9. Birgin, EG, Martinez, JM, Raydan, M: Inexact spectral projected gradient methods on convex sets. IMA J. Numer. Anal. 23(4), 539–559 (2003)
10. Dai, YH, Zhang, H: Adaptive two-point stepsize gradient algorithm. Numer. Algorithms 27(4), 377–385 (2001)
11. Dai, YH: Modified two-point stepsize gradient methods for unconstrained optimization. Comput. Optim. Appl. 22(1), 103–109 (2002)
12. Raydan, M: The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7(1), 26–33 (1997)
13. Yuan, GL, Lu, XW, Wei, ZX: BFGS trust-region method for symmetric nonlinear equations. J. Comput. Appl. Math. 230, 44–58 (2009)
14. Yuan, YX: Trust region algorithm for nonlinear equations. Information 1, 7–21 (1998)
15. Zhang, JL, Wang, Y: A new trust region method for nonlinear equations. Math. Methods Oper. Res. 58(2), 283–298 (2003)
16. Yuan, GL, Wei, ZX, Liu, XW: A BFGS trust-region method for nonlinear equations. Computing 92(4), 317–333 (2011)
17. Moré, JJ, Garbow, BS, Hillstrom, KE: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)
18. Wang, YJ, Xiu, NH: Theory and Algorithm for Nonlinear Programming. Shanxi Science and Technology Press, Xi'an (2004)
19. Dolan, ED, Moré, JJ: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Acknowledgements
We thank the reviewers and the editors for their valuable suggestions and comments which improve this paper greatly. This work is supported by the Science and Technology Foundation of the Department of Education of Hubei Province (D20152701) and the Foundations of Education Department of Anhui Province (KJ2016A651; 2014jyxm161).
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
The two authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Zeng, M., Zhou, G. A trust region spectral method for large-scale systems of nonlinear equations. J Inequal Appl 2016, 174 (2016). https://doi.org/10.1186/s13660-016-1117-x
DOI: https://doi.org/10.1186/s13660-016-1117-x