A trust region spectral method for large-scale systems of nonlinear equations
Abstract
The spectral gradient method is one of the most effective methods for solving large-scale systems of nonlinear equations. In this paper, we propose a new trust region spectral method without gradient information, in which the trust region technique serves as the globalization strategy. The global convergence of the proposed algorithm is proved. Numerical results show that the new method is more competitive than the spectral method of La Cruz et al. (Math. Comput. 75(255):1429-1448, 2006) for large-scale nonlinear equations.
1 Introduction
In this paper we introduce a trust region spectral method for solving large-scale systems of nonlinear equations
$$ F(x)=0, $$(1)
where \(F: R^{n}\rightarrow R^{n}\) is continuously differentiable, its Jacobian matrix \(J(x)\in R^{n\times n}\) is sparse, and n is large. Large-scale systems of nonlinear equations arise widely in applications such as network-flow problems and discrete boundary value problems.
Many algorithms have been presented for solving the large-scale problem (1). Bouaricha et al. [2] proposed tensor methods. Bergamaschi et al. [3] proposed inexact quasi-Newton methods. These methods need to calculate the Jacobian matrix, or an approximation of it, at each iteration. La Cruz and Raydan [4] introduced the spectral method for (1). The method uses the residual \(\pm F(x_{k})\) as a search direction, and the trial point at each iteration is \(x_{k}-\lambda_{k}F(x_{k})\), where \(\lambda_{k}\) is a spectral coefficient satisfying the Grippo-Lampariello-Lucidi (GLL) line search condition
$$ f(x_{k}+\lambda_{k}d_{k})\leq\max_{0\leq j\leq\min\{k,M\}}f(x_{k-j})+\alpha\lambda_{k}\nabla f(x_{k})^{T}d_{k}, $$(2)
where \(f(x)= \frac{1}{2}\Vert F(x)\Vert ^{2}\), M is a nonnegative integer, α is a small positive number, and \(d_{k}=\pm F(x_{k})\). This method still requires one to compute a directional derivative, or a very good approximation of it, at every iteration. Later La Cruz et al. [1] proposed a spectral method without gradient information, which uses the nonmonotone line search globalization strategy
$$ f(x_{k}+\lambda_{k}d_{k})\leq\max_{0\leq j\leq\min\{k,M\}}f(x_{k-j})+\eta_{k}-\alpha\lambda_{k}^{2}f(x_{k}), $$(3)
where \(\sum_{k} \eta_{k} \leq\eta<\infty\). Meanwhile, conjugate gradient techniques have been developed for solving large-scale nonlinear equations (see [5–7]). In fact, spectral gradient, BFGS quasi-Newton, and conjugate gradient methods can all solve large-scale optimization problems and systems of nonlinear equations (see [8–13]). The advantage of spectral-type methods is that they avoid storing the matrices associated with the Hessian of the objective function.
The purpose of this paper is to extend the spectral method for solving large-scale systems of nonlinear equations by using the trust region technique. For the traditional trust region methods [14], at each iterative point \(x_{k}\), the trial step \(d_{k}\) is obtained by solving the following trust region subproblem:
$$ \min_{d\in R^{n}} q_{k}(d) \quad\text{s.t. } \Vert d\Vert \leq\Delta_{k}, $$(4)
where \(q_{k}(d)=\frac{1}{2}\Vert F(x_{k})+J(x_{k})d\Vert ^{2}\).
The above trust region methods are particularly effective for small to medium-sized systems of nonlinear equations; however, their computational cost and storage requirements grow rapidly with the dimension.
For the large-scale problems of nonlinear equations, we use \(\gamma_{k} I\) as an approximation of \(J(x_{k})\). At each iterative point \(x_{k}\) in our method, the trial step \(d_{k}\) is obtained by solving the following subproblem:
$$ \min_{d\in R^{n}} \frac{1}{2}\Vert F_{k}+\gamma_{k}d\Vert ^{2} \quad\text{s.t. } \Vert d\Vert \leq\Delta_{k}, $$(5)
where \(\gamma_{k}\) is the spectral coefficient and \(F_{k}=F(x_{k})\). The classic quasi-Newton equation is
$$ B_{k+1}d_{k}=y_{k}. $$(6)
Left-multiplying (6) by \(y_{k}^{T}\) and setting \(B_{k+1}=\gamma_{k+1}I\) gives \(y_{k}^{T}y_{k}=\gamma_{k+1}y_{k}^{T}d_{k}\), that is,
$$ \gamma_{k+1}=\frac{y_{k}^{T}y_{k}}{y_{k}^{T}d_{k}}, $$(7)
where \(d_{k}=x_{k+1}-x_{k}\) and \(y_{k}=F_{k+1}-F_{k}\).
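Note that (5) is an isotropic quadratic model (its Hessian is \(\gamma_{k}^{2}I\)), so its solution is available in closed form: the unconstrained minimizer \(-F_{k}/\gamma_{k}\), clipped back to the trust region when it is too long. The following Python sketch of this step and of the update (7) is ours (the paper's experiments use MATLAB, and the helper names are hypothetical):

```python
import numpy as np

def spectral_tr_step(Fk, gamma, delta):
    """Minimizer of q(d) = 0.5*||Fk + gamma*d||^2 over ||d|| <= delta.
    The unconstrained minimizer is -Fk/gamma; since the model Hessian
    is the scalar matrix gamma^2*I, clipping it to the ball solves (5)."""
    d = -Fk / gamma
    nd = np.linalg.norm(d)
    return d if nd <= delta else d * (delta / nd)

def spectral_coeff(d, y):
    """Spectral coefficient update, eq. (7): gamma = (y'y)/(y'd),
    with d = x_{k+1} - x_k and y = F_{k+1} - F_k (no safeguard shown)."""
    return (y @ y) / (y @ d)
```

Clipping is valid here because the unconstrained minimizer is collinear with the steepest descent direction of the model, so no genuine dogleg path is needed.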
The paper is organized as follows. Section 2 introduces the new algorithm. The convergence theory is presented in Section 3. Section 4 demonstrates preliminary numerical results on test problems.
2 New algorithm
In this section, we give a trust region spectral method for solving large-scale systems of nonlinear equations. Let \(d_{k}\) be the solution of the trust region subproblem (5). We define the actual reduction as
$$ \mathit{Ared}_{k}(d_{k})=f(x_{k})-f(x_{k}+d_{k}) $$(8)
and the predicted reduction as
$$ \mathit{Pred}_{k}(d_{k})=\frac{1}{2}\Vert F_{k}\Vert ^{2}-\frac{1}{2}\Vert F_{k}+\gamma_{k}d_{k}\Vert ^{2}. $$(9)
Now we present our algorithm for solving (1).
Algorithm 1
- Step 0. Choose \(0<\eta_{1} <\eta_{2} < 1\), \(0<\beta_{1}<1 <\beta_{2}\), \(\epsilon>0\). Initialize \(x_{0}\), \(0<\Delta_{0} <\bar{\Delta}\). Set \(k:=0\).
- Step 1. Evaluate \(F_{k}\); if \(\Vert F_{k}\Vert \leq\epsilon\), terminate.
- Step 2. Solve the trust region subproblem (5) to obtain \(d_{k}\).
- Step 3. Compute
$$ r_{k} = \frac{\mathit{Ared}_{k}(d_{k})}{\mathit{Pred}_{k}(d_{k})}. $$(10)
If \(r_{k} < \eta_{1}\), set \(\Delta_{k}:= \beta_{1} \Delta_{k}\) and go to Step 2; otherwise, go to Step 4.
- Step 4. Set \(x_{k+1}=x_{k} + d_{k}\) and
$$\Delta_{k + 1} = \begin{cases} \min\{\beta_{2} \Delta_{k}, \bar{\Delta}\}, & \text{if } r_{k} \geq\eta_{2},\\ \Delta_{k}, & \text{otherwise}. \end{cases} $$
Compute \(\gamma_{k+1}\) by (7). Set \(k:=k+1\) and go to Step 1.
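To make the iteration concrete, here is a minimal Python sketch of Algorithm 1. It is ours, not the authors' MATLAB code: the initial coefficient \(\gamma_{0}=1\) and the safeguard on \(y_{k}^{T}d_{k}\) are assumptions, and the subproblem solve uses the closed form discussed after (7). The inner `while` loop realizes the Step 3 to Step 2 cycle.

```python
import numpy as np

def trust_region_spectral(F, x0, delta0=1.0, delta_max=10.0, eps=1e-5,
                          eta1=0.001, eta2=0.75, beta1=0.5, beta2=2.0,
                          max_iter=5000):
    """Sketch of Algorithm 1 for F(x) = 0 (names and defaults are ours)."""
    x = np.asarray(x0, dtype=float)
    Fx = F(x)
    gamma, delta = 1.0, delta0            # gamma_0 = 1 is an assumption
    for k in range(max_iter):
        if np.linalg.norm(Fx) <= eps:     # Step 1: stopping test
            return x, k
        while True:
            # Step 2: minimize 0.5*||F_k + gamma*d||^2 over ||d|| <= delta;
            # for this scalar-matrix model this is the clipped spectral step.
            d = -Fx / gamma
            nd = np.linalg.norm(d)
            if nd > delta:
                d *= delta / nd
            # Step 3: ratio of actual to predicted reduction, eq. (10)
            F_trial = F(x + d)
            ared = 0.5 * (Fx @ Fx - F_trial @ F_trial)               # eq. (8)
            pred = 0.5 * (Fx @ Fx - np.sum((Fx + gamma * d) ** 2))   # eq. (9)
            r = ared / pred
            if r >= eta1:
                break
            delta *= beta1                # shrink the radius and retry
        # Step 4: accept the step and update the radius
        if r >= eta2:
            delta = min(beta2 * delta, delta_max)
        y = F_trial - Fx
        if abs(y @ d) > 1e-30:            # assumed safeguard for eq. (7)
            gamma = (y @ y) / (y @ d)
        x, Fx = x + d, F_trial
    return x, max_iter
```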
3 Convergence analysis
In this section, we prove the global convergence of Algorithm 1 under the following assumptions.
Assumption A
- (1) The level set \(\Omega=\{x\in R^{n} \vert f(x)\leq f(x_{0}) \}\) is bounded.
- (2) The following relation holds:
$$\bigl\Vert [J_{k}-\gamma_{k}I]^{T} F_{k}\bigr\Vert =O\bigl(\Vert d_{k}\Vert \bigr), $$
where \(J_{k}=J(x_{k})\).
Then we get the following lemmas.
Lemma 3.1
\(\vert \mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k})\vert =O(\Vert d_{k}\Vert ^{2})\).
Proof
Since \(q_{k}(0)=\frac{1}{2}\Vert F_{k}\Vert ^{2}=f(x_{k})\), we have \(\mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k})=q_{k}(d_{k})-f(x_{k}+d_{k})\). By Taylor expansion, \(F(x_{k}+d_{k})=F_{k}+J_{k}d_{k}+O(\Vert d_{k}\Vert ^{2})\), so
$$ f(x_{k}+d_{k})=\frac{1}{2}\Vert F_{k}+J_{k}d_{k}\Vert ^{2}+O\bigl(\Vert d_{k}\Vert ^{2}\bigr). $$
Hence
$$ \bigl\vert \mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k})\bigr\vert \leq\bigl\vert F_{k}^{T}(\gamma_{k}I-J_{k})d_{k}\bigr\vert +O\bigl(\Vert d_{k}\Vert ^{2}\bigr)\leq\bigl\Vert [J_{k}-\gamma_{k}I]^{T}F_{k}\bigr\Vert \Vert d_{k}\Vert +O\bigl(\Vert d_{k}\Vert ^{2}\bigr)=O\bigl(\Vert d_{k}\Vert ^{2}\bigr), $$
where the last equality follows from Assumption A(2). This completes the proof. □
Following Zhang and Wang [15] and Yuan et al. [16], we obtain the following result.
Lemma 3.2
If \(d_{k}\) is a solution of (5), then
$$ \mathit{Pred}_{k}(d_{k})\geq\frac{1}{2}\vert \gamma_{k}\vert \Vert F_{k}\Vert \min \biggl\{ \Delta_{k},\frac{\Vert F_{k}\Vert }{\vert \gamma_{k}\vert } \biggr\} . $$
Proof
Since \(d_{k}\) is a solution of (5), for any \(\alpha\in[0,1]\), it follows that
$$ q_{k}(d_{k})\leq q_{k} \biggl( -\alpha\Delta_{k}\frac{\operatorname{sign}(\gamma_{k})F_{k}}{\Vert F_{k}\Vert } \biggr) =\frac{1}{2}\Vert F_{k}\Vert ^{2}-\alpha\Delta_{k}\vert \gamma_{k}\vert \Vert F_{k}\Vert +\frac{1}{2}\alpha^{2}\Delta_{k}^{2}\gamma_{k}^{2}. $$
Then we have
$$ \mathit{Pred}_{k}(d_{k})\geq\max_{\alpha\in[0,1]} \biggl\{ \alpha\Delta_{k}\vert \gamma_{k}\vert \Vert F_{k}\Vert -\frac{1}{2}\alpha^{2}\Delta_{k}^{2}\gamma_{k}^{2} \biggr\} \geq\frac{1}{2}\vert \gamma_{k}\vert \Vert F_{k}\Vert \min \biggl\{ \Delta_{k},\frac{\Vert F_{k}\Vert }{\vert \gamma_{k}\vert } \biggr\} . $$
The proof is complete. □
Lemma 3.3
Algorithm 1 does not cycle between Step 2 and Step 3 infinitely often.
Proof
Suppose Algorithm 1 cycles between Step 2 and Step 3 infinitely at the iterate \(x_{k}\). Then the iterate does not move, \(\Vert F_{k}\Vert > \epsilon\), the test \(r_{k} < \eta_{1}\) holds at every cycle, and hence \(\Delta_{k}\to0\).
By Lemmas 3.1 and 3.2, we have
$$ \vert r_{k}-1\vert =\frac{\vert \mathit{Ared}_{k}(d_{k})-\mathit{Pred}_{k}(d_{k})\vert }{\mathit{Pred}_{k}(d_{k})}\leq\frac{O(\Vert d_{k}\Vert ^{2})}{\frac{1}{2}\vert \gamma_{k}\vert \Vert F_{k}\Vert \min \{\Delta_{k},\Vert F_{k}\Vert /\vert \gamma_{k}\vert \}}\rightarrow0 \quad\text{as } \Delta_{k}\rightarrow0. $$
Therefore, for \(\Delta_{k}\) sufficiently small,
$$ r_{k}\geq\eta_{1}, $$
this contradicts the fact that \(r_{k}<\eta_{1}\). □
Lemma 3.4
Let Assumption A hold and let \(\{x_{k}\}\) be generated by Algorithm 1. Then \(\{x_{k}\}\subset\Omega\). Moreover, \(\{f(x_{k})\}\) converges.
Proof
By the definition of Algorithm 1, every accepted step satisfies \(r_{k}\geq\eta_{1}\), so
$$ f(x_{k})-f(x_{k+1})=\mathit{Ared}_{k}(d_{k})\geq\eta_{1}\mathit{Pred}_{k}(d_{k})\geq0. $$
This implies
$$ f(x_{k+1})\leq f(x_{k})\leq\cdots\leq f(x_{0}). $$
Therefore, \(\{x_{k}\}\subset\Omega\). Since \(\{f(x_{k})\}\) is nonincreasing and bounded below by 0, \(\{f(x_{k})\}\) converges. □
The following theorem shows that Algorithm 1 is globally convergent under Assumption A.
Theorem 3.5
Let Assumption A hold and let \(\{x_{k}\}\) be generated by Algorithm 1. Then the algorithm either stops finitely or generates an infinite sequence \(\{x_{k}\}\) such that
$$ \liminf_{k\rightarrow\infty}\Vert F_{k}\Vert =0. $$(17)
Proof
Assume that Algorithm 1 does not stop after finitely many steps. Suppose that (17) does not hold; then there exist a constant \(\varepsilon> 0\) and a subsequence \(\{k_{j}\}\) satisfying
$$ \Vert F_{k_{j}}\Vert \geq\varepsilon\quad\text{for all } j. $$
Let \(K=\{k\vert \Vert F_{k}\Vert \ge\varepsilon\}\).
Let \(S_{0} =\{k \vert r_{k} \geq\eta_{2} \}\). Using Algorithm 1 and Lemma 3.2, we have
By Lemma 3.4, we know that \(\{f(x_{k})\}\) is convergent, then
Thus, we have
From Steps 3-4 of Algorithm 1 it follows that
for all \(k\notin S_{0}\), thus (19) means
Therefore there exists \(x^{*}\) such that
By (21), we have \(\Delta_{k}\rightarrow0\), which implies
for all sufficiently large k. The fact that \(\vert \mathit{Ared}_{k}(d_{k}) - \mathit{Pred}_{k}(d_{k})\vert = O(\Vert d_{k}\Vert ^{2})\) indicates that
which shows that, for sufficiently large k and \(k\in K\),
The above inequality contradicts (20). Thus, the conclusion follows. □
4 Numerical experiments
In this section, the spectral method of [1] is referred to as Algorithm 2. We report the results of numerical experiments with Algorithms 1 and 2 on the following 14 test functions (see [4, 6, 17]).
Function 1
The trigonometric function
Initial guess: \(x_{0}=-(\frac{1}{n},\ldots,\frac{1}{n})^{T}\).
Function 2
The discretized two-point boundary value problem
where A is the \(n\times n\) tridiagonal matrix given by
and \(\Phi=(\Phi_{1}(x),\Phi_{2}(x),\ldots,\Phi_{n}(x))^{T}\) with \(\Phi _{i}(x)=\sin x_{i}-1\), \(i=1,2,\ldots,n\).
Initial guess: \(x_{0}=(50,0,\ldots,50,0)^{T}\).
Function 3
The Broyden tridiagonal function
Initial guess: \(x_{0}=(-1,\ldots,-1)^{T}\).
Function 4
The Broyden banded function
Initial guess: \(x_{0}=(-1,\ldots,-1)^{T}\).
Function 5
The variable dimensioned function
Initial guess: \(x_{0}=(1-\frac{1}{n},1-\frac{2}{n},\ldots,0)^{T}\).
Function 6
The discrete boundary value function
Initial guess: \(x_{0}=(h(h-1),h(2h-1),\ldots,h(nh-1))^{T}\).
Function 7
The logarithmic function
Initial guess: \(x_{0}=(1,1,\ldots,1)^{T}\).
Function 8
The strictly convex function
Initial guess: \(x_{0}=(\frac{1}{n},\frac{2}{n},\ldots,1)^{T}\).
Function 9
The exponential function
Initial guess: \(x_{0}=(\frac{n}{n-1},\frac{n}{n-1},\ldots,\frac {n}{n-1})^{T}\).
Function 10
The extended Rosenbrock function (n is even). For \(i=1,2,\ldots,n/2\),
Initial guess: \(x_{0}=(-1.2,1,\ldots,-1.2,1)^{T}\).
Function 11
The singular function
Initial guess: \(x_{0}=(1,1,\ldots,1)^{T}\).
Function 12
The trigexp function
Initial guess: \(x_{0}=(0,0,\ldots,0)^{T}\).
Function 13
The extended Freudenstein and Roth function (n is even). For \(i=1,2,\ldots,n/2\),
Initial guess: \(x_{0}=(6,3,\ldots,6,3)^{T}\).
Function 14
The Troesch problem
Initial guess: \(x_{0}=(0,0,\ldots,0)^{T}\).
In the experiments, the parameters are chosen as \(\Delta_{0}=1\), \(\bar{\Delta}=10\), \(\epsilon=10^{-5}\), \(\eta_{1}=0.001\), \(\eta_{2}=0.75\), \(\beta_{1} =0.5\), \(\beta_{2} =2.0\), \(M=10\), \(\eta_{k}=1/(k+1)^{2}\), \(\alpha=0.5\), where ϵ is the stopping tolerance. The program is also stopped if the number of iterations exceeds 5,000. We solve (5) for \(d_{k}\) by the dogleg method in [18]. The programs are coded in MATLAB 2009a.
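As an illustration, the Python sketch of Algorithm 1 given in Section 2 can be run with these parameters on Function 3. The Broyden tridiagonal formula below is the standard form from [17], and the driver itself is ours, not the authors' test code:

```python
import numpy as np

def broyden_tridiagonal(x):
    # Standard form from [17]: F_i(x) = (3 - 2x_i)x_i - x_{i-1} - 2x_{i+1} + 1,
    # with the convention x_0 = x_{n+1} = 0.
    F = (3.0 - 2.0 * x) * x + 1.0
    F[1:] -= x[:-1]        # subtract x_{i-1}
    F[:-1] -= 2.0 * x[1:]  # subtract 2*x_{i+1}
    return F

n = 1000
x0 = -np.ones(n)           # initial guess of Function 3
x, iters = trust_region_spectral(broyden_tridiagonal, x0, delta0=1.0,
                                 delta_max=10.0, eps=1e-5, eta1=0.001,
                                 eta2=0.75, beta1=0.5, beta2=2.0)
print(iters, np.linalg.norm(broyden_tridiagonal(x)))
```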
To compare the performance of the two algorithms, we use the performance profile proposed by Dolan and Moré [19]. The dimensions of the 14 test functions are 100, 1,000, and 10,000. Based on the numerical results, we plot figures for the total number of iterations and for the CPU time, respectively.
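For reference, a performance profile in the sense of Dolan and Moré [19] can be computed as in the sketch below (ours): the input is a cost table with one row per problem and one column per solver, with `np.inf` marking failures, and the assumption that every problem is solved by at least one solver.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(T, labels, tau_max=10.0):
    """T[p, s]: cost (iterations or CPU time) of solver s on problem p.
    Plots rho_s(tau), the fraction of problems on which solver s is
    within a factor tau of the best solver, as in Dolan and More [19]."""
    ratios = T / T.min(axis=1, keepdims=True)  # r_{p,s} = t_{p,s} / min_s t_{p,s}
    taus = np.linspace(1.0, tau_max, 200)
    for s, label in enumerate(labels):
        plt.plot(taus, [(ratios[:, s] <= t).mean() for t in taus], label=label)
    plt.xlabel(r'$\tau$')
    plt.ylabel(r'$\rho_s(\tau)$')
    plt.legend()
    plt.show()
```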
Figure 1 shows that our algorithm is slightly better than Algorithm 2 in the total number of iterations for \(n=100\). Figures 2 and 3 indicate that the two algorithms show no large discrepancy in the total number of iterations for \(n=1{,}000\) and \(n=10{,}000\). From Figures 4-6, it is easy to see that our algorithm outperforms Algorithm 2 in CPU time on the 14 test problems. These preliminary numerical results show that the performance of our algorithm is promising.
References
La Cruz, W, Martínez, JM, Raydan, M: Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Math. Comput. 75(255), 1429-1448 (2006)
Bouaricha, A, Schnabel, RB: Tensor methods for large sparse systems of nonlinear equations. Math. Program. 82(3), 377-400 (1998)
Bergamaschi, L, Moret, I, Zilli, G: Inexact quasi-Newton methods for sparse systems of nonlinear equations. Future Gener. Comput. Syst. 18(1), 41-53 (2001)
La Cruz, W, Raydan, M: Nonmonotone spectral methods for large-scale nonlinear systems. Optim. Methods Softw. 18(5), 583-599 (2003)
Li, Q, Li, D: A class of derivative-free methods for large-scale nonlinear monotone equations. IMA J. Numer. Anal. 31, 1625-1635 (2011)
Yuan, GL, Zhang, MJ: A three-term Polak-Ribière-Polyak conjugate gradient algorithm for large-scale nonlinear equations. J. Comput. Appl. Math. 286, 186-195 (2015)
Yuan, GL, Meng, ZH, Li, Y: A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations. J. Optim. Theory Appl. 168, 129-152 (2016)
Barzilai, J, Borwein, JM: Two-point step size gradient methods. IMA J. Numer. Anal. 8(1), 141-148 (1988)
Birgin, EG, Martinez, JM, Raydan, M: Inexact spectral projected gradient methods on convex sets. IMA J. Numer. Anal. 23(4), 539-559 (2003)
Dai, YH, Zhang, H: Adaptive two-point stepsize gradient algorithm. Numer. Algorithms 27(4), 377-385 (2001)
Dai, YH: Modified two-point stepsize gradient methods for unconstrained optimization. Comput. Optim. Appl. 22(1), 103-109 (2002)
Raydan, M: The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7(1), 26-33 (1997)
Yuan, GL, Lu, XW, Wei, ZX: BFGS trust-region method for symmetric nonlinear equations. J. Comput. Appl. Math. 230, 44-58 (2009)
Yuan, YX: Trust region algorithm for nonlinear equations. Information 1, 7-21 (1998)
Zhang, JL, Wang, Y: A new trust region method for nonlinear equations. Math. Methods Oper. Res. 58(2), 283-298 (2003)
Yuan, GL, Wei, ZX, Liu, XW: A BFGS trust-region method for nonlinear equations. Computing 92(4), 317-333 (2011)
Moré, JJ, Garbow, BS, Hillstrom, KE: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17-41 (1981)
Wang, YJ, Xiu, NH: Theory and Algorithm for Nonlinear Programming. Shanxi Science and Technology Press, Xi’an (2004)
Dolan, ED, Moré, JJ: Benchmarking optimization software with performance profiles. Math. Program. 91, 201-213 (2002)
Acknowledgements
We thank the reviewers and the editors for their valuable suggestions and comments which improve this paper greatly. This work is supported by the Science and Technology Foundation of the Department of Education of Hubei Province (D20152701) and the Foundations of Education Department of Anhui Province (KJ2016A651; 2014jyxm161).
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The two authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Cite this article
Zeng, M., Zhou, G. A trust region spectral method for large-scale systems of nonlinear equations. J Inequal Appl 2016, 174 (2016). https://doi.org/10.1186/s13660-016-1117-x