 Research
 Open access
 Published:
A nonmonotone hybrid conjugate gradient method for unconstrained optimization
Journal of Inequalities and Applications volumeÂ 2015, ArticleÂ number:Â 124 (2015)
Abstract
A nonmonotone hybrid conjugate gradient method is proposed, in which the technique of the nonmonotone Wolfe line search is used. Under mild assumptions, we prove the global convergence and linear convergence rate of the method. Numerical experiments are reported.
1 Introduction
Let us take the following unconstrained optimization problem:
where \(f: {R}^{n}\rightarrow {R}\) is continuously differentiable. For solving (1), the conjugate gradient method generates a sequence \(\{x_{k}\} \): \(x_{k+1}=x_{k}+\alpha_{k}d_{k}\), \(d_{0}= g_{0}\), and \(d_{k}=g_{k}+\beta_{k}d_{k1}\), where the stepsize \(\alpha_{k}>0\) is obtained by the line search, \(d_{k}\) is the search direction, \(g_{k}=\nabla f{(x_{k})}\) is the gradient of \(f(x)\) at the point \(x_{k}\), and \(\beta_{k}\) is known as the conjugate gradient parameter. Different parameters correspond to different conjugate gradient methods. A remarkable survey of conjugate gradient methods is given by Hager and Zhang [1].
Plenty of hybrid conjugate gradient methods were presented in [2â€“7] after the first hybrid conjugate algorithm was proposed by TouatiAhmed and Storey [8]. In [5], Lu et al. proposed a new hybrid conjugate gradient method (LY) with the conjugate gradient parameter \(\beta_{k}^{LY}\),
where \(0<\mu\leq\frac{\lambda\sigma}{1\sigma}\), \(\sigma<\lambda\leq1\). Numerical experiments show that the LY method is effective.
It is well known that the nonmonotone algorithms are promising methods for solving highly nonlinear largescale and possibly illconditioned problems. The first nonmonotone line search framework was proposed by Grippo et al. in [9] for Newtonâ€™s methods. At each iteration, the current function value is defined as follows:
where \(m(0)=0\), \(0\leq m(k)\leq\min{\{m(k1)+1,M\}}\), M is some positive integer. Zhang and Hager [10] proposed another nonmonotone line search technique, they adopted \(C_{k}\) to replace the current function \(f_{k}\), where
\(Q_{0}=1\), \(C_{0}=f(x_{0})\), \(\zeta_{k1}\in[0,1]\), and
To obtain the global convergence (see [4, 11â€“14]) and implement the algorithms, the line search in the conjugate gradient is usually chosen by a Wolfe line search; the stepsize \(\alpha_{k}\) satisfies the following two inequalities:
where \(0<\rho<\sigma<1\). In particular, a nonmonotone version line search can relax the choice of the stepsize. Therefore the nonmonotone Wolfe line search requires the stepsize \(\alpha_{k}\) to satisfy
and (7), or
and (7).
The aim of this paper is to propose a nonmonotone hybrid conjugate gradient method which combines the nonmonotone line search technique with the LY method. It is based on the idea that the larger values of the stepsize \(\alpha_{k}\) may be accepted by the nonmonotone algorithmic framework and improve the behavior of the LY method.
The paper is organized as follows. A new nonmonotone hybrid conjugate gradient algorithm is presented and the global convergence of the algorithm is proved in SectionÂ 2. The line convergence rate of the algorithm is shown in SectionÂ 3. In SectionÂ 4, numerical results are reported.
2 Nonmonotone hybrid conjugate gradient algorithm and global convergence
Now we present a nonmonotone hybrid conjugate gradient algorithm.
Algorithm 1

Step 1. Given \(x_{0}\in R^{n}\), \(\epsilon>0\), \(d_{0}=g_{0}\), \(C_{0}=f_{0}\), \(Q_{0}, \zeta_{0}, k:=0\).

Step 2. If \(\g_{k}\<\epsilon\), then stop. Otherwise, compute \(\alpha_{k}\) by (9) and (7), set \(x_{k+1}=x_{k}+\alpha_{k}d_{k}\).

Step 3. Compute \(\beta_{k+1}\) by (2), set \(d_{k+1}=g_{k+1}+\beta_{k+1}d_{k}\), \(k:=k+1\), and go to Step 2.
Assumption 1
We make the following assumptions:

(i)
The level set \({\Omega_{0}}=\{x\in R^{n}:{f{(x)}\leq f{(x_{0})}}\}\) is bounded, where \(x_{0}\) is the initial point.

(ii)
The gradient function \(g(x)=\nabla f(x)\) of the objective function f is Lipschitz continuous in a neighborhood \(\mathcal{N}\) of level set \({\Omega_{0}}\), i.e. there exists a constant \(L\geq0\) such that
$$\bigl\ g{(x)}g{(\bar{x})}\bigr\ \leq L\x\bar{x}\, $$for any \({x,\bar{x}}\in\mathcal{N}\).
Lemma 2.1
Let the sequence \(\{x_{k}\}\) be generated by Algorithm 1. Then \(d_{k}^{T}g_{k}<0\) holds for all \(k\geq1\).
Proof
From LemmaÂ 2 and LemmaÂ 3 in [5], the conclusion holds.â€ƒâ–¡
Lemma 2.2
Let AssumptionÂ 1 hold and the sequence \(\{x_{k}\}\) be obtained by Algorithm 1, \(\alpha_{k}\) satisfies the nonmonotone Wolfe conditions (9) and (7). Then
Proof
From (7), we have
and by (ii) of AssumptionÂ 1 it implies that
By combining these two inequalities, we obtain
â€ƒâ–¡
Lemma 2.3
Let the sequence \(\{x_{k}\}\) be generated by Algorithm 1 and \(d_{k}^{T}g_{k}<0\) hold for all \(k\geq1\). Then
Proof
See LemmaÂ 1.1 in [10].â€ƒâ–¡
Lemma 2.4
Let AssumptionÂ 1 hold, and the sequence \(\{x_{k}\}\) be obtained by Algorithm 1, where \(d_{k}\) satisfies \(d_{k}^{T}g_{k}<0\), \(\alpha_{k}\) is obtained by the nonmonotone Wolfe conditions (9) andÂ (7). Then
Proof
where \(c_{0}=\rho(1\sigma)/L\).
From (4), (5), and (13), we have
Since \(f(x)\) is bounded from below in the level set \(\Omega_{0}\) and by (11) for all k, we know that \(C_{k}\) is bounded from below. It follows from (14) that (12) holds.â€ƒâ–¡
Theorem 2.1
Suppose that AssumptionsÂ 1 hold and the sequence \(\{x_{k}\}\) is generated by the Algorithm 1. If \(\zeta_{\max}<1\), then either \(g_{k}=0\) for some k or
Proof
We prove by contradiction and assume that there exists a constant \(\epsilon>0\) such that
By LemmaÂ 4 in [5], we have \(\beta_{k}^{LY}\leq\frac{\mu\g_{k}\ ^{2}}{d_{k1}^{T}g_{k}\lambda d_{k1}^{T}g_{k1}}\). Then we have \({\d_{k}\}^{2}=(\beta^{LY})^{2} \d_{k1}\^{2}2g_{k}^{T}d_{k}\g_{k}\^{2} \leq(\frac{\mu\g_{k}\^{2}}{d_{k1}^{T}g_{k}\lambda d_{k1}^{T}g_{k1}})^{2}\d_{k1}\^{2}2g_{k}^{T}d_{k}\g_{k}\^{2}\). The rest of the proof is similar to TheoremÂ 2 and TheoremÂ 1 in [5], and we also conclude
Furthermore, by \(\zeta_{\max}<1\) and (5), we have
then
which indicates
3 Linear convergence rate of algorithm
We analyze the linearly convergence rate of the nonmonotone hybrid conjugate gradient method under the uniform convex assumption of \(f(x)\). The nonmonotone strong Wolfe line search is adopted in this section, given by (9) and
We suppose that the object function \(f(x)\) is twice continuously differentiable and uniformly convex on the level set \(\Omega_{0}\). Then the point \(x^{*}\) denotes a unique solution of the problem (1); there exists a positive constant Ï„ such that
The above conclusion (19) can be found in [10].
To analyze the convergence of the nonmonotone line search hybrid conjugate gradient method, the main difficulty is that the search directions do not usually satisfy the direction condition:
for some constant \(c>0\) and all \(k\geq1\). The following lemma has proven that the direction generated by Algorithm 1 with the strong Wolfe line search (9) and (18) in this paper satisfies the direction condition (20) by the observation for \(g_{k}^{T}d_{k1}\).
Lemma 3.1
Suppose that the sequence \(\{x_{k}\}\) is generated by Algorithm 1 with the strong Wolfe line search (9) and (18), \(0<\sigma <\frac{\lambda}{1+\mu}\). Then there exists some constant \(c>0\) such that the direction condition (20) holds.
Proof
According to the choice of the conjugate gradient parameter \(\beta _{k}^{LY}\), the result is discussed by two cases. In the first case,
then \(\beta_{k}=\frac{g_{k}^{T}(g_{k}d_{k1})}{d_{k1}^{T}(g_{k}g_{k1})}\). If \(\beta_{k}\geq0\), then, by (21) and \(d_{k1}^{T}g_{k1}<0\), \(d_{k1}^{T}g_{k}\geq0\) and \(g_{k}^{T}(g_{k}d_{k1})\geq0\). Furthermore, we have, by (18),
If \(\beta_{k}<0\), we have, by (18) and (21),
In the second case,
then \(\beta_{k}=\frac{\mu\g_{k}\^{2}}{d_{k1}^{T}g_{k}\lambda d_{k1}^{T}g_{k1}}>0\). By (18), we have
From (23) and (25), we obtain (20), where \(c=\min\{1\sigma\mu, 1\frac{\mu\sigma}{\lambda\sigma}\}>0\). The proof is completed.â€ƒâ–¡
Lemma 3.2
Suppose the assumptions of LemmaÂ 2.2 hold and, for all k,
then there exists a constant \(c_{2}>0\) such that
Proof
By LemmaÂ 2.2 and LemmaÂ 3.2, we have
where \(c_{2}=\frac{c(1\sigma)}{c_{1}L}\).â€ƒâ–¡
Theorem 3.1
Let \(x^{*}\) be the unique solution of problem (1) and the sequence \(\{ x_{k}\}\) be generated by Algorithm 1 with the nonmonotone Wolfe conditions (9) and (18), \(0<\sigma<\frac{\lambda}{1+\mu }\). If \(\alpha_{k}\leq\nu\) and \(\zeta_{\max}<1\), then there exists a constant \(\vartheta\in(0,1)\) such that
Proof
The proof is similar to that TheoremÂ 3.1 given in [10]. By (9), (20), and (27), we have
By (ii) in AssumptionÂ 1, \(x_{k+1}=x_{k}+\alpha_{k}d_{k}\) and (27), we have
In the first case, \(\g_{k}\^{2}\geq\beta(C_{k}f(x^{*}))\), where
Since \(Q_{k+1}\leq\frac{1}{1\zeta_{\max}}\) by (17), we have
By \(\g_{k}\^{2}\geq\beta(C_{k}f(x^{*}))\), we have
where \(\vartheta=1cc_{2}\rho\beta(1\zeta_{\max})\in(0, 1)\).
In the second case, \(\g_{k}\^{2}< \beta(C_{k}f(x^{*}))\). By (19) and (30), we have
By combining the equality, the first equation of (32), and \(Q_{k+1}\leq\frac{1}{1\zeta_{\max}}\), \(\zeta_{\max}<1\) and (31), we obtain
By (11), (33), and (34), we have
The proof is completed.â€ƒâ–¡
4 Numerical experiments
In this section, we report numerical results to illustrate the performance of hybrid conjugate gradient (LY) in [5], AlgorithmÂ 1 (NHLYCG1) and AlgorithmÂ 2 (NGLYCG2), in which (8) replaces only (9) in Step 2 of Algorithm 1. All codes are written with Matlab R2012a and are implemented on a PC with CPU 2.40 GHz and 2.00GB RAM. We select 12 smallscale and 28 largescale unconstrained optimization test functions from [15] and the CUTEr collection [16, 17] (see TableÂ 1). All algorithms implement the stronger version of the Wolfe condition with \(\rho=0.45\) and \(\sigma=0.39\), and \(\mu=0.5\), \(\lambda=0.6\), \(C_{0}=f_{0}\), \(Q_{0}=1\), \(\zeta _{0}=0.08\), \(\zeta_{1}=0.04\), \(\zeta_{k+1}=\frac{\zeta_{k}+\zeta_{k1}}{2}\), and the terminated condition
TableÂ 2 lists all the numerical results, which include the order numbers and dimensions of the tested problems, the number of iterations (it), the function evaluations (nf), the gradient evaluations (ng), and the CPU time (t) in seconds, respectively. We presented the Dolan and MorÃ© [18] performance profiles for the LY, NHLYCG1, and NGLYCG2. Note that the performance ratio \(q(\tau)\) is the probability for a solver s for the tested problems with the factor Ï„ of the smallest cost. As we can see from FigureÂ 1 and FigureÂ 2, NHLYCG1 is superior to LY and NGLYCG2 for the number of iterations and CPU time. FigureÂ 3 shows that NGLYCG1 is slightly better than LY and NGLYCG2 for the number of function value evaluations. FigureÂ 4 shows the performance of NGLYCG1 is very much like that of LY for the number of gradient evaluations. However, the performance of NGLYCG2 with the nonmonotone framework (3) is less than satisfactory.
References
Hager, WW, Zhang, H: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16, 170192 (2005)
Andrei, N: A hybrid conjugate gradient algorithm for unconstrained optimization as a convex combination of HestenesStiefel and DaiYuan. Stud. Inform. Control 17(4), 5570 (2008)
BabaieKafaki, S, MahdaviAmiri, N: Two modified hybrid conjugate gradient methods based on a hybrid secant equation. Math. Model. Anal. 18(1), 3252 (2013)
Dai, YH, Yuan, Y: An efficient hybrid conjugate gradient method for unconstrained optimization. Ann. Oper. Res. 103, 3347 (2001)
Lu, YL, Li, WY, Zhang, CM, Yang, YT: A class new conjugate hybrid gradient method for unconstrained optimization. J.Â Inf. Comput. Sci. 12(5), 19411949 (2015)
Yang, YT, Cao, MY: The global convergence of a new mixed conjugate gradient method for unconstrained optimization. J. Appl. Math. 2012, 93298 (2012)
Zheng, XF, Tian, ZY, Song, LW: The global convergence of a mixed conjugate gradient method with the Wolfe line search. Oper. Res. Trans. 13(2), 1824 (2009)
TouatiAhmed, D, Storey, C: Efficient hybrid conjugate gradient techniques. J. Optim. Theory Appl. 64(2), 379397 (1990)
Grippo, L, Lampariello, F, Lucidi, S: A nonmonotone line search technique for Newtonâ€™s method. SIAM J. Numer. Anal. 23, 707716 (1986)
Zhang, H, Hager, WW: A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim. 14, 10431056 (2004)
Zoutendijk, G: Nonlinear programming, computational methods. In: Abadie, J (ed.) Integer and Nonlinear Programming. NorthHolland, Amsterdam (1970)
AlBaali, M: Descent property and global convergence of the FletcherReeves method with inexact line search. IMA J.Â Numer. Anal. 5, 121124 (1985)
Gilbert, JC, Nocedal, J: Global convergence properties of conjugate gradient methods for optimization. SIAM J.Â Optim. 2(1), 2142 (1992)
Yu, GH, Zhao, YL, Wei, ZX: A descent nonlinear conjugate gradient method for largescale unconstrained optimization. Appl. Math. Comput. 187(2), 636643 (2007)
MorÃ©, JJ, Garbow, BS, Hillstrom, KE: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7, 1741 (1981)
Andrei, N: An unconstrained optimization test functions collection. Adv. Model. Optim. 10, 147161 (2008)
Gould, NIM, Orban, D, Toint, PL: CUTEr and SifDec: a constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 29, 373394 (2003)
Dolan, ED, MorÃ©, JJ: Benchmarking optimization software with performance profiles. Math. Program. 9, 201213 (2002)
Acknowledgements
This work is supported in part by the NNSF (11171003) of China.
Author information
Authors and Affiliations
Corresponding author
Additional information
Competing interests
The authors declare that they have no competing interests.
Authorsâ€™ contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Rights and permissions
Open Access This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
About this article
Cite this article
Li, W., Yang, Y. A nonmonotone hybrid conjugate gradient method for unconstrained optimization. J Inequal Appl 2015, 124 (2015). https://doi.org/10.1186/s1366001506441
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1366001506441