Three modified Polak-Ribière-Polyak conjugate gradient methods with sufficient descent property
 Min Sun^{1} and
 Jing Liu^{2}
https://doi.org/10.1186/s1366001506499
© Sun and Liu; licensee Springer. 2015
Received: 3 December 2014
Accepted: 30 March 2015
Published: 8 April 2015
Abstract
In this paper, three modified Polak-Ribière-Polyak (PRP) conjugate gradient methods for unconstrained optimization are proposed. They are based on the two-term PRP method proposed by Cheng (Numer. Funct. Anal. Optim. 28:1217-1230, 2007), the three-term PRP method proposed by Zhang et al. (IMA J. Numer. Anal. 26:629-640, 2006), and the descent PRP method proposed by Yu et al. (Optim. Methods Softw. 23:275-293, 2008). These modified methods possess the sufficient descent property without any line searches. Moreover, if the exact line search is used, they reduce to the classical PRP method. Under standard assumptions, we show that these three methods converge globally with a Wolfe line search. We also report some numerical results to show the efficiency of the proposed methods.
Keywords
conjugate gradient method, sufficient descent property, global convergence
1 Introduction
Note that the global convergence of the above three methods is established under an Armijo-type line search or the strong Wolfe line search. It is well known that the step size generated by the Armijo line search may approach zero, in which case the reduction of the objective function is very small and the optimization process slows down. The strong Wolfe line search can avoid this phenomenon when the parameter \(\sigma \rightarrow0^{+}\), but in this case it is close to the exact line search, so its computational load increases heavily. The Wolfe line search can also avoid the above phenomenon and, compared with the strong Wolfe line search, needs less computation to find a suitable step size at each iteration. Therefore, the Wolfe line search can enhance the efficiency of the conjugate gradient method.
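The difference between the line searches discussed above can be made concrete with a small check. The sketch below assumes the standard textbook form of the conditions (sufficient decrease with parameter `rho`, curvature with parameter `sigma`); the helper name `wolfe_flags` and the one-dimensional test function are illustrative choices, not taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def wolfe_flags(f, grad, x, d, alpha, rho=0.1, sigma=0.5):
    """Return (armijo, weak_curvature, strong_curvature) for a trial step alpha.

    armijo:           f(x + alpha d) <= f(x) + rho * alpha * g(x)^T d
    weak_curvature:   g(x + alpha d)^T d >= sigma * g(x)^T d
    strong_curvature: |g(x + alpha d)^T d| <= -sigma * g(x)^T d
    The direction d is assumed to be a descent direction, g(x)^T d < 0.
    """
    g0_d = dot(grad(x), d)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    g1_d = dot(grad(x_new), d)
    armijo = f(x_new) <= f(x) + rho * alpha * g0_d
    weak = g1_d >= sigma * g0_d
    strong = abs(g1_d) <= -sigma * g0_d
    return armijo, weak, strong


# Illustrative problem: f(x) = x^2, starting at x = 1 along d = -f'(1) = -2.
f = lambda x: x[0] ** 2
grad = lambda x: [2.0 * x[0]]

flags_tiny = wolfe_flags(f, grad, [1.0], [-2.0], 0.01)  # Armijo only
flags_mid = wolfe_flags(f, grad, [1.0], [-2.0], 0.5)    # all three hold
flags_long = wolfe_flags(f, grad, [1.0], [-2.0], 0.85)  # weak but not strong Wolfe
```

In this toy case the very short step passes the Armijo test but fails both curvature tests, while the long step is accepted by the Wolfe conditions and rejected by the strong Wolfe conditions, which is one way to see why the strong variant typically needs more trial steps.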
In this paper, we investigate some variations of the PRP method under a Wolfe line search. Specifically, we slightly modify \(\beta_{k}^{\mathrm{PRP}}\) and propose three modified PRP methods based on the search directions \(d_{k}^{\mathrm{CTPRP}}\), \(d_{k}^{\mathrm {ZTPRP}}\), and \(d_{k}^{\mathrm{YTPRP}}\), which possess not only the sufficient descent property for any line search but also global convergence with a Wolfe line search. The remainder of the paper is organized as follows. In Section 2, we propose the modified PRP methods and prove their convergence. In Section 3, we present some numerical results using the test problems in [17]. Section 4 concludes the paper with final remarks.
2 Three modified PRP methods
First, we give the following basic assumptions on the objective function \(f(x)\).
Assumptions
 (H1)
The level set \(R_{0}=\{x \mid f(x)\leq f(x_{0})\}\) is bounded.
 (H2)
In some neighborhood N of \(R_{0}\), the gradient \(g(x)\) is Lipschitz continuous on an open convex set B that contains \(R_{0}\), i.e., there exists a constant \(L>0\) such that
$$\bigl\Vert g(x)-g(y)\bigr\Vert \leq L\Vert x-y\Vert, \quad \text{for any } x, y\in B. $$
First, using the parameter \(\beta_{k}^{\mathrm{MPRP}}\) and the direction \(d_{k}^{\mathrm{CTPRP}}\), we present the following conjugate gradient method (denoted the TMPRP1 method).
TMPRP1 method
(Two-term modified PRP method)
 Step 0.:

Give an initial point \(x_{0}\in\mathcal{R}^{n}\), \(\mu\geq0\), \(0<\rho<\sigma<1\), and set \(d_{0}=-g_{0}\), \(k:=0\).
 Step 1.:

If \(\Vert g_{k}\Vert=0\) then stop; otherwise go to Step 2.
 Step 2.:

Compute \(d_{k}\) by
$$ d_{k}=\left \{ \begin{array}{l@{\quad}l} -g_{k},& \text{if } k=0, \\ - \bigl(1+\beta_{k}^{\mathrm{MPRP}}\frac{g_{k}^{\top}d_{k-1}}{\Vert g_{k}\Vert ^{2}} \bigr)g_{k}+\beta_{k}^{\mathrm{MPRP}}d_{k-1},& \text{if } k\geq1. \end{array} \right . $$(13)
Determine the step size \(\alpha_{k}\) by Wolfe line search (5).
 Step 3.:

Set \(x_{k+1}=x_{k}+\alpha_{k} d_{k}\), and \(k:=k+1\); go to Step 1.
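The direction update of Step 2 can be sketched in a few lines. Formula (11) for \(\beta_{k}^{\mathrm{MPRP}}\) is not reproduced in this section, so the code assumes the form suggested by the denominators used elsewhere in the paper: the PRP numerator \(g_{k}^{\top}(g_{k}-g_{k-1})\) divided by \(\Vert g_{k-1}\Vert^{2}+\mu|g_{k}^{\top}d_{k-1}|\). The function names and the sample vectors are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def beta_mprp(g, g_prev, d_prev, mu):
    """Assumed form of beta_k^MPRP (formula (11) is not shown in this section)."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / (dot(g_prev, g_prev) + mu * abs(dot(g, d_prev)))


def tmprp1_direction(g, g_prev, d_prev, mu):
    """Two-term direction for k >= 1:
    d = -(1 + beta * g^T d_prev / ||g||^2) * g + beta * d_prev.
    """
    beta = beta_mprp(g, g_prev, d_prev, mu)
    scale = 1.0 + beta * dot(g, d_prev) / dot(g, g)
    return [-scale * gi + beta * di for gi, di in zip(g, d_prev)]


# Arbitrary sample data; no line search is involved at this point.
g_prev = [1.0, -2.0, 0.5]
d_prev = [-1.0, 2.0, -0.5]
g = [0.3, 0.4, -1.2]

d = tmprp1_direction(g, g_prev, d_prev, mu=1e-4)
descent_gap = dot(g, d) + dot(g, g)  # g^T d + ||g||^2, zero up to rounding
```

Because the two terms containing \(\beta_{k}^{\mathrm{MPRP}}\) cancel in \(g_{k}^{\top}d_{k}\), the quantity `descent_gap` vanishes for any value of `beta`, which illustrates why the descent property of this direction does not depend on the line search.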
Similarly, using the parameter \(\beta_{k}^{\mathrm{MPRP}}\) and the direction \(d_{k}^{\mathrm{ZTPRP}}\), we present the following conjugate gradient method (denoted the TMPRP2 method).
TMPRP2 method
(Three-term modified PRP method)
 Step 0.:

Give an initial point \(x_{0}\in\mathcal{R}^{n}\), \(\mu\geq0\), \(0<\rho<\sigma<1\), and set \(d_{0}=-g_{0}\), \(k:=0\).
 Step 1.:

If \(\Vert g_{k}\Vert=0\) then stop; otherwise go to Step 2.
 Step 2.:

Compute \(d_{k}\) by
$$ d_{k}=\left \{ \begin{array}{l@{\quad}l} -g_{k},& \text{if } k=0, \\ -g_{k}+\beta_{k}^{\mathrm{MPRP}}d_{k-1}-\vartheta_{k} y_{k-1}, & \text{if } k\geq1, \end{array} \right . $$(14)
where \(\vartheta_{k}=g_{k}^{\top}d_{k-1}/(\Vert g_{k-1}\Vert^{2}+\mu|g_{k}^{\top}d_{k-1}|)\). Determine the step size \(\alpha_{k}\) by Wolfe line search (5).
 Step 3.:

Set \(x_{k+1}=x_{k}+\alpha_{k} d_{k}\), and \(k:=k+1\); go to Step 1.
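As a quick check of the descent property for the three-term direction (14), assume that \(\beta_{k}^{\mathrm{MPRP}}\) has numerator \(g_{k}^{\top}y_{k-1}\) and shares the denominator \(D_{k}=\Vert g_{k-1}\Vert^{2}+\mu|g_{k}^{\top}d_{k-1}|\) with \(\vartheta_{k}\) (formula (11) is not reproduced in this section, so this form is an assumption). Then, for \(k\geq1\),
$$ \begin{aligned} g_{k}^{\top}d_{k} &= -\Vert g_{k}\Vert^{2}+\beta_{k}^{\mathrm{MPRP}}g_{k}^{\top}d_{k-1}-\vartheta_{k}g_{k}^{\top}y_{k-1} \\ &= -\Vert g_{k}\Vert^{2}+\frac{(g_{k}^{\top}y_{k-1})(g_{k}^{\top}d_{k-1})}{D_{k}}-\frac{(g_{k}^{\top}d_{k-1})(g_{k}^{\top}y_{k-1})}{D_{k}} \\ &= -\Vert g_{k}\Vert^{2}. \end{aligned} $$
The two correction terms cancel for every step size, so under this assumption the identity holds independently of the line search.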
Using a parameter similar to \(\beta_{k}^{\mathrm{YPRP}}\), we present the following conjugate gradient method (denoted the TMPRP3 method).
TMPRP3 method
(Three-term descent PRP method)
 Step 0.:

Give an initial point \(x_{0}\in\mathcal{R}^{n}\), \(\mu\geq0\), \(t>1\), \(0<\rho<\sigma<1\), and set \(d_{0}=-g_{0}\), \(k:=0\).
 Step 1.:

If \(\Vert g_{k}\Vert=0\) then stop; otherwise go to Step 2.
 Step 2.:

Compute \(d_{k}\) by
$$ d_{k}=\left \{ \begin{array}{l@{\quad}l} -g_{k}, &\text{if } k=0, \\ -g_{k}+\beta_{k}^{\mathrm{VPRP}}d_{k-1}+\nu_{k} (y_{k-1}-s_{k-1}), &\text{if } k\geq1, \end{array} \right . $$(15)
where
$$ \begin{aligned} &\beta_{k}^{\mathrm{VPRP}}=\frac{g_{k}^{\top}(g_{k}-g_{k-1})}{\mu|g_{k}^{\top}d_{k-1}|+\Vert g_{k-1}\Vert^{2}}-t \frac{\Vert y_{k-1}\Vert^{2}g_{k}^{\top}d_{k-1}}{(\mu|g_{k}^{\top}d_{k-1}|+\Vert g_{k-1}\Vert^{2})^{2}}, \\ &\nu_{k}=\frac{g_{k}^{\top}d_{k-1}}{\mu|g_{k}^{\top}d_{k-1}|+\Vert g_{k-1}\Vert^{2}}. \end{aligned} $$(16)
Determine the step size \(\alpha_{k}\) by Wolfe line search (5).
 Step 3.:

Set \(x_{k+1}=x_{k}+\alpha_{k} d_{k}\), and \(k:=k+1\); go to Step 1.
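The TMPRP3 direction can be sketched in the same way. The code below reads (15) and (16) with \(y_{k-1}=g_{k}-g_{k-1}\) and \(s_{k-1}=\alpha_{k-1}d_{k-1}\), and with absolute values around \(g_{k}^{\top}d_{k-1}\) in the denominators; these readings, the function name, and the sample data are assumptions. Under them, a short argument based on \(2a^{\top}b\leq\Vert a\Vert^{2}+\Vert b\Vert^{2}\) suggests the bound \(g_{k}^{\top}d_{k}\leq-(1-1/t)\Vert g_{k}\Vert^{2}\), which the sketch checks numerically on one instance.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def tmprp3_direction(g, g_prev, d_prev, s_prev, mu, t):
    """Three-term direction for k >= 1 (assumed reading of (15)-(16)):
    d = -g + beta * d_prev + nu * (y - s_prev), with y = g - g_prev.
    """
    y = [a - b for a, b in zip(g, g_prev)]
    gd = dot(g, d_prev)
    denom = mu * abs(gd) + dot(g_prev, g_prev)
    beta = dot(g, y) / denom - t * dot(y, y) * gd / denom ** 2
    nu = gd / denom
    return [-gi + beta * di + nu * (yi - si)
            for gi, di, yi, si in zip(g, d_prev, y, s_prev)]


# Arbitrary sample data; alpha_prev is a hypothetical previous step size.
alpha_prev = 0.25
g_prev = [1.0, -2.0, 0.5]
d_prev = [-1.0, 2.0, -0.5]
s_prev = [alpha_prev * di for di in d_prev]
g = [0.3, 0.4, -1.2]
t = 2.0

d = tmprp3_direction(g, g_prev, d_prev, s_prev, mu=1e-4, t=t)
# Nonnegative when g^T d <= -(1 - 1/t) * ||g||^2 holds.
slack = -(1.0 - 1.0 / t) * dot(g, g) - dot(g, d)
```

The term involving `s_prev` contributes \(-\alpha_{k-1}(g_{k}^{\top}d_{k-1})^{2}\) divided by a positive denominator to \(g_{k}^{\top}d_{k}\), so it can only make the direction more downhill; dropping it leaves the bound intact.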
Remark 2.1
If the constant \(\mu=0\), then the TMPRP1 method and TMPRP2 method reduce to the methods proposed by Cheng [9] and Zhang et al. [10], respectively, and the TMPRP3 method reduces to a method similar to that proposed by Yu et al. [20].
Remark 2.2
Obviously, if the line search is exact, then the direction generated by (13) or (14) or (15) reduces to (3) with \(\beta _{k}=\beta_{k}^{\mathrm{PRP}}\). Therefore, in the following, we assume that \(\mu>0\).
Remark 2.3
Lemma 2.1
Proof
Remark 2.4
From the proof of Lemma 2.1, we can see that if the term \(s_{k-1}\) in \(d_{k}\) is deleted, then the above sufficient descent property still holds.
The global convergence proofs of the above three methods are similar. Here we only prove the global convergence of the TMPRP1 method; the argument for the other two methods is analogous.
The following lemma, called the Zoutendijk condition, is often used to prove global convergence of conjugate gradient methods. It was originally given by Zoutendijk in [21].
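For reference, the Zoutendijk condition is usually stated in the following form. This is the standard textbook statement for descent directions \(d_{k}\) with step sizes satisfying the Wolfe conditions under assumptions such as (H1) and (H2); its notation may differ from that of the lemma below.
$$ \sum_{k\geq0}\frac{(g_{k}^{\top}d_{k})^{2}}{\Vert d_{k}\Vert^{2}}< +\infty. $$
If, in addition, the directions satisfy \(g_{k}^{\top}d_{k}=-\Vert g_{k}\Vert^{2}\), this becomes
$$ \sum_{k\geq0}\frac{\Vert g_{k}\Vert^{4}}{\Vert d_{k}\Vert^{2}}< +\infty, $$
which is the form typically combined with a bound on \(\Vert d_{k}\Vert\) to conclude that the gradients tend to zero.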
Lemma 2.2
Definition 2.1
Now we prove the strong global convergence of the TMPRP1 method for uniformly convex functions.
Lemma 2.3
Proof
See Lemma 2.1 in [22]. □
The proof of the following theorem is similar to that of Theorem 2.1 in [22]. For completeness, we give the proof.
Theorem 2.1
Proof
We now investigate the global convergence of the TMPRP1 method with Wolfe line search (5) for nonconvex functions. In the last part of this subsection, we use \(\beta_{k}^{\mathrm{MPRP}+}\) to replace \(\beta _{k}^{\mathrm{MPRP}}\) in (13).
The next lemma corresponds to Lemma 4.3 in [23] and Theorem 3.2 in [24].
Lemma 2.4
Proof
The following theorem establishes the global convergence of the TMPRP1 method with Wolfe line search (5) for general nonconvex functions. The proof is analogous to that of Theorem 3.2 in [24].
Theorem 2.2
Proof
Remark 2.5
From Theorem 2.2, we can see that the TMPRP1 method possesses better convergence properties than the CTPRP method in [2]: the TMPRP1 method converges globally for nonconvex minimization problems with a Wolfe line search, whereas the CTPRP method converges globally for nonconvex minimization problems with a strong Wolfe line search. We also note that the term \(\mu|g_{k}^{\top}d_{k-1}|\) in the denominator of (11) plays an important role in the proof of Lemma 2.4.
3 Numerical results

TMPRP1: the TMPRP1 method with Wolfe line search (5), with \(\mu=10^{4}\), \(\rho=0.1\), \(\sigma=0.5\);

CG_DESCENT: the CG_DESCENT method with Wolfe line search (5), with \(\rho=0.1\), \(\sigma=0.5\);

DTPRP: the DTPRP method with Wolfe line search (5), with \(\mu =1.2\), \(\rho=0.1\), \(\sigma=0.5\).
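To show how such a configuration fits together, the following is a minimal sketch of a TMPRP1-style loop with a simple bracketing search for the (weak) Wolfe conditions, using \(\rho=0.1\) and \(\sigma=0.5\). It is not the authors' implementation: the assumed form of \(\beta_{k}^{\mathrm{MPRP}}\), the bisection/expansion line search, the value `mu=1e-4`, the stopping tolerance, and the small quadratic test function are all illustrative assumptions.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def wolfe_step(f, grad, x, d, rho=0.1, sigma=0.5, max_iter=60):
    """Bisection/expansion search for a step satisfying the weak Wolfe conditions."""
    lo, hi, alpha = 0.0, math.inf, 1.0
    fx, gd = f(x), dot(grad(x), d)
    for _ in range(max_iter):
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_new) > fx + rho * alpha * gd:      # sufficient decrease fails: shrink
            hi = alpha
        elif dot(grad(x_new), d) < sigma * gd:    # curvature fails: enlarge
            lo = alpha
        else:
            return alpha
        alpha = 0.5 * (lo + hi) if hi < math.inf else 2.0 * lo
    return alpha  # last trial step if no acceptable step was found


def tmprp1(f, grad, x0, mu=1e-4, tol=1e-6, max_iter=2000):
    """TMPRP1-style iteration; returns (approximate minimizer, iterations used)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                          # Step 0: d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol:            # Step 1, with a tolerance
            return x, k
        alpha = wolfe_step(f, grad, x, d)          # Step 2: step size
        x = [xi + alpha * di for xi, di in zip(x, d)]   # Step 3
        g_new = grad(x)
        gg = dot(g_new, g_new)
        if gg > 0.0:                               # Step 2: next direction
            y = [a - b for a, b in zip(g_new, g)]
            g_new_d = dot(g_new, d)
            beta = dot(g_new, y) / (dot(g, g) + mu * abs(g_new_d))  # assumed (11)
            scale = 1.0 + beta * g_new_d / gg
            d = [-scale * gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
    return x, max_iter


# Illustrative strictly convex quadratic f(x) = 0.5 * sum(c_i * x_i^2).
c = [1.0, 4.0, 10.0]
f = lambda x: 0.5 * sum(ci * xi * xi for ci, xi in zip(c, x))
grad = lambda x: [ci * xi for ci, xi in zip(c, x)]

x_min, iters = tmprp1(f, grad, [1.0, 1.0, 1.0])
```

The loop follows Steps 0 to 3 literally, except that the exact test \(\Vert g_{k}\Vert=0\) is replaced by a tolerance, as is usual in floating-point arithmetic.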
The results for the methods on the tested problems
Problem  n  TMPRP1  CG_DESCENT  DTPRP

Freudenstein and Roth  100  52/1,017/0.4688  53/1,030/0.4219  94/2,037/0.8125
Trigonometric  5,000  118/539/5.6094  75/603/5.6250  57/170/2.2813 
Extended Rosenbrock  5,000  44/868/1.7344  119/2,195/3.6875  54/956/1.8281 
Generalized Rosenbrock  10  223/5,114/1.2500  567/13,632/3.5156  305/6,522/1.6719 
White  1,000  48/874/1.3594  101/2,321/3.3438  71/1,474/2.0625 
Beale  5,000  45/933/3.7344  98/2,182/8.3906  43/555/2.2031 
Penalty  5,000  30/593/1.2969  26/516/1.1094  F 
Perturbed quadratic  100  92/1,674/0.5469  114/1,974/0.6563  99/2,302/0.6875 
Raydan 1  500  171/3,083/1.9219  231/3,882/2.3125  150/2,333/1.4688 
Raydan 2  5,000  5/6/0.3750  6/60/0.6250  6/7/0.3438 
Diagonal 1  100  88/1,462/0.6250  74/880/0.4063  83/1,608/0.6563 
Diagonal 2  100  780/781/0.5938  104/341/0.1875  780/781/0.5313 
Diagonal 3  100  101/1,492/0.7188  154/2,321/1.0781  77/767/0.4219 
Hager  100  44/640/0.3281  32/251/0.1563  34/403/0.2188 
Generalized tridiagonal 1  1,000  41/578/1.4844  31/403/1.0469  58/1,078/2.8594 
Extended tridiagonal 1  1,000  41/432/1.1250  40/497/1.2188  46/724/1.8281 
Extended three expo terms  5,000  45/759/5.5156  31/246/2.0781  21/174/1.5469 
Generalized tridiagonal 2  1,000  56/785/1.3594  404/11,638/19.5938  61/1,031/1.7813
Diagonal 4  5,000  48/815/1.3906  128/2,383/3.6406  55/673/1.1719 
Diagonal 5  5,000  4/5/0.2969  4/8/0.3906  4/5/0.3281 
Extended Himmelblau  5,000  30/438/1.0625  23/214/0.7500  20/178/0.7344 
Generalized PSC1  5,000  222/1,100/5.7344  672/5,554/27.0156  F 
Extended PSC1  5,000  55/916/5.0156  24/173/1.5156  24/187/1.4375 
Extended Powell  5,000  193/2,649/17.4844  F  536/7,005/42.2031 
Extended BD1  5,000  35/431/1.5156  49/856/2.8281  33/452/1.6250 
Extended Maratos  1,000  66/1,121/0.6563  F  136/2,206/1.2344 
Extended Cliff  5,000  48/262/1.6094  123/1,275/6.5000  F 
Quadratic diagonal perturbed  5,000  433/6,793/3.6875  F  247/3,834/2.1094 
Extended Wood  5,000  199/2,976/5.7188  F  131/2,075/4.1094 
Extended Hiebert  5,000  2/32/0.4844  2/33/0.5313  3/62/0.5469 
Quadratic QF1  5,000  731/12,453/19.8438  790/13,180/20.1875  882/14,508/22.3906 
Extended QP1  1,000  65/1,662/1.0156  25/361/0.2813  16/157/0.1875 
Extended QP2  5,000  64/988/5.1719  143/2,919/13.9219  78/1,170/6.4063 
Quadratic QF2  5,000  777/14,146/24.6406  968/17,331/31.9219  814/14,358/24.7344 
Extended EP1  5,000  101/2,391/6.6094  12/195/1.0000  136/3,136/8.0781 
Extended tridiagonal 2  5,000  46/615/2.0156  63/1,270/3.4844  32/170/0.9219 
BDQRTIC  100  159/2,473/0.8438  F  185/3,133/1.0156 
TRIDIA  100  310/4,816/1.5938  440/7,143/2.2344  364/6,190/1.8281 
ARWHEAD  5,000  35/702/2.2813  F  F 
NONDIA  5,000  30/626/1.8906  F  209/4,327/10.7344 
NONDQUAR  5  713/779/0.4531  97/809/0.2813  F 
DQDRTIC  5,000  80/1,234/2.7031  117/2,386/4.8750  81/1,108/2.4688 
EG2  100  165/2,715/1.5000  85/1,136/0.7969  F
DIXMAANA  5,001  21/191/7.0938  13/177/6.4688  10/70/2.8281 
DIXMAANB  5,001  22/45/2.0000  13/127/4.7344  7/14/0.8750 
DIXMAANC  5,001  17/136/5.1719  15/231/8.7500  6/17/0.9219 
DIXMAANE  102  346/451/0.7031  186/5,359/5.4844  321/325/0.5313 
Partial perturbed quadratic  100  87/1,905/1.6094  116/2,180/1.6719  77/1,326/0.9844 
Broyden tridiagonal  5,000  114/1,927/5.1719  101/1,884/4.9531  119/2,029/5.2656 
Almost perturbed quadratic  5,000  854/19,329/30.2969  F  866/19,516/32.0938 
Tridiagonal perturbed quadratic  5,000  760/16,744/42.7813  959/22,230/52.9063  774/17,831/43.9063 
EDENSCH  1,000  40/615/1.7188  35/450/1.1875  49/1,273/3.0938 
HIMMELBHA  5,000  15/69/1.4063  F  17/18/0.6406 
STAIRCASE S1  100  341/5,058/1.5781  F  510/7,591/2.4844 
LIARWHD  5,000  39/727/1.9688  165/3,873/9.7500  262/6,799/16.2500 
DIAGONAL 6  5,000  5/6/0.3594  6/60/0.5313  6/7/0.3594 
DIXON3DQ  100  578/9,208/3.0625  F  499/7,241/2.1563 
ENGVAL1  5,000  36/611/1.7344  52/1,264/3.3906  F 
DENSCHNA  5,000  23/249/1.9063  27/318/2.5938  19/93/1.0469 
DENSCHNB  5,000  21/54/0.4844  10/79/0.4531  20/327/0.9063 
DENSCHNC  5,000  23/191/2.6094  34/357/4.5938  F 
DENSCHNF  5,000  25/343/1.1094  24/348/1.1250  23/363/1.2656 
SINQUAD  100  505/10,201/4.1250  F  F 
BIGGSB1  100  489/5,248/1.7031  F  533/5,660/1.6406 
Extended blockdiagonal  1,000  30/506/0.8906  36/508/0.8750  26/374/0.5938 
Generalized quartic 1  5,000  21/159/0.7500  18/342/1.0469  36/777/1.8594 
DIAGONAL 7  5,000  53/2,509/14.1563  54/2,477/13.8125  F 
DIAGONAL 8  5,000  57/2,710/18.3906  56/2,622/17.5781  F 
Full Hessian  5,000  17/239/1.6094  18/305/1.9688  46/1,643/8.8750 
SINCOS  5,000  26/250/1.5313  22/132/1.1719  F 
Generalized quartic 2  5,000  48/996/2.4688  35/606/1.4375  39/704/1.6250 
EXTROSNB  5,000  39/741/1.7500  159/7,756/14.8750  43/1,010/2.3281 
ARGLINB  100  101/5,024/1.7969  111/5,498/1.7813  23/691/0.3125 
FLETCHCR  5,000  61/1,662/4.0469  36/661/1.7813  61/1,976/4.7969 
HIMMELBG  2  F  2/4/0.0313  F 
HIMMELBH  5,000  18/103/0.7969  23/224/1.1406  16/91/0.6719 
DIAGONAL 9  5,000  1/3/0.3594  1/3/0.3906  1/3/0.3594 
4 Conclusion
This paper proposed three modified PRP conjugate gradient methods, which improve on recently proposed PRP conjugate gradient methods. The global convergence of the proposed methods is established under the Wolfe line search, and their effectiveness has been shown by some numerical examples. We find that the performance of the TMPRP1 method is related to the parameter μ in \(\beta_{k}^{\mathrm{MPRP}}\); therefore, how to choose a suitable parameter μ deserves further investigation.
Declarations
Acknowledgements
The authors gratefully acknowledge the helpful comments and suggestions of the anonymous reviewers. This work was partially supported by the domestic visiting scholar project funding of Shandong Province outstanding young teachers in higher schools, the foundation of Scientific Research Project of Shandong Universities (No. J13LI03), and the Shandong Province Statistical Research Project (No. 20143038).
Open Access This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
References
 [1] Fletcher, R, Reeves, C: Function minimization by conjugate gradients. J. Comput. 7, 149-154 (1964)
 [2] Polak, B, Ribière, G: Note sur la convergence des méthodes de directions conjuguées. Rev. Fr. Inform. Rech. Oper. 16, 35-43 (1969)
 [3] Gilbert, JC, Nocedal, J: Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 2, 21-42 (1992)
 [4] Liu, YL, Storey, CS: Efficient generalized conjugate gradient algorithms, part 1: theory. J. Optim. Theory Appl. 69, 129-137 (1991)
 [5] Dai, YH, Yuan, YX: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10, 177-182 (2000)
 [6] Hestenes, MR, Stiefel, EL: Method of conjugate gradient for solving linear systems. J. Res. Natl. Bur. Stand. 49, 409-432 (1952)
 [7] Fletcher, R: Practical Methods of Optimization. Volume 1: Unconstrained Optimization. Wiley, New York (1987)
 [8] Grippo, L, Luidi, S: A globally convergent version of the Polak-Ribière gradient method. Math. Program. 78, 375-391 (1997)
 [9] Cheng, WY: A two-term PRP-based descent method. Numer. Funct. Anal. Optim. 28, 1217-1230 (2007)
 [10] Zhang, L, Zhou, WJ, Li, DH: A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 26, 629-640 (2006)
 [11] Yu, GH, Guan, LT, Chen, WF: Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization. Optim. Methods Softw. 23, 275-293 (2008)
 [12] Dolan, ED, Moré, JJ: Benchmarking optimization software with performance profiles. Math. Program. 91, 201-213 (2002)
 [13] Wei, ZX, Li, G, Qi, LQ: Global convergence of the Polak-Ribière-Polyak conjugate gradient method with inexact line searches for nonconvex unconstrained optimization problems. Math. Comput. 77, 2173-2193 (2008)
 [14] Li, G, Tang, CM, Wei, ZX: New conjugacy condition and related new conjugate gradient methods for unconstrained optimization. J. Comput. Appl. Math. 202, 523-539 (2007)
 [15] Yu, G, Guan, L, Li, G: Global convergence of modified Polak-Ribière-Polyak conjugate gradient methods with sufficient descent property. J. Ind. Manag. Optim. 3, 565-579 (2008)
 [16] Dai, YH, Kou, CX: A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search. SIAM J. Optim. 23, 296-320 (2013)
 [17] Neculai, A: Unconstrained optimization by direct searching (2007). http://camo.ici.ro/neculai/UNO/UNO.FOR
 [18] Wei, ZX, Yao, SW, Liu, LY: The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 183, 1341-1350 (2006)
 [19] Dai, ZF, Wen, FH: Another improved Wei-Yao-Liu nonlinear conjugate gradient method with sufficient descent property. Appl. Math. Comput. 218, 7421-7430 (2012)
 [20] Yu, GH, Zhao, YL, Wei, ZX: A descent nonlinear conjugate gradient method for large-scale unconstrained optimization. Appl. Math. Comput. 187, 636-643 (2007)
 [21] Zoutendijk, G: Nonlinear programming, computational methods. In: Abadie, J (ed.) Integer and Nonlinear Programming, pp. 37-86. North-Holland, Amsterdam (1970)
 [22] Dai, ZF, Tian, BS: Global convergence of some modified PRP nonlinear conjugate gradient methods. Optim. Lett. 5(4), 615-630 (2011)
 [23] Dai, YH, Liao, LZ: New conjugacy conditions and related nonlinear conjugate gradient methods. Appl. Math. Optim. 43, 97-101 (2001)
 [24] Hager, WW, Zhang, HC: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2, 35-58 (2006)
 [25] Wu, QZ, Zheng, ZY, Deng, W: Operations Research and Optimization, MATLAB Programming, pp. 66-69. China Machine Press, Beijing (2010)