Open Access

Global convergence of a modified conjugate gradient method

Journal of Inequalities and Applications 2014, 2014:248

https://doi.org/10.1186/1029-242X-2014-248

Received: 19 March 2014

Accepted: 4 June 2014

Published: 18 July 2014

Abstract

A modified conjugate gradient method for solving unconstrained optimization problems is proposed. The method satisfies the sufficient descent condition under the strong Wolfe line search, and its global convergence is established in a simple way. Numerical results show that the proposed method is promising on the given test problems.

MSC: 90C26, 65H10.

Keywords

unconstrained optimization; conjugate gradient method; sufficient descent condition; global convergence

1 Introduction

The nonlinear conjugate gradient method is one of the most effective methods for solving unconstrained optimization problems. It comprises a class of unconstrained optimization algorithms characterized by low memory requirements and strong local or global convergence properties. In this paper, a modified nonlinear conjugate gradient method is therefore proposed and analyzed.

Consider the following unconstrained optimization problem:
$\min_{x \in \mathbb{R}^n} f(x),$
(1.1)

where $f : \mathbb{R}^n \to \mathbb{R}$ is a smooth function and its gradient is denoted by $g$.

The conjugate gradient methods for solving the above problem often use the following iterative rules:
$x_{k+1} = x_k + \alpha_k d_k,$
(1.2)
where $x_k$ is the current iterate, the stepsize $\alpha_k$ is a positive scalar generated by some line search, and the search direction $d_k$ is defined by
$d_k = \begin{cases} -g_k, & \text{for } k = 1; \\ -g_k + \beta_k d_{k-1}, & \text{for } k \ge 2, \end{cases}$
(1.3)
where $g_k = \nabla f(x_k)$ and $\beta_k$ is the conjugate parameter, which determines the performance of the corresponding method. There are many well-known choices of $\beta_k$, such as
$\beta_k^{\mathrm{PRP}} = \dfrac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}$  (Polak-Ribière-Polyak (PRP) [1, 2]),

$\beta_k^{\mathrm{LS}} = \dfrac{g_k^T (g_k - g_{k-1})}{-d_{k-1}^T g_{k-1}}$  (Liu-Storey (LS) [3]),

$\beta_k^{\mathrm{HZ}} = \left( y_{k-1} - 2 d_{k-1} \dfrac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T \dfrac{g_k}{d_{k-1}^T y_{k-1}}$  (Hager-Zhang (HZ) [4]),

where $\|\cdot\|$ is the Euclidean norm and $y_{k-1} = g_k - g_{k-1}$. Their corresponding methods are generally called the PRP, LS, and HZ conjugate gradient methods. If $f$ is a strictly convex quadratic function, these methods are equivalent when an exact line search is used. If $f$ is non-convex, their behaviors may differ.
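For concreteness, the following is a minimal sketch of the three classical conjugate parameters above in Python/NumPy (the function name and signature are illustrative, not from the paper); gradients and directions are stored as 1-D arrays.

```python
import numpy as np

def beta_classical(g, g_prev, d_prev, kind="PRP"):
    """Classical conjugate parameters, following the formulas above (illustrative sketch)."""
    y = g - g_prev                                   # y_{k-1} = g_k - g_{k-1}
    if kind == "PRP":                                # Polak-Ribiere-Polyak
        return (g @ y) / (g_prev @ g_prev)
    if kind == "LS":                                 # Liu-Storey
        return (g @ y) / (-(d_prev @ g_prev))
    if kind == "HZ":                                 # Hager-Zhang
        dy = d_prev @ y
        return ((y - 2.0 * d_prev * (y @ y) / dy) @ g) / dy
    raise ValueError("unknown kind")
```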

When the objective function is convex, Polak and Ribière [1] proved that the PRP method is globally convergent under the exact line search. However, Powell [5] showed that the PRP method does not converge globally for some non-convex functions. Nevertheless, the PRP method has generally been regarded as one of the most efficient conjugate gradient methods in practical computation. One remarkable property of the PRP method is that it essentially performs a restart if a bad direction occurs (see [6]). Powell [5] also constructed an example showing that the PRP method can cycle infinitely without approaching any stationary point even if an exact line search is used. This counter-example indicates that the PRP method may fail to be globally convergent when the objective function is non-convex. Recently, Zhang et al. [7] proposed a descent modified PRP conjugate gradient method and proved its global convergence. The LS method has properties similar to those of the PRP method. The global convergence of the LS method with the Grippo-Lucidi line search has been proved in [8], and the LS method has been studied further by several researchers (see Liu [9], Liu and Du [10]). In addition, Hager and Zhang [4] proposed another effective method, namely the CG-DESCENT method. It not only has stable convergence, but it also performs well numerically. In this method, the parameter $\beta_k$ is computed by $\beta_k = \max\{\beta_k^{\mathrm{HZ}}, \eta_k\}$, where $\eta_k = \dfrac{-1}{\|d_{k-1}\| \min\{\eta, \|g_{k-1}\|\}}$, $\eta > 0$.

In the next section, a modified conjugate gradient method is proposed. In Section 3, we prove the global convergence of the proposed method for non-convex functions in the case of the strong Wolfe line search. In Section 4, we report some numerical results.

2 The new algorithm

Recently, several variants of the LS method have been studied. For example, Li et al. [11] proposed a modified LS method in which the parameter $\beta_k$ is computed by
$\beta_k = \dfrac{g_k^T (g_k - g_{k-1})}{-d_{k-1}^T g_{k-1}} - t \dfrac{\|g_k - g_{k-1}\|^2 \, d_{k-1}^T g_k}{(d_{k-1}^T g_{k-1})^2},$
where $t > \frac{1}{4}$ is a constant. They proved the global convergence of the modified method with the Armijo line search and with the Wolfe line search. Tang et al. [12] proved the global convergence of the LS method with a new line search. Liu et al. [13] studied a modified LS method in which the parameter $\beta_k$ is computed by
$\beta_k^{\mathrm{LS2}} = \begin{cases} \dfrac{g_k^T (g_k - g_{k-1})}{\rho |g_k^T d_{k-1}| - g_{k-1}^T d_{k-1}}, & \text{if } \min\{1, \rho - 1 - \xi\} \|g_k\|^2 > |g_k^T g_{k-1}|, \\ 0, & \text{otherwise}, \end{cases}$
where $\rho > 1 + \xi$, $\xi > 0$. They proved the global convergence of the corresponding method with the Wolfe line search. In 2006, Wei et al. [14] proposed a modified PRP method in which the parameter $\beta_k$ is given by
$\beta_k = \dfrac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}.$
They proved its global convergence with the exact line search, the strong Wolfe line search, and the Grippo-Lucidi line search, respectively. Their work overcomes the possible lack of global convergence of the PRP method. Inspired by their work, we consider a variant of the LS method, i.e.
$\beta_k^{\mathrm{VLS}} = \dfrac{g_k^T (g_k - t_k g_{k-1})}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}},$
(2.1)

where $t_k = \frac{\|g_k\|}{\|g_{k-1}\|}$, $\lambda \in (0, 1)$ and $\lambda > 2\sigma$. Obviously, the denominator of (2.1) is a convex combination of $-d_{k-1}^T g_{k-1}$ and $\max\{0, g_k^T d_{k-1}\}$, which may prevent the denominator of $\beta_k^{\mathrm{LS}}$ from tending to zero. We now state formally the corresponding algorithm for unconstrained optimization problems.

Algorithm 2.1

  • Step 0: Given an initial point $x_1 \in \mathbb{R}^n$, $\varepsilon \ge 0$, $\lambda = 0.8$. Set $k = 1$.

  • Step 1: If $\|g_1\| \le \varepsilon$, then stop.

  • Step 2: Compute $\alpha_k$ by the strong Wolfe line search ($0 < \delta < \sigma < \frac{1}{2}$):
    $f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k,$
    (2.2)
    $|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k.$
    (2.3)
  • Step 3: Let $x_{k+1} = x_k + \alpha_k d_k$ and $g_{k+1} = g(x_{k+1})$; if $\|g_{k+1}\| \le \varepsilon$, then stop.

  • Step 4: Compute $\beta_{k+1}$ by (2.1), and generate $d_{k+1}$ by (1.3).

  • Step 5: Set $k = k + 1$ and go to Step 2.
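As an aid to the reader, the following is a minimal Python sketch of Algorithm 2.1 (this is not the authors' Fortran code; all names are illustrative). It uses SciPy's line_search, which enforces the standard strong Wolfe conditions with parameters c1 and c2, here set to $\delta$ and $\sigma$; the sign convention of the denominator of (2.1) is assumed as written above.

```python
import numpy as np
from scipy.optimize import line_search

def beta_vls(g, g_prev, d_prev, lam=0.8):
    """beta_k^{VLS} of (2.1); denominator taken as
    -lambda*d_{k-1}^T g_{k-1} + (1-lambda)*max{0, g_k^T d_{k-1}} (assumed sign convention)."""
    t = np.linalg.norm(g) / np.linalg.norm(g_prev)
    denom = -lam * (d_prev @ g_prev) + (1.0 - lam) * max(0.0, g @ d_prev)
    return (g @ (g - t * g_prev)) / denom

def cg_vls(f, grad, x0, eps=1e-6, delta=0.01, sigma=0.1, lam=0.8, max_iter=10000):
    """Illustrative sketch of Algorithm 2.1 with a strong Wolfe line search."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                       # d_1 = -g_1 by (1.3)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:             # Steps 1 and 3: stopping test
            break
        # Step 2: strong Wolfe step length (SciPy enforces (2.2)-(2.3) with c1=delta, c2=sigma).
        alpha = line_search(f, grad, x, d, gfk=g, c1=delta, c2=sigma)[0]
        if alpha is None:                        # crude fallback if the line search fails
            alpha = 1e-4
        x = x + alpha * d                        # Step 3
        g_new = grad(x)
        b = beta_vls(g_new, g, d, lam)           # Step 4: beta_{k+1} by (2.1)
        d = -g_new + b * d                       #          d_{k+1} by (1.3)
        g = g_new
    return x

# Tiny usage example on a convex quadratic f(x) = 0.5 x^T A x.
if __name__ == "__main__":
    A = np.diag(np.arange(1.0, 6.0))
    f = lambda x: 0.5 * x @ A @ x
    grad = lambda x: A @ x
    print(cg_vls(f, grad, np.ones(5)))
```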

In some references, the sufficient descent condition
$g_k^T d_k \le -c \|g_k\|^2, \quad c > 0,$
(2.4)

is always assumed to hold, because it plays an important role in proving the global convergence of conjugate gradient methods. Fortunately, in this paper the search direction $d_k$ satisfies the sufficient descent condition under the strong Wolfe line search without any additional assumption.

Lemma 2.1 Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.1; then we have
$g_k^T d_k \le -\left(1 - \dfrac{2\sigma}{\lambda}\right) \|g_k\|^2.$
(2.5)
Proof The conclusion can be proved by induction. Since $g_1^T d_1 = -\|g_1\|^2$, the conclusion (2.5) holds for $k = 1$. Now we assume that the conclusion (2.5) holds for some $k \ge 1$ and that $g_{k+1} \ne 0$. One gets from (1.3) that
$\dfrac{g_{k+1}^T d_{k+1}}{\|g_{k+1}\|^2} = -1 + \beta_{k+1}^{\mathrm{VLS}} \dfrac{g_{k+1}^T d_k}{\|g_{k+1}\|^2} \le -1 + |\beta_{k+1}^{\mathrm{VLS}}| \dfrac{|g_{k+1}^T d_k|}{\|g_{k+1}\|^2} \le -1 + \dfrac{\|g_{k+1}\|^2 + \frac{\|g_{k+1}\|}{\|g_k\|} |g_{k+1}^T g_k|}{\lambda |g_k^T d_k|} \cdot \dfrac{|g_{k+1}^T d_k|}{\|g_{k+1}\|^2} \le -1 + \dfrac{2 \|g_{k+1}\|^2}{\lambda |g_k^T d_k|} \cdot \dfrac{\sigma |g_k^T d_k|}{\|g_{k+1}\|^2} = -1 + \dfrac{2\sigma}{\lambda}.$

From the above inequality, the conclusion (2.5) holds for $k + 1$. Thus, the conclusion (2.5) holds for all $k \in \mathbb{N}^+$. □

Remark 2.1 From (2.5) and the definition of β k VLS , it is not difficult to find that
$\beta_k^{\mathrm{VLS}} = \dfrac{g_k^T (g_k - t_k g_{k-1})}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}} \ge \dfrac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} |g_k^T g_{k-1}|}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}} \ge \dfrac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} \|g_k\| \|g_{k-1}\|}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}} = 0.$

3 Global convergence of Algorithm 2.1

In order to prove the global convergence of Algorithm 2.1, the following assumptions for the objective function are often used.

Assumption (H)
  1. (i) The level set $\Omega = \{x \mid f(x) \le f(x_1)\}$ is bounded, where $x_1$ is the starting point.

  2. (ii) In some neighborhood $V$ of $\Omega$, the objective function $f$ is continuously differentiable, and its gradient is Lipschitz continuous, i.e., there exists a constant $L > 0$ such that
    $\|g(x) - g(y)\| \le L \|x - y\|, \quad \text{for all } x, y \in V.$
    (3.1)
From Assumption (H), there exists a constant $\tilde{r} > 0$ such that
$\|g_k\| \le \tilde{r}, \quad \text{for all } k.$

The conclusion of the following lemma, often called the Zoutendijk condition, is usually used to prove the global convergence properties of conjugate gradient methods. It was originally established by Zoutendijk [15].

Lemma 3.1 Suppose Assumption (H) holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.1; then we have
$\displaystyle\sum_{k \ge 1} \dfrac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty.$
(3.2)
Lemma 3.2 Suppose Assumption (H) holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.1, and let there exist a constant $r > 0$ such that
$\|g_k\| \ge r, \quad \text{for all } k \ge 1.$
(3.3)
Then we have
$\displaystyle\sum_{k \ge 2} \|u_k - u_{k-1}\|^2 < +\infty, \qquad u_k = \dfrac{d_k}{\|d_k\|}.$

Proof This lemma can be proved in a similar way as in [16], so we omit it. □

Lemma 3.3 Suppose Assumption (H) holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.1, and let the sequence $\{g_k\}$ satisfy
$0 < r \le \|g_k\| \le \tilde{r}, \quad \text{for all } k \ge 1.$
(3.4)
Then the conjugate parameter $\beta_k^{\mathrm{VLS}}$ has Property (∗), i.e.,
  1. (1) there exists a constant $b > 1$ such that $|\beta_k^{\mathrm{VLS}}| \le b$;

  2. (2) there exists a constant $\tau > 0$ such that $\|x_k - x_{k-1}\| \le \tau$ implies $|\beta_k^{\mathrm{VLS}}| \le \dfrac{1}{2b}$.
Proof It follows from (2.1), (3.4), and (2.5) that
$|\beta_k^{\mathrm{VLS}}| = \left| \dfrac{g_k^T (g_k - t_k g_{k-1})}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}} \right| \le \dfrac{\|g_k\| \left( \|g_k\| + \frac{\tilde{r}}{r} \|g_{k-1}\| \right)}{\lambda |d_{k-1}^T g_{k-1}|} \le \dfrac{\|g_k\| \left( \|g_k\| + \frac{\tilde{r}}{r} \|g_{k-1}\| \right)}{(\lambda - 2\sigma) \|g_{k-1}\|^2} \le \dfrac{\tilde{r} \left( \tilde{r} + \frac{\tilde{r}^2}{r} \right)}{(\lambda - 2\sigma) r^2} = \dfrac{\tilde{r}^2 (r + \tilde{r})}{(\lambda - 2\sigma) r^3} = b.$
Define $\tau = \dfrac{(\lambda - 2\sigma) r^2}{4 L \tilde{r} b}$. Let $\|x_k - x_{k-1}\| \le \tau$; it then follows from Assumption (H)(ii) that
$|\beta_k^{\mathrm{VLS}}| = \left| \dfrac{g_k^T (g_k - t_k g_{k-1})}{-\lambda d_{k-1}^T g_{k-1} + (1 - \lambda) \max\{0, g_k^T d_{k-1}\}} \right| \le \dfrac{\|g_k\| \left( \|g_k - g_{k-1}\| + \|g_{k-1} - t_k g_{k-1}\| \right)}{\lambda |d_{k-1}^T g_{k-1}|} \le \dfrac{\tilde{r} \left( L\tau + \|g_{k-1} - t_k g_{k-1}\| \right)}{\lambda |d_{k-1}^T g_{k-1}|} \le \dfrac{\tilde{r} \left( L\tau + \big| \|g_k\| - \|g_{k-1}\| \big| \right)}{(\lambda - 2\sigma) \|g_{k-1}\|^2} \le \dfrac{\tilde{r} \left( L\tau + \|g_k - g_{k-1}\| \right)}{(\lambda - 2\sigma) \|g_{k-1}\|^2} \le \dfrac{2 L \tau \tilde{r}}{(\lambda - 2\sigma) r^2} = \dfrac{1}{2b}.$

 □

Lemma 3.4 Suppose Assumption (H) holds. Consider any method of the form (1.2)-(1.3), where $\beta_k \ge 0$ and where $\alpha_k$ satisfies the strong Wolfe line search. If $\beta_k$ has Property (∗), and (2.5) and (3.4) hold, then there exists a constant $\tau > 0$ such that, for any $\Delta \in \mathbb{Z}^+$ and any $k_0 \in \mathbb{Z}^+$, there is an index $k \ge k_0$ such that
$|\mathcal{K}_{k,\Delta}^{\tau}| > \dfrac{\Delta}{2},$

where $\mathcal{K}_{k,\Delta}^{\tau} \triangleq \{ i \in \mathbb{Z}^+ : k \le i \le k + \Delta - 1, \|x_i - x_{i-1}\| > \tau \}$ and $|\mathcal{K}_{k,\Delta}^{\tau}|$ denotes the number of elements of $\mathcal{K}_{k,\Delta}^{\tau}$.

Proof This lemma plays an important role in proving the global convergence of the PRP, HS, and LS conjugate gradient methods, among others. It was originally proved in [17]. From Remark 2.1 and Lemma 3.3, it is easy to see that Algorithm 2.1 satisfies the conditions of this lemma, so its conclusion holds for Algorithm 2.1. □

Theorem 3.1 Suppose Assumption (H) holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.1. If $\beta_k^{\mathrm{VLS}}$ has Property (∗) and (2.5) holds, then we obtain
$\liminf_{k \to +\infty} \|g_k\| = 0.$
(3.5)
Proof We argue by contradiction. Suppose that (3.5) does not hold, which means that there exists $r > 0$ such that
$\|g_k\| \ge r, \quad \text{for all } k \ge 1.$
(3.6)
We also define $u_k = \dfrac{d_k}{\|d_k\|}$; then for all $l, k \in \mathbb{Z}^+$ ($l \ge k$), we have
$x_l - x_{k-1} = \displaystyle\sum_{i=k}^{l} \|x_i - x_{i-1}\| u_{i-1} = \sum_{i=k}^{l} \|s_{i-1}\| u_{k-1} + \sum_{i=k}^{l} \|s_{i-1}\| (u_{i-1} - u_{k-1}),$
(3.7)

where $s_{i-1} = x_i - x_{i-1}$.

From Assumption (H), we know that there exists a constant ξ > 0 such that
$\|x\| \le \xi, \quad \text{for } x \in V.$
(3.8)
By (3.7), we have
$\displaystyle\sum_{i=k}^{l} \|s_{i-1}\| u_{k-1} = (x_l - x_{k-1}) - \sum_{i=k}^{l} \|s_{i-1}\| (u_{i-1} - u_{k-1}).$
(3.9)
Since (3.8) and (3.9) hold, we have
$\displaystyle\sum_{i=k}^{l} \|s_{i-1}\| \le 2\xi + \sum_{i=k}^{l} \|s_{i-1}\| \, \|u_{i-1} - u_{k-1}\|.$
(3.10)

Let τ be the constant from Lemma 3.4, and define $\Delta = [8\xi/\tau]$, where $8\xi/\tau \le \Delta < 8\xi/\tau + 1$ and $\Delta \in \mathbb{Z}^+$.

From Lemma 3.2, we know that there exists $k_0$ such that
$\displaystyle\sum_{i \ge k_0} \|u_{i+1} - u_i\|^2 \le \dfrac{1}{4\Delta}.$
(3.11)
From the Cauchy-Schwarz inequality and (3.11), letting $i \in [k, k + \Delta - 1]$, we have
$\|u_{i-1} - u_{k-1}\| \le \displaystyle\sum_{j=k}^{i-1} \|u_j - u_{j-1}\| \le (i - k)^{\frac{1}{2}} \left( \sum_{j=k}^{i-1} \|u_j - u_{j-1}\|^2 \right)^{\frac{1}{2}} \le \Delta^{\frac{1}{2}} \left( \dfrac{1}{4\Delta} \right)^{\frac{1}{2}} = \dfrac{1}{2}.$
(3.12)
From Lemma 3.4, we know that there exists $k \ge k_0$ such that
$|\mathcal{K}_{k,\Delta}^{\tau}| > \dfrac{\Delta}{2}.$
(3.13)
By (3.10), (3.12), and (3.13), we have
$2\xi \ge \dfrac{1}{2} \displaystyle\sum_{i=k}^{k+\Delta-1} \|s_{i-1}\| > \dfrac{\tau}{2} |\mathcal{K}_{k,\Delta}^{\tau}| > \dfrac{\tau \Delta}{4}.$
(3.14)
From (3.14), we have $\Delta < 8\xi/\tau$, which contradicts the definition of Δ. Therefore,
$\liminf_{k \to +\infty} \|g_k\| = 0.$

Thus we complete the proof of Theorem 3.1. □

4 Numerical results

In this section, we compare the performance of Algorithm 2.1 with those of the PRP+ method [18] and the CG-DESCENT method [4] in terms of the number of function evaluations and the CPU time in seconds, using the strong Wolfe line search. The test problems are large-scale unconstrained optimization problems from [19, 20]. The parameters in the line search are chosen as follows: $\delta = 0.01$, $\sigma = 0.1$. The program terminates when $\|g_k\| \le 10^{-6}$ is satisfied. All codes were written in Fortran 6.0 and run on a PC with a 2.0 GHz processor, 512 MB of memory, and the Windows XP operating system.

The numerical results are reported in Table 1. The first column 'Problems' gives the problem's name in [19, 20], and 'Dim' denotes the dimension of the test problem. The detailed numerical results are listed in the form NF/CPU, where NF and CPU denote the number of function evaluations and the CPU time in seconds, respectively.
Table 1

The numerical results of Algorithm 2.1, the PRP+ method, and the CG-DESCENT method (NF/CPU)

Problems | Dim | Algorithm 2.1 | PRP+ method | CG-DESCENT method
Extended Freudenstein & Roth | 5,000 | 58/0.07 | 1,144/0.09 | 13,235/1.24
Extended Freudenstein & Roth | 10,000 | 12/0.01 | 426/0.07 | 23/0.01
Extended Trigonometric | 5,000 | 30/0.07 | 61/0.08 | 79/0.10
Extended Trigonometric | 10,000 | 33/0.19 | 112/0.28 | 161/0.40
Extended Rosenbrock | 5,000 | 41/0.02 | 67/0.02 | 60/0.02
Extended Rosenbrock | 10,000 | 34/0.03 | 62/0.03 | 57/0.01
Extended White & Holst | 5,000 | 40/0.02 | 53/0.02 | 45/0.02
Extended White & Holst | 10,000 | 38/0.01 | 43/0.01 | 52/0.02
Extended Beale | 5,000 | 15/0.01 | 26/0.02 | 24/0.00
Extended Beale | 10,000 | 15/0.01 | 26/0.00 | 24/0.01
Extended Penalty | 5,000 | 11/0.01 | 51/0.00 | 1,979/0.17
Extended Penalty | 10,000 | 18/0.02 | 36/0.02 | 50/0.02
Perturbed Quadratic | 5,000 | 705/0.35 | 1,462/0.42 | 1,471/0.36
Perturbed Quadratic | 10,000 | 1,353/0.88 | 2,059/1.19 | 2,014/0.95
Raydan 2 | 5,000 | 9/0.00 | 9/0.00 | 9/0.00
Raydan 2 | 10,000 | 9/0.02 | 9/0.02 | 9/0.02
Diagonal 2 | 5,000 | 432/0.44 | 987/0.67 | 699/0.45
Diagonal 2 | 10,000 | 595/1.28 | 1,117/1.52 | 1,209/1.60
Generalized Tridiagonal 1 | 5,000 | 42/0.02 | 2,013/0.27 | 53/0.01
Generalized Tridiagonal 1 | 10,000 | 71/0.14 | 578/0.15 | 707/0.19
Extended Tridiagonal 1 | 5,000 | 12/0.00 | 23/0.00 | 28/0.00
Extended Tridiagonal 1 | 10,000 | 12/0.01 | 28/0.02 | 29/0.01
Extended Three Expo Terms | 5,000 | 8/0.01 | 21/0.01 | 15/0.02
Extended Three Expo Terms | 10,000 | 12/0.04 | 19/0.05 | 15/0.03
Generalized Tridiagonal 2 | 5,000 | 50/0.03 | 94/0.03 | 95/0.03
Generalized Tridiagonal 2 | 10,000 | 62/0.05 | 97/0.06 | 77/0.03
Diagonal 4 | 5,000 | 8/0.00 | 8/0.00 | 8/0.00
Diagonal 4 | 10,000 | 8/0.00 | 8/0.00 | 8/0.00
Diagonal 5 | 5,000 | 9/0.01 | 9/0.02 | 9/0.02
Diagonal 5 | 10,000 | 9/0.03 | 9/0.03 | 9/0.03
Extended Himmelblau | 5,000 | 18/0.00 | 35/0.00 | 16/0.00
Extended Himmelblau | 10,000 | 18/0.02 | 35/0.03 | 16/0.00
Generalized PSC1 | 5,000 | 1,886/3.20 | 633/0.60 | 17,679/14.59
Generalized PSC1 | 10,000 | 729/2.61 | 1,271/2.39 | 8,364/13.94
Extended PSC1 | 5,000 | 17/0.02 | 13/0.01 | 16/0.02
Extended PSC1 | 10,000 | 17/0.03 | 15/0.02 | 16/0.03
Extended Powell | 5,000 | 46/0.03 | 138/0.03 | 250/0.05
Extended Powell | 10,000 | 74/0.06 | 88/0.04 | 311/0.09
Extended Block-Diagonal BD1 | 5,000 | 55/0.02 | 40/0.01 | 33/0.01
Extended Block-Diagonal BD1 | 10,000 | 53/0.06 | 47/0.05 | 46/0.05
Extended Maratos | 5,000 | 71/0.03 | 132/0.03 | 103/0.02
Extended Maratos | 10,000 | 69/0.02 | 96/0.03 | 99/0.03
Quadratic Diagonal Perturbed | 5,000 | 793/0.22 | 880/0.22 | 2,111/0.39
Quadratic Diagonal Perturbed | 10,000 | 1,549/0.62 | 1,303/0.66 | 2,966/1.16
Extended Wood | 5,000 | 85/0.02 | 57/0.01 | 135/0.01
Extended Wood | 10,000 | 84/0.04 | 65/0.03 | 116/0.05
Extended Hiebert | 5,000 | 176/0.04 | 137/0.03 | 120/0.03
Extended Hiebert | 10,000 | 173/0.05 | 137/0.05 | 114/0.03
Quadratic QF1 | 5,000 | 1,222/0.25 | 1,854/0.50 | 1,397/0.30
Quadratic QF1 | 10,000 | 1,396/0.82 | 1,864/0.97 | 2,545/1.06
Extended Quadratic Penalty QP2 | 5,000 | 45/0.05 | 80/0.06 | 76/0.06
Extended Quadratic Penalty QP2 | 10,000 | 43/0.11 | 71/0.13 | 84/0.14
Quadratic QF2 | 5,000 | 1,167/0.40 | 1,620/0.44 | 1,613/0.36
Quadratic QF2 | 10,000 | 1,430/1.14 | 2,625/1.45 | 2,941/1.31
Extended EP1 | 5,000 | 6/0.00 | 6/0.00 | 6/0.00
Extended EP1 | 10,000 | 413/0.21 | 513/0.25 | 439/0.22
Extended Tridiagonal 2 | 5,000 | 875/0.23 | 2,436/0.27 | 857/0.11
Extended Tridiagonal 2 | 10,000 | 5,139/1.17 | 5,857/1.29 | 6,569/1.47
ARWHEAD | 5,000 | 16/0.00 | 16/0.00 | 32/0.01
ARWHEAD | 10,000 | 11/0.00 | 14/0.00 | 11/0.00
NONDIA | 5,000 | 15/0.00 | 15/0.00 | 17/0.00
NONDIA | 10,000 | 15/0.02 | 16/0.00 | 17/0.02
DQDRTIC | 5,000 | 17/0.00 | 19/0.01 | 21/0.00
DQDRTIC | 10,000 | 26/0.01 | 25/0.00 | 23/0.01
DIXMAANA | 5,000 | 11/0.02 | 12/0.02 | 16/0.00
DIXMAANA | 10,000 | 11/0.01 | 12/0.01 | 14/0.02
DIXMAANB | 5,000 | 21/0.01 | 20/0.02 | 23/0.00
DIXMAANB | 10,000 | 21/0.02 | 20/0.02 | 23/0.03
DIXMAANC | 5,000 | 22/0.01 | 24/0.01 | 28/0.02
DIXMAANC | 10,000 | 22/0.01 | 25/0.02 | 28/0.01
Broyden Tridiagonal | 5,000 | 77/0.02 | 125/0.03 | 132/0.03
Broyden Tridiagonal | 10,000 | 76/0.04 | 121/0.08 | 110/0.05
Almost Perturbed Quadratic | 5,000 | 1,156/0.30 | 1,479/0.39 | 1,448/0.31
Almost Perturbed Quadratic | 10,000 | 1,906/0.95 | 2,198/1.19 | 2,149/0.92
Tridiagonal Perturbed Quadratic | 5,000 | 1,489/0.37 | 1,783/0.53 | 1,562/0.41
Tridiagonal Perturbed Quadratic | 10,000 | 1,140/1.15 | 1,879/1.11 | 2,477/1.25
EDENSCH | 5,000 | 325/0.04 | 362/0.05 | 2,157/0.33
EDENSCH | 10,000 | 1,106/0.26 | 1,492/0.45 | 1,594/0.47
VARDIM | 5,000 | 34/0.00 | 46/0.02 | 46/0.00
VARDIM | 10,000 | 47/0.02 | 52/0.03 | 52/0.03
LIARWHD | 5,000 | 24/0.01 | 29/0.02 | 34/0.00
LIARWHD | 10,000 | 28/0.01 | 38/0.01 | 41/0.02
Diagonal 6 | 5,000 | 9/0.00 | 9/0.00 | 9/0.00
Diagonal 6 | 10,000 | 9/0.01 | 9/0.02 | 9/0.01
DIXMAANG | 5,000 | 1,186/0.42 | 715/0.39 | 660/0.33
DIXMAANG | 10,000 | 2,261/1.84 | 1,275/1.42 | 1,042/1.07
DIXMAANI | 5,000 | 524/0.38 | 850/0.47 | 702/0.36
DIXMAANI | 10,000 | 658/1.15 | 1,266/1.37 | 985/1.00
DIXMAANJ | 5,000 | 770/0.35 | 829/0.44 | 770/0.38
DIXMAANJ | 10,000 | 1,823/1.19 | 1,108/1.14 | 1,108/1.09
DIXMAANK | 5,000 | 1,510/0.52 | 715/0.41 | 812/0.41
DIXMAANK | 10,000 | 3,658/2.03 | 963/1.09 | 1,303/1.31
ENGVAL1 | 5,000 | 5,277/0.55 | 7,474/0.86 | 6,436/0.69
ENGVAL1 | 10,000 | 7,380/1.72 | 7,196/1.62 | 21,402/4.56
COSINE | 5,000 | 30/0.02 | 33/0.02 | 31/0.03
COSINE | 10,000 | 30/0.03 | 34/0.04 | 31/0.03
DENSCHNB | 5,000 | 10/0.01 | 14/0.02 | 13/0.02
DENSCHNB | 10,000 | 10/0.00 | 15/0.00 | 13/0.00
DENSCHNF | 5,000 | 19/0.01 | 39/0.01 | 35/0.01
DENSCHNF | 10,000 | 19/0.02 | 29/0.02 | 31/0.02
SINQUAD | 5,000 | 515/0.51 | 976/0.83 | 566/0.47
SINQUAD | 10,000 | 2,011/2.61 | 2,116/3.59 | 5,989/10.08

We say that, for a particular problem, the performance of the M1 method was better than that of the M2 method if the CPU time, or the number of function evaluations, of the M1 method was smaller than that of the M2 method. In order to assess the overall performance, we apply the performance profiles of Dolan and Moré [21] to the CPU time. Since some CPU times in Table 1 are zero, for a comprehensive evaluation of the methods in CPU time we take the average CPU time of each method, denoted av(M1) and av(M2), and add this average value to the CPU time of each problem. According to their description, the top curve corresponds to the method that solved the most problems in a time that was within a factor τ of the best time; see Figure 1 and Figure 2. Using the same approach, we also compare the methods in terms of the number of function evaluations; see Figure 3 and Figure 4.
Figure 1

Performance profiles with respect to CPU time in seconds.

Figure 2

Performance profiles with respect to CPU time in seconds.

Figure 3

Performance profiles with respect to the numbers of iterations.

Figure 4

Performance profiles with respect to the numbers of iterations.
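To indicate how the Dolan-Moré performance profiles shown in the figures could be generated from Table 1, here is a small Python sketch (a hypothetical helper, not the code actually used for the figures); `data` is an array of CPU times or function-evaluation counts with one row per problem and one column per method, already shifted by the average value as described above.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(data, labels, tau_max=10.0):
    """Dolan-More performance profile: for each method, plot the fraction of
    problems it solved within a factor tau of the best method on that problem."""
    data = np.asarray(data, dtype=float)         # shape (n_problems, n_methods)
    best = data.min(axis=1, keepdims=True)       # best value on each problem
    ratios = data / best                         # performance ratios r_{p,s}
    taus = np.linspace(1.0, tau_max, 200)
    for s, name in enumerate(labels):
        rho = [(ratios[:, s] <= t).mean() for t in taus]
        plt.step(taus, rho, where="post", label=name)
    plt.xlabel("factor tau of the best value")
    plt.ylabel("fraction of problems solved")
    plt.legend()
    plt.show()
```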

Clearly, Algorithm 2.1 is competitive with the PRP+ method and the CG-DESCENT method in terms of the number of function evaluations and CPU time. Thus, Algorithm 2.1 deserves further study.

Declarations

Acknowledgements

The author wishes to express heartfelt thanks to the anonymous referees and the editor for their detailed and helpful suggestions for revising the manuscript.

Authors’ Affiliations

(1)
College of General Education, Chongqing College of Electronic Engineering

References

  1. Polak E, Ribière G: Note sur la convergence de méthodes de directions conjuguées. Rev. Fr. Inform. Rech. Oper. 1969, 3(16): 35–43.
  2. Polyak BT: The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 1969, 9: 94–112. 10.1016/0041-5553(69)90035-4
  3. Liu Y, Storey C: Efficient generalized conjugate gradient algorithms. Part 1: theory. J. Optim. Theory Appl. 1992, 69: 129–137.
  4. Hager WW, Zhang H: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 2005, 16: 170–192. 10.1137/030601880
  5. Powell MJD: Nonconvex minimization calculations and the conjugate gradient method. Lecture Notes in Mathematics 1066. In Numerical Analysis. Springer, Berlin; 1984: 122–141.
  6. Hager WW, Zhang H: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2006, 2: 35–58.
  7. Zhang L, Zhou W, Li DH: A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence. IMA J. Numer. Anal. 2006, 26: 629–640. 10.1093/imanum/drl016
  8. Li ZF, Chen J, Deng NY: A new conjugate gradient method and its global convergence properties. Math. Program. 1997, 78: 375–391.
  9. Liu J: Convergence properties of a class of nonlinear conjugate gradient methods. Comput. Oper. Res. 2013, 40: 2656–2661. 10.1016/j.cor.2013.05.013
  10. Liu J, Du X: Global convergence of a modified LS method. Math. Probl. Eng. 2012, 2012: Article ID 910303
  11. Li M, Chen Y, Qu A-P: Global convergence of a modified Liu-Storey conjugate gradient method. U.P.B. Sci. Bull., Ser. A 2012, 74: 11–26.
  12. Tang C, Wei Z, Li G: A new version of the Liu-Storey conjugate gradient method. Appl. Math. Comput. 2007, 189: 302–313. 10.1016/j.amc.2006.11.098
  13. Liu J, Du X, Wang K: Convergence of descent methods with variable parameters. Acta Math. Appl. Sin. 2010, 33: 222–230. (in Chinese)
  14. Wei Z, Yao S, Liu L: The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 2006, 183: 1341–1350. 10.1016/j.amc.2006.05.150
  15. Zoutendijk G: Nonlinear programming, computational methods. In Integer and Nonlinear Programming. Edited by: Abadie J. North-Holland, Amsterdam; 1970: 37–86.
  16. Li ZF, Chen J, Deng NY: Convergence properties of conjugate gradient methods with Goldstein line searches. J. China Agric. Univ. 1996, I(4): 15–18.
  17. Dai YH, Yuan Y: Nonlinear Conjugate Gradient Method. Shanghai Scientific & Technical Publishers, Shanghai; 2000. (in Chinese)
  18. Powell MJD: Convergence properties of algorithms for nonlinear optimization. SIAM Rev. 1986, 28: 487–500. 10.1137/1028154
  19. Bongartz I, Conn AR, Gould NIM, Toint PL: CUTE: constrained and unconstrained testing environments. ACM Trans. Math. Softw. 1995, 21: 123–160. 10.1145/200979.201043
  20. Andrei N: An unconstrained optimization test functions collection. Adv. Model. Optim. 2008, 10: 147–161.
  21. Dolan ED, Moré JJ: Benchmarking optimization software with performance profiles. Math. Program. 2002, 91: 201–213. 10.1007/s101070100263

Copyright

© Wu; licensee Springer. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.