Two efficient modifications of AZPRP conjugate gradient method with sufficient descent property

Abstract

The conjugate gradient (CG) method is applied in many fields, such as neural networks, image restoration, machine learning, and deep learning. The Polak–Ribière–Polyak (PRP) and Hestenes–Stiefel (HS) conjugate gradient methods are considered among the most efficient methods for solving nonlinear optimization problems. However, neither method is guaranteed to satisfy the descent property or the global convergence property for general nonlinear functions. In this paper, we present two new modifications of the PRP method with restart conditions. The proposed conjugate gradient methods satisfy the global convergence property and the descent property for general nonlinear functions. The numerical results show that the new modifications are more efficient than recent CG methods in terms of number of iterations, number of function evaluations, number of gradient evaluations, and CPU time.

1 Introduction

We consider the following form for the unconstrained optimization problem:

$$\min \bigl\{ f(x) |x \in R^{n}\bigr\} ,$$
(1.1)

where $$f:R^{n} \to R$$ is a continuously differentiable function whose gradient is denoted by $$g(x) = \nabla f(x)$$. To solve (1.1) with the CG method, we start from an initial point $$x_{0} \in R^{n}$$ and generate the iterates

$$x_{k + 1} = x_{k} + \alpha _{k}d_{k}, \quad k = 0,1,2,\ldots,$$
(1.2)

where $$\alpha _{k} > 0$$ is the step size obtained by some line search. The search direction $$d_{k}$$ is defined by

$$d_{k} = \textstyle\begin{cases} - g_{k},& k = 0, \\ - g_{k} + \beta _{k}d_{k - 1},&k \ge 1, \end{cases}$$
(1.3)

where $$g_{k} = g(x_{k})$$ and $$\beta _{k}$$ is known as the conjugate gradient parameter. To obtain the step length $$\alpha _{k}$$, one of the following two line searches is used:

1. Exact line search

$$f(x_{k} + \alpha _{k}d_{k}) = \min f(x_{k} + \alpha d_{k}),\quad \alpha \ge 0.$$
(1.4)

However, (1.4) is computationally expensive if the function has many local minima.

2. Inexact line search

To overcome the cost of using exact line search and obtain steps that are neither too long nor too short, we usually use inexact line search, in particular weak Wolfe–Powell (WWP) line search [1, 2] given as follows:

\begin{aligned}& f(x_{k} + \alpha _{k}d_{k}) \le f(x_{k}) + \delta \alpha _{k}g_{k}^{T}d_{k}, \end{aligned}
(1.5)
\begin{aligned}& g(x_{k} + \alpha _{k}d_{k})^{T}d_{k} \ge \sigma g_{k}^{T}d_{k}. \end{aligned}
(1.6)

The strong version of the Wolfe–Powell (SWP) line search is given by (1.5) and

$$\bigl\vert g(x_{k} + \alpha _{k}d_{k})^{T}d_{k} \bigr\vert \le \sigma \bigl\vert g_{k}^{T}d_{k} \bigr\vert ,$$
(1.7)

where $$0 < \delta < \sigma < 1$$.
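The conditions (1.5)–(1.7) can be checked directly for a trial step. The following is a minimal sketch, not the line search used in the experiments; the helper name `wolfe_conditions` is an illustrative assumption, and the default values of δ and σ are the ones reported in the numerical section.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def wolfe_conditions(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    """Return (armijo, weak_curvature, strong_curvature) for a trial step alpha.

    f and grad are callables taking a list of floats; x is the current point
    and d the search direction. The three flags correspond to (1.5), (1.6),
    and (1.7), respectively.
    """
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope_old = dot(grad(x), d)       # g_k^T d_k, negative for a descent direction
    slope_new = dot(grad(x_new), d)   # g(x_k + alpha d_k)^T d_k
    armijo = f(x_new) <= f(x) + delta * alpha * slope_old   # (1.5)
    weak = slope_new >= sigma * slope_old                   # (1.6)
    strong = abs(slope_new) <= sigma * abs(slope_old)       # (1.7)
    return armijo, weak, strong
```

For the one-dimensional function f(x) = x² at x = 1 with d = −g = −2, the step α = 0.5 satisfies all three conditions, α = 0.01 satisfies (1.5) but fails both curvature conditions because the step is too short, and α = 0.9 satisfies (1.5) and (1.6) but not (1.7).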

The descent condition (downhill condition) plays an important role in the CG method. It is given by

$$g_{k}^{T}d_{k} < 0.$$
(1.8)

Al-Baali [10] extended (1.8) to the following form:

$$g_{k}^{T}d_{k} \le - c \Vert g_{k} \Vert ^{2},\quad k \ge 0\text{ and }c > 0,$$
(1.9)

called the sufficient descent condition.

The steepest descent method is the simplest gradient method for minimizing functions of n variables. From a current trial point $$x_{1}$$, one expects to find a point closer to a minimum of $$f(x)$$ by moving away from $$x_{1}$$ along the direction in which $$f(x)$$ decreases most rapidly, so that $$f(x_{1}) > f(x_{2}) > f(x_{3}) > \cdots$$. This direction of steepest descent is the negative gradient, $$- g_{k}$$. For a function of two variables, the minimum point can be located using contour lines. For example, Fig. 1 shows contour lines of the Booth function in two dimensions.

As shown in Fig. 2, the gradient $$f'(x)$$ is orthogonal to the contour lines, and for every x the gradient points in the direction of steepest increase of $$f(x)$$. Fig. 2 plots the gradient, the contours, and the Booth function, which clearly shows the minimum of the function on both the surface and the contour plot. Despite the robustness of the steepest descent method, it is inefficient in terms of CPU time for large-dimensional functions. Using the CG method avoids orthogonality between the negative gradient and the search direction. Figure 3 shows the angle $$\theta _{k}$$ between $$- g_{k}$$ and $$d_{k}$$ in the CG method, given by

$$\cos (\theta _{k}) = \biggl( - \frac{d_{k}^{T}g_{k}}{ \Vert d_{k} \Vert \Vert g_{k} \Vert } \biggr).$$
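As a small illustration of the formula above, the cosine can be evaluated for any pair of vectors. The function name `cos_angle` is an assumption made for this sketch.

```python
import math


def cos_angle(d, g):
    """Cosine of the angle between the search direction d and -g."""
    inner = sum(di * gi for di, gi in zip(d, g))
    norm_d = math.sqrt(sum(di * di for di in d))
    norm_g = math.sqrt(sum(gi * gi for gi in g))
    return -inner / (norm_d * norm_g)
```

The steepest descent direction d = −g gives a cosine of 1, while a direction orthogonal to the gradient gives 0, which is the situation a CG method tries to stay away from.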

The most famous classical formulas of CG methods are Hestenes–Stiefel (HS) [3], Polak–Ribière–Polyak (PRP) [4], Liu and Storey (LS) [5], Fletcher–Reeves (FR) [6], Fletcher (CD) [7], and Dai and Yuan (DY) [8], given as follows:

\begin{aligned}& \beta _{k}^{\mathrm{HS}} = \frac{g_{k}^{T}y_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}, \qquad \beta _{k}^{\mathrm{PRP}} = \frac{g_{k}^{T}y_{k - 1}}{ \Vert g_{k - 1} \Vert ^{2}}, \qquad \beta _{k}^{\mathrm{LS}} = - \frac{g_{k}^{T}y_{k - 1}}{d_{k - 1}^{T}g_{k - 1}}, \\& \beta _{k}^{\mathrm{FR}} = \frac{ \Vert g_{k} \Vert ^{2}}{ \Vert g_{k - 1} \Vert ^{2}}, \qquad \beta _{k}^{\mathrm{CD}} = - \frac{ \Vert g_{k} \Vert ^{2}}{d_{k - 1}^{T}g_{k - 1}}, \qquad \beta _{k}^{\mathrm{DY}} = \frac{ \Vert g_{k} \Vert ^{2}}{d_{k - 1}^{T}y_{k - 1}}, \end{aligned}

where $$y_{k - 1} = g_{k} - g_{k - 1}$$.

These formulas coincide when $$f$$ is a strictly convex quadratic function and the exact line search (1.4) is used. Indeed, the exact line search gives $$g_{k}^{T}d_{k - 1} = 0$$, which implies $$g_{k}^{T}d_{k} = - \Vert g_{k} \Vert ^{2}$$ by (1.3). In addition, if the function is quadratic, then $$g_{k}^{T}g_{k - 1} = 0$$.
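The agreement of the six classical formulas can be observed numerically on the Booth function, which is a convex quadratic with minimizer (1, 3). The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the exact step uses the closed form α = −gᵀd / dᵀHd that is valid for quadratics, and DY is taken with the denominator $$d_{k - 1}^{T}y_{k - 1}$$ as in Dai and Yuan [8].

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def classical_betas(g, g_prev, d_prev):
    """Evaluate the six classical CG parameters for one iteration."""
    y = [a - b for a, b in zip(g, g_prev)]          # y_{k-1} = g_k - g_{k-1}
    gg, gp = dot(g, g), dot(g_prev, g_prev)
    gy, dy, dg = dot(g, y), dot(d_prev, y), dot(d_prev, g_prev)
    return {"HS": gy / dy, "PRP": gy / gp, "LS": -gy / dg,
            "FR": gg / gp, "CD": -gg / dg, "DY": gg / dy}


# Booth function: f(x, y) = (x + 2y - 7)^2 + (2x + y - 5)^2 (constant Hessian).
HESSIAN = [[10.0, 8.0], [8.0, 10.0]]


def booth_gradient(x):
    return [10.0 * x[0] + 8.0 * x[1] - 34.0, 8.0 * x[0] + 10.0 * x[1] - 38.0]


def exact_step(x, g, d):
    """Move to the exact minimizer of the quadratic along direction d."""
    hd = [dot(row, d) for row in HESSIAN]
    alpha = -dot(g, d) / dot(d, hd)
    return [xi + alpha * di for xi, di in zip(x, d)]


def cg_on_booth(x0):
    """Run two CG iterations; return the final point and the betas at k = 1."""
    x = list(x0)
    g = booth_gradient(x)
    d = [-gi for gi in g]                            # d_0 = -g_0
    x = exact_step(x, g, d)
    g_new = booth_gradient(x)
    betas = classical_betas(g_new, g, d)
    d = [-gn + betas["PRP"] * di for gn, di in zip(g_new, d)]
    x = exact_step(x, g_new, d)
    return x, betas
```

Starting from (0, 0), the six values returned in `betas` agree up to rounding error, and the second iterate is the minimizer (1, 3) up to rounding error, as expected for a two-dimensional quadratic.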

The global convergence properties were studied by Zoutendijk [9] and Al-Baali [10]. The global convergence of the PRP method for a convex objective function under exact line search was proved by Polak and Ribière in [4]. Later, Powell [11] gave a counterexample with a nonconvex function on which PRP and HS can cycle infinitely without reaching a solution. Powell emphasized that, to achieve global convergence of the PRP and HS methods, the parameter $$\beta _{k}$$ should not be negative. Moreover, Gilbert and Nocedal [12] proved that nonnegative PRP, i.e., with $$\beta _{k} = \max \{ \beta _{k}^{\mathrm{PRP}}, 0 \}$$, is globally convergent under complicated line searches.

If the function is quadratic and the step size is obtained by the exact line search (1.4), the CG method satisfies the conjugacy condition $$d_{i}^{T}Hd_{j} = 0$$, $$\forall i \ne j$$, where H denotes the Hessian of the quadratic. Using the mean value theorem and exact line search with equation (1.3), we can obtain $$\beta _{k}^{\mathrm{HS}}$$. Motivated by quasi-Newton methods such as the BFGS method and the limited memory BFGS (LBFGS) method, and using equation (1.3), Dai and Liao [13] proposed the following conjugacy condition:

$$d_{k}^{T}y_{k - 1} = - tg_{k}^{T}s_{k - 1},$$
(1.10)

where $$s_{k - 1} = x_{k} - x_{k - 1}$$, and $$t \ge 0$$. In the case of $$t = 0$$, equation (1.10) becomes the classical conjugacy condition. By using (1.3) and (1.10), [13] proposed the following CG formula:

$$\beta _{k}^{\mathrm{DL}} = \frac{g_{k}^{T}y_{k - 1}}{d_{k - 1}^{T}y_{k - 1}} - t \frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}.$$
(1.11)
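For completeness, formula (1.11) follows in one step. Taking the inner product of (1.3) with $$y_{k - 1}$$ and imposing (1.10) gives, provided $$d_{k - 1}^{T}y_{k - 1} \ne 0$$,

$$d_{k}^{T}y_{k - 1} = - g_{k}^{T}y_{k - 1} + \beta _{k}d_{k - 1}^{T}y_{k - 1} = - tg_{k}^{T}s_{k - 1} \quad \Longrightarrow \quad \beta _{k} = \frac{g_{k}^{T}y_{k - 1} - tg_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}.$$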

However, $$\beta _{k}^{\mathrm{DL}}$$ faces the same problem as $$\beta _{k}^{\mathrm{PRP}}$$ and $$\beta _{k}^{\mathrm{HS}}$$, i.e., $$\beta _{k}^{\mathrm{DL}}$$ is not nonnegative in general. Thus, Dai and Liao [13] replaced equation (1.11) by

$$\beta _{k}^{\mathrm{DL} +} = \max \bigl\{ \beta _{k}^{\mathrm{HS}},0\bigr\} - t\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}.$$
(1.12)

Moreover, Hager and Zhang [14, 15] presented a modified CG parameter that satisfies the descent property $$g_{k}^{T}d_{k} \le - (7/8) \Vert g_{k} \Vert ^{2}$$ for any inexact line search. This version of the CG method is globally convergent whenever the line search satisfies the Wolfe–Powell (WP) conditions. The formula is given as follows:

$$\beta _{k}^{\mathrm{HZ}} = \max \bigl\{ \beta _{k}^{N}, \eta _{k}\bigr\} ,$$
(1.13)

where $$\beta _{k}^{N} = \frac{1}{d_{k}^{T}y_{k}}(y_{k} - 2d_{k}\frac{ \Vert y_{k} \Vert ^{2}}{d_{k}^{T}y_{k}})^{T}g_{k}$$, $$\eta _{k} = - \frac{1}{ \Vert d_{k} \Vert \ \min \{ \eta , \Vert g_{k} \Vert \}}$$, and $$\eta > 0$$ is a constant.

Note that if $$t = 2\frac{ \Vert y_{k} \Vert ^{2}}{s_{k}^{T}y_{k}}$$, then $$\beta _{k}^{N} = \beta _{k}^{\mathrm{DL}}$$.

In 2006, Wei et al. [16] proposed a new nonnegative CG method, quite similar to the original PRP method, that is globally convergent under both exact and inexact line searches:

$$\beta _{k}^{\mathrm{WYL}} = \frac{g_{k}^{T}(g_{k} - \frac{ \Vert g_{k} \Vert }{ \Vert g_{k - 1} \Vert }g_{k - 1})}{ \Vert g_{k - 1} \Vert ^{2}},$$

Many modifications of the WYL method have since appeared, such as the following [17]:

$$\beta _{k}^{\mathrm{DPRP}} = \frac{ \Vert g_{k} \Vert ^{2} - \frac{ \Vert g_{k} \Vert }{ \Vert g_{k - 1} \Vert } \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}},\quad m \ge 1 \text{ [11]}$$

and

$$\beta _{k}^{\mathrm{DHS}} = \frac{ \Vert g_{k} \Vert ^{2} - \frac{ \Vert g_{k} \Vert }{ \Vert g_{k - 1} \Vert }g_{k}^{T}g_{k - 1}}{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}},\quad \text{where } m > 1.$$

Alhawarat et al. [18] constructed the following CG method with a new restart criterion:

$$\beta _{k}^{\mathrm{AZPRP}} = \textstyle\begin{cases} \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}},& \Vert g_{k} \Vert ^{2} > \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert , \\ 0, &\text{otherwise}, \end{cases}$$

where $$\mu _{k} = \frac{ \Vert s_{k} \Vert }{ \Vert y_{k} \Vert }$$, $$s_{k} = x_{k} - x_{k - 1}$$, $$y_{k} = g_{k} - g_{k - 1}$$, and $$\Vert \cdot \Vert$$ denotes the Euclidean norm.

Besides, Kaelo et al. [19] proposed the following CG formula:

$$\beta _{k}^{\mathrm{PKT}} = \textstyle\begin{cases} \frac{ \Vert g_{k} \Vert ^{2} - g_{k}^{T}g_{k - 1}}{\max \{ d_{k - 1}^{T}y_{k - 1}, - g_{k - 1}^{T}d_{k - 1}\}},& \text{if } 0 < g_{k}^{T}g_{k - 1} < \Vert g_{k} \Vert ^{2}, \\ \frac{ \Vert g_{k} \Vert ^{2}}{\max \{ d_{k - 1}^{T}y_{k - 1}, - g_{k - 1}^{T}d_{k - 1}\}}, &\text{otherwise}. \end{cases}$$

2 Motivation and the new restarted formula

To improve the efficiency of $$\beta _{k}^{\mathrm{AZPRP}}$$ in terms of function evaluation, gradient evaluation, number of iterations, and CPU time, we construct two new CG methods based on $$\beta _{k}^{\mathrm{AZPRP}}$$, $$\beta _{k}^{\mathrm{DPRP}}$$, and $$\beta _{k}^{\mathrm{DHS}}$$ as follows:

$$\beta _{k}^{A1} = \textstyle\begin{cases} \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}},& \text{if } \Vert g_{k} \Vert ^{2} > \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert , \\ - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}},& \text{otherwise}, \end{cases}$$
(2.1)

where

$$\mu _{k} = \frac{ \Vert s_{k - 1} \Vert }{ \Vert y_{k - 1} \Vert }.$$
(2.2)

The second modification is given as follows:

$$\beta _{k}^{A2} = \textstyle\begin{cases} \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}}, &\text{if } \Vert g_{k} \Vert ^{2} > \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert , \\ - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}},& \text{otherwise}. \end{cases}$$
(2.3)
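A direct transcription of (2.1)–(2.3) may help when implementing the two parameters. This is a minimal sketch rather than the modified CG-Descent code used in the experiments; the function name `beta_a1_a2` and the default m = 2 are assumptions, since no value of m is fixed above, and no safeguard against a zero denominator is included.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def beta_a1_a2(g, g_prev, d_prev, s_prev, m=2.0):
    """Return (beta_A1, beta_A2) as defined in (2.1) and (2.3).

    g, g_prev: gradients g_k and g_{k-1}; d_prev: direction d_{k-1};
    s_prev: step s_{k-1} = x_k - x_{k-1}; m: assumed damping constant.
    """
    y_prev = [a - b for a, b in zip(g, g_prev)]       # y_{k-1}
    mu = norm(s_prev) / norm(y_prev)                  # (2.2)
    numerator = dot(g, g) - mu * abs(dot(g, g_prev))
    dy = dot(d_prev, y_prev)
    if numerator > 0.0:
        damping = m * abs(dot(g, d_prev))
        beta_a1 = numerator / (damping + dot(g_prev, g_prev))
        beta_a2 = numerator / (damping + dy)
        return beta_a1, beta_a2
    # Restart branch, identical for both formulas.
    restart = -mu * dot(g, s_prev) / dy
    return restart, restart
```

The two formulas share the numerator and the restart branch, so they are computed together here; they differ only in the second term of the denominator.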

3 The global convergence properties

Assumption 1

I. $$f(x)$$ is bounded from below on the level set $$\Omega = \{ x \in R^{n}:f(x) \le f(x_{1})\}$$, where $$x_{1}$$ is the starting point.

II. In some neighborhood N of Ω, f is continuously differentiable, and its gradient is Lipschitz continuous. That is, there exists a constant $$L > 0$$ such that, for any $$x,y \in N$$,

$$\bigl\Vert g(x) - g(y) \bigr\Vert \le L \Vert x - y \Vert .$$

The following is considered one of the most important lemmas used to prove the global convergence properties. For more details, the reader can refer to [9].

Lemma 3.1

Suppose Assumption 1 holds. Consider the CG method of the form (1.2) and (1.3), where the search direction satisfies the sufficient descent condition and $$\alpha _{k}$$ is obtained by the standard WWP line search. Then

$$\sum_{k = 0}^{\infty } \frac{(g_{k}^{T}d_{k})^{2}}{ \Vert d_{k} \Vert ^{2}} < \infty ,$$
(3.1)

where (3.1) is known as the Zoutendijk condition. Inequality (3.1) also holds for the exact line search, the Armijo-Goldstein line search, and the SWP line search.

Substituting (1.9) into (3.1) yields

$$\sum_{k = 0}^{\infty } \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} < \infty .$$
(3.2)
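Explicitly, since (1.9) gives $$g_{k}^{T}d_{k} \le - c \Vert g_{k} \Vert ^{2} < 0$$, we have $$(g_{k}^{T}d_{k})^{2} \ge c^{2} \Vert g_{k} \Vert ^{4}$$, and therefore

$$\sum_{k = 0}^{\infty } \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} \le \frac{1}{c^{2}}\sum_{k = 0}^{\infty } \frac{(g_{k}^{T}d_{k})^{2}}{ \Vert d_{k} \Vert ^{2}} < \infty .$$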

Gilbert and Nocedal [12] presented an important theorem, summarized here as Theorem 3.1, for establishing the global convergence of nonnegative PRP and other nonnegative methods. Furthermore, they introduced a useful property, called Property*, as follows:

Property*

Consider a method of the form (1.2) and (1.3), and suppose $$0 < \gamma \le \Vert g_{k} \Vert \le \bar{\gamma }$$. We say that the method possesses Property* if there exist constants $$b > 1$$ and $$\lambda > 0$$ such that for all $$k \ge 1$$ we have $$\vert \beta _{k} \vert \le b$$, and if $$\Vert x_{k} - x_{k - 1} \Vert \le \lambda$$, then

$$\vert \beta _{k} \vert \le \frac{1}{2b}.$$

The following theorem, given in [12], plays a crucial role in the convergence analysis of the CG method.

Theorem 3.1

Considering any CG method of the form (1.2) and (1.3), suppose the following conditions hold:

I. $$\beta _{k} \ge 0$$.

II. The sufficient descent condition is satisfied.

III. The Zoutendijk condition holds.

IV. Property* is true.

V. Assumption 1 is satisfied.

Then, the iterates are globally convergent, i.e., $$\lim_{k \to \infty } \Vert g_{k} \Vert = 0$$.

Theorem 3.2

Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.1) with $$m > 1$$, where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then the sufficient descent condition (1.9) holds with $$c = 1 - \frac{1}{m}$$. Indeed, multiplying (1.3) by $$g_{k}^{T}$$ yields

\begin{aligned} g_{k}^{T}d_{k} &= - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} {g}_{k}^{T}d_{k - 1} \\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \bigl\vert {g}_{k}^{T}d_{k - 1} \bigr\vert \\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert } \bigl\vert {g}_{k}^{T}d_{k - 1} \bigr\vert \\ &= \Vert g_{k} \Vert ^{2} \biggl( - 1 + \frac{1}{m} \biggr). \end{aligned}
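The bound above can be sanity-checked numerically on random data. The sketch below is an illustration only and does not replace the proof: the function name, the sampling ranges, the value m = 2, and the hypothetical previous step length 0.1 are all assumptions. Only samples that fall in the first branch of (2.1) are tested.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def check_sufficient_descent(trials=1000, n=5, m=2.0, seed=0):
    """Return (worst ratio g^T d / ||g||^2, number of samples checked)."""
    rng = random.Random(seed)
    worst = -math.inf
    checked = 0
    for _ in range(trials):
        g = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        g_prev = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        d_prev = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        s_prev = [0.1 * di for di in d_prev]   # assumed step alpha_{k-1} = 0.1
        y_prev = [a - b for a, b in zip(g, g_prev)]
        mu = math.sqrt(dot(s_prev, s_prev)) / math.sqrt(dot(y_prev, y_prev))
        numerator = dot(g, g) - mu * abs(dot(g, g_prev))
        if numerator <= 0.0:
            continue                            # restart branch, not covered here
        beta = numerator / (m * abs(dot(g, d_prev)) + dot(g_prev, g_prev))
        d = [-gi + beta * di for gi, di in zip(g, d_prev)]
        worst = max(worst, dot(g, d) / dot(g, g))
        checked += 1
    return worst, checked
```

With m = 2, the worst ratio returned should not exceed −(1 − 1/m) = −0.5, in agreement with the final line of the derivation.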

Theorem 3.3

Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.1), where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then $$\beta _{k}^{A1}$$ satisfies Property*.

Proof

Let $$\lambda = \frac{\gamma ^{2}}{2L(L + 1)\bar{\gamma } b}$$ and

$$\beta _{k}^{A1} = \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ^{2} + \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ( \Vert g_{k} \Vert + \Vert g_{k - 1} \Vert )}{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{2\bar{\gamma }^{2}}{\gamma ^{2}} = b > 1.$$

To show that $$\vert \beta _{k}^{A1} \vert \le \frac{1}{2b}$$ whenever $$\Vert s_{k - 1} \Vert \le \lambda$$, we consider the following two cases:

Case 1: $$\mu _{k} > 1$$

\begin{aligned} \beta _{k}^{A1} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ^{2} - \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \\ &\le \frac{ \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{L\lambda \bar{\gamma }}{\gamma ^{2}}. \end{aligned}

Case 2: $$\mu _{k} < 1$$

To satisfy Property* for $$\beta _{k}^{A1}$$ with $$\mu _{k} < 1$$, we need the following inequality:

$$\Vert w_{k} \Vert + \Vert v_{k} \Vert \le L \Vert w_{k} + v_{k} \Vert ,$$
(3.3)

where $$w_{k} = g_{k} - \frac{1}{L}g_{k - 1}$$, and $$v_{k} = \frac{1}{L}g_{k} - g_{k - 1}$$, which yields

\begin{aligned} \bigl\vert \beta _{k}^{A1} \bigr\vert &\le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \biggr\vert \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \frac{1}{L} \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \biggr\vert \\ & \le \frac{ \Vert g_{k} \Vert \Vert g_{k} - \frac{1}{L}g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}}. \end{aligned}

Using (3.3), we obtain

\begin{aligned}& \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} \biggr\Vert \le L \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} + \frac{1}{L}g_{k} - g_{k - 1} \biggr\Vert \le ({L} + 1) \Vert g_{k} - g_{k - 1} \Vert , \\& \bigl\vert \beta _{k}^{A1} \bigr\vert \le \frac{(L + 1) \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}}\le L\frac{(L + 1)\lambda \bar{\gamma }}{\gamma ^{2}}. \end{aligned}

Thus, in all cases

$$\bigl\vert \beta _{k}^{A1} \bigr\vert \le \frac{L\lambda \bar{\gamma }}{\gamma ^{2}} \le \frac{L(L + 1)\lambda \bar{\gamma }}{\gamma ^{2}} \le \frac{1}{2b}.$$

The proof is completed. □

Theorem 3.4

Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.1), where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then $$\lim_{k \to \infty } \Vert g_{k} \Vert = 0$$.

Proof

We will apply Theorem 3.1. Note that the following properties hold for $$\beta _{k}^{A1}$$:

i. $$\beta _{k}^{A1} > 0$$.

ii. $$\beta _{k}^{A1}$$ satisfies Property* using Theorem 3.3.

iii. $$\beta _{k}^{A1}$$ satisfies the descent property using Theorem 3.2.

iv. Assumption 1 holds.

Thus, all properties in Theorem 3.1 are satisfied, which leads to $$\lim_{k \to \infty } \Vert g_{k} \Vert = 0$$. □

Theorem 3.5

Suppose Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3) with $$m > 1$$, where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then the sufficient descent condition holds for $$\beta _{k}^{A2}$$. Indeed, since $$d_{k - 1}^{T}y_{k - 1} \ge 0$$, we obtain

\begin{aligned} g_{k}^{T}d_{k}& = - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} {g}_{k}^{T}d_{k - 1}\\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert } \bigl\vert {g}_{k}^{T}d_{k - 1} \bigr\vert \\ &= \Vert g_{k} \Vert ^{2} \biggl( - 1 + \frac{1}{m} \biggr). \end{aligned}

Theorem 3.6

Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then $$\beta _{k}^{A2}$$ satisfies Property*.

Proof

Let $$\lambda = \frac{(1 - \sigma )c\gamma ^{2}}{2L(L + 1)\bar{\gamma } b}$$ and

\begin{aligned} \beta _{k}^{A2} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \le \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}}\\ & \le \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}}\le \frac{ \Vert g_{k} \Vert ^{2} + \vert g_{k}^{T}g_{k - 1} \vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}}\\ & \le \frac{ \Vert g_{k} \Vert ( \Vert g_{k} \Vert + \Vert g_{k - 1} \Vert )}{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le \frac{2\bar{\gamma }^{2}}{(1 - \sigma )c\gamma ^{2}} = b > 1. \end{aligned}

To show that $$\vert \beta _{k}^{A2} \vert \le \frac{1}{2b}$$ whenever $$\Vert s_{k - 1} \Vert \le \lambda$$, we consider the following two cases:

Case 1: $$\mu _{k} > 1$$

\begin{aligned} \beta _{k}^{A2} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \le \frac{ \Vert g_{k} \Vert ^{2} - \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}} \\ &\le \frac{ \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le \frac{L\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}}. \end{aligned}

Case 2: $$\mu _{k} < 1$$

To satisfy Property* for $$\beta _{k}^{A2}$$ with $$\mu _{k} < 1$$, we need property (3.3), which gives

\begin{aligned} \bigl\vert \beta _{k}^{A2} \bigr\vert & \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \biggr\vert \\ & \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \frac{1}{L} \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}} \biggr\vert \\ &\le \biggl\vert \frac{ \Vert g_{k} \Vert \Vert g_{k} - \frac{1}{L}g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \biggr\vert . \end{aligned}

Using (3.3), we obtain

\begin{aligned}& \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} \biggr\Vert \le L \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} + \frac{1}{L}g_{k} - g_{k - 1} \biggr\Vert \le ({L} + 1) \Vert g_{k} - g_{k - 1} \Vert , \\& \bigl\vert \beta _{k}^{A2} \bigr\vert \le \frac{(L + 1) \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le L\frac{(L + 1)\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}}. \end{aligned}

Thus, in all cases

$$\bigl\vert \beta _{k}^{A2} \bigr\vert \le \frac{L\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}} \le L\frac{(L + 1)\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}} \le \frac{1}{2b}.$$

□

Theorem 3.7

Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), i.e., with $$\beta _{k}^{A2}$$, where $$\alpha _{k}$$ is computed by (1.5) and (1.6). Then $$\lim_{k \to \infty } \Vert g_{k} \Vert = 0$$.

Proof

We will apply Theorem 3.1. Note that the following properties hold for $$\beta _{k}^{A2}$$:

i. $$\beta _{k}^{A2} > 0$$.

ii. $$\beta _{k}^{A2}$$ satisfies Property* by using Theorem 3.6.

iii. $$\beta _{k}^{A2}$$ satisfies the descent property by using Theorem 3.5.

iv. Assumption 1 holds.

Thus all properties in Theorem 3.1 are satisfied, which leads to $$\lim_{k \to \infty } \Vert g_{k} \Vert = 0$$. □

If the condition $$\Vert g_{k} \Vert ^{2} > \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert$$ does not hold for $$\beta _{k}^{A1}$$ and $$\beta _{k}^{A2}$$, then the CG method is restarted using $$\beta _{k}^{D - H} = - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}$$.

The following two theorems show that the CG method with $$\beta _{k}^{D - H}$$ has the descent and convergence properties.

Theorem 3.8

Let the sequences $$\{ x_{k}\}$$ and $$\{ d_{k}\}$$ be generated by (1.2) and (1.3) with $$\beta _{k}^{D - H}$$, where $$\alpha _{k}$$ is computed by the SWP line search (1.5) and (1.7). Then the sufficient descent condition (1.9) holds for $$\{ d_{k}\}$$.

Proof

Multiplying Eq. (1.3) by $$g_{k}^{T}$$, substituting $$\beta _{k}^{D - H}$$, and using $$s_{k - 1} = \alpha _{k - 1}d_{k - 1}$$, we obtain

\begin{aligned} g_{k}^{T}d_{k} &= - \Vert g_{k} \Vert ^{2} - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}} g_{k}^{T}d_{k - 1} \\ &= - \Vert g_{k} \Vert ^{2} - \mu _{k}\alpha _{k - 1}\frac{ ( g_{k}^{T}d_{k - 1} )^{2}}{d_{k - 1}^{T}y_{k - 1}} \le - \Vert g_{k} \Vert ^{2}, \end{aligned}

where the inequality uses $$d_{k - 1}^{T}y_{k - 1} > 0$$, which follows from (1.7) when $$d_{k - 1}$$ is a descent direction.

Letting $$c = 1$$, we then obtain

$$g_{k}^{T}d_{k} \le - c \Vert g_{k} \Vert ^{2},$$

which completes the proof. □

Theorem 3.9

Suppose that Assumption 1 holds. Consider the conjugate gradient method (1.2) and (1.3) with $$\beta _{k}^{D - H}$$, where $$d_{k}$$ is a descent direction and $$\alpha _{k}$$ is obtained by the strong Wolfe line search. Then $$\lim \inf_{ k \to \infty } \Vert g_{k} \Vert = 0$$.

Proof

We will prove this theorem by contradiction. Suppose the conclusion of Theorem 3.9 is not true. Then, a constant $$\varepsilon > 0$$ exists such that

$$\Vert g_{k} \Vert \ge \varepsilon , \quad \forall k \ge 1.$$
(3.4)

By squaring both sides of (1.3), we obtain

\begin{aligned}& \begin{aligned} \Vert d_{k} \Vert ^{2} &= \Vert g_{k} \Vert ^{2} - 2\beta _{k}g_{k}^{T}d_{k - 1} + \beta _{k}^{2} \Vert d_{k - 1} \Vert ^{2} \\ &\le \Vert g_{k} \Vert ^{2} + 2 \vert \beta _{k} \vert \bigl\vert g_{k}^{T}d_{k - 1} \bigr\vert + \beta _{k}^{2} \Vert d_{k - 1} \Vert ^{2} \\ &\le \Vert g_{k} \Vert ^{2} + \frac{2}{L} \frac{ \Vert g_{k} \Vert \Vert s_{k} \Vert }{(1 - \sigma ) \vert g_{k - 1}^{T}d_{k - 1} \vert } ( \sigma ) \bigl\vert g_{k - 1}^{T}d_{k - 1} \bigr\vert + \frac{1}{L^{2}}\frac{ ( ( \sigma )g_{k - 1}^{T}d_{k - 1} )^{2} \Vert s_{k - 1} \Vert ^{2}}{ ( (1 - \sigma )g_{k - 1}^{T}d_{k - 1} )^{2}}\\ & \le \Vert g_{k} \Vert ^{2} + \frac{2}{L} \frac{ \Vert g_{k} \Vert \Vert s_{k} \Vert }{(1 - \sigma )}\sigma + \frac{1}{L^{2}}\frac{ ( \sigma )^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2}}, \end{aligned} \\& \begin{aligned} \frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}}& \le \frac{ \Vert g_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} + \frac{2}{L}\frac{ \Vert g_{k} \Vert \Vert s_{k} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{4}}\sigma + \frac{1}{L^{2}} \frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}} \\ &\le \frac{1}{ \Vert g_{k} \Vert ^{2}} + \frac{2}{L}\frac{ \Vert g_{k} \Vert \Vert s_{k} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{4}}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}} \\ &\le \frac{1}{ \Vert g_{k} \Vert ^{2}} + \frac{2}{L}\frac{ \Vert s_{k} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{3}}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}}. \end{aligned} \end{aligned}

Let

$$\Vert g_{k} \Vert ^{q} = \min \bigl\{ \Vert g_{k} \Vert ^{2}, \Vert g_{k} \Vert ^{3}, \Vert g_{k} \Vert ^{4} \bigr\} , \quad q \in N,$$

then

$$\frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} \le \frac{1}{ \Vert g_{k} \Vert ^{q}} \biggl( 1 + \frac{2}{L}\frac{\lambda }{(1 - \sigma )}\sigma + \frac{1}{L^{2}} \frac{\sigma ^{2}\lambda ^{2}}{(1 - \sigma )^{2}} \biggr).$$

Also, let

$$R = \biggl( 1 + \frac{2}{L}\frac{\lambda }{(1 - \sigma )}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2}\lambda ^{2}}{(1 - \sigma )^{2}} \biggr),$$

then

\begin{aligned}& \frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} \le \frac{R}{ \Vert g_{k} \Vert ^{q}} \le R\sum _{i = 1}^{k} \frac{1}{ \Vert g_{i} \Vert ^{q}}, \\& \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} \ge \frac{\varepsilon ^{q}}{kR}. \end{aligned}

Therefore,

$$\sum_{k = 0}^{\infty } \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} = \infty .$$

This contradicts (3.2), and hence $$\liminf_{k \to \infty } \Vert g_{k} \Vert = 0$$. □

4 Numerical results and discussions

To analyze the efficiency of the new CG method, several test functions are selected from CUTE [20], as shown in the Appendix. These functions can be obtained from the following website:

The Appendix uses the following notation:

• No. iter means the number of iterations.

• No. fun. Eva means the number of function evaluations.

The comparison was made with respect to CPU time, the number of function evaluations, the number of iterations, and the number of gradient evaluations. The SWP line search is employed with the parameters $$\delta = 0.01$$ and $$\sigma = 0.1$$. The modified CG-Descent 6.8 with zero memory is employed to obtain the results for $$\beta _{k}^{A1}$$ and $$\beta _{k}^{A2}$$. The code can be downloaded from the Hager webpage:

A minimum time of 0.02 seconds is used for all algorithms. The host computer is an Intel® Dual-Core CPU with 2 GB of DDR2 RAM. The results are shown in Figs. 4, 5, 6, and 7, in which a performance measure introduced by Dolan and Moré [21] was employed.

Based on the left-hand side of Figs. 4, 5, 6, and 7, the curve of CG method A1 lies above the other curves. Therefore, it is the most efficient among the AZPRP-related methods compared. CG method A2 is not as efficient as A1, but it is still more efficient than AZPRP with respect to CPU time, the number of function evaluations, the number of gradient evaluations, and the number of iterations. In addition, for applications of the CG method in image restoration, the reader can refer to [22–24].
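For readers unfamiliar with the performance profile of Dolan and Moré [21], the following sketch computes it from a table of costs. It assumes every solver succeeds on every problem and uses only the iteration counts of the BDQRTIC, BOX, and BROWNAL rows of the Appendix as sample data, so the output is an illustration and not a reproduction of the figures. The function name `performance_profile` is hypothetical.

```python
def performance_profile(costs, taus):
    """Fraction of problems each solver solves within a factor tau of the best.

    costs maps a solver name to a list of positive per-problem costs (same
    problem order for every solver); taus is a list of ratios >= 1.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any solver.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= tau for r in ratios) / n_problems for tau in taus]
    return profile


# Iteration counts for BDQRTIC, BOX, and BROWNAL, taken from the Appendix.
SAMPLE_ITERATIONS = {
    "A1": [161, 7, 5],
    "A2": [25, 8, 11],
    "AZPRP": [157, 7, 10],
}

SAMPLE_PROFILE = performance_profile(SAMPLE_ITERATIONS, [1, 2, 4, 8])
```

The value of a profile at τ = 1 is the fraction of problems on which a solver is the best, or tied for the best, which is why the left-hand side of such a plot is used to rank the methods.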

5 Conclusion

In this paper, we proposed two efficient conjugate gradient methods related to the AZPRP method. The two methods satisfied global convergence properties and the descent property when SWP line searches were employed. Furthermore, our numerical results showed that the new methods are more efficient than the AZPRP method with respect to the number of iterations, gradient evaluations, function evaluations, and CPU time.

Availability of data and materials

The data is available inside the paper.

References

1. Wolfe, P.: Convergence conditions for ascent methods. SIAM Rev. 11(2), 226–235 (1969)

2. Wolfe, P.: Convergence conditions for ascent methods. II: some corrections. SIAM Rev. 13(2), 185–188 (1971)

3. Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49(6), 409–436 (1952)

4. Polak, E., Ribiere, G.: Note sur la convergence de méthodes de directions conjuguées. ESAIM: Math. Model. Numer. Anal. 3(R1), 35–43 (1969)

5. Liu, Y., Storey, C.: Efficient generalised conjugate gradient algorithms, part 1: theory. J. Optim. Theory Appl. 69(1), 129–137 (1991)

6. Fletcher, R., Reeves, C.M.: Function minimisation by conjugate gradients. Comput. J. 7(2), 149–154 (1964)

7. Fletcher, R.: Practical Method of Optimisation. Unconstrained Optimisation, edn. (1997)

8. Dai, Y.H., Yuan, Y.: A non-linear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10(1), 177–182 (1999)

9. Zoutendijk, G.: Non-linear programming, computational methods. In: Integer and Non-linear Programming, pp. 37–86 (1970)

10. Al-Baali, M.: Descent property and global convergence of the Fletcher–Reeves method with inexact line search. IMA J. Numer. Anal. 5(1), 121–124 (1985)

11. Powell, M.J.: Non-convex minimisation calculations and the conjugate gradient method. In: Numerical Analysis, pp. 122–141. Springer, Berlin (1984)

12. Gilbert, J.C., Nocedal, J.: Global convergence properties of conjugate gradient methods for optimisation. SIAM J. Optim. 2(1), 21–42 (1992)

13. Dai, Y.H., Liao, L.Z.: New conjugacy conditions and related non-linear conjugate gradient methods. Appl. Math. Optim. 43(1), 87–101 (2001)

14. Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16(1), 170–192 (2005)

15. Hager, W.W., Zhang, H.: The limited memory conjugate gradient method. SIAM J. Optim. 23(4), 2150–2168 (2013)

16. Wei, Z., Yao, S., Liu, L.: The convergence properties of some new conjugate gradient methods. Appl. Math. Comput. 183(2), 1341–1350 (2006)

17. Dai, Z., Wen, F.: Another improved Wei–Yao–Liu non-linear conjugate gradient method with sufficient descent property. Appl. Math. Comput. 218(14), 7421–7430 (2012)

18. Alhawarat, A., Salleh, Z., Mamat, M., Rivaie, M.: An efficient modified Polak–Ribière–Polyak conjugate gradient method with global convergence properties. Optim. Methods Softw. 32(6), 1299–1312 (2017)

19. Kaelo, P., Mtagulwa, P., Thuto, M.V.: A globally convergent hybrid conjugate gradient method with strong Wolfe conditions for unconstrained optimisation. Math. Sci. 14(1), 1–9 (2020)

20. Bongartz, I., Conn, A.R., Gould, N., Toint, P.L.: CUTE: constrained and unconstrained testing environment. ACM Trans. Math. Softw. 21(1), 123–160 (1995)

21. Dolan, E.D., Moré, J.J.: Benchmarking optimisation software with performance profiles. Math. Program. 91(2), 201–213 (2002)

22. Alhawarat, A., Salleh, Z., Masmali, I.A.: A convex combination between two different search directions of conjugate gradient method and application in image restoration. Math. Probl. Eng. 2021, Article ID 9941757 (2021). https://doi.org/10.1155/2021/9941757

23. Guessab, A., Driouch, A.: A globally convergent modified multivariate version of the method of moving asymptotes. Appl. Anal. Discrete Math. 15(2), 519–535 (2021)

24. Guessab, A., Driouch, A., Nouisser, O.: A globally convergent modified version of the method of moving asymptotes. Appl. Anal. Discrete Math. 13(3), 905–917 (2019)

Acknowledgements

The authors are grateful for all the support that helped improve this paper, and thank Universiti Malaysia Terengganu (UMT) for funding it.

Funding

This study was partially supported by the Universiti Malaysia Terengganu, Centre of Research and Innovation Management.

Author information

Contributions

The authors contributed equally and significantly in writing this paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zabidin Salleh.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

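| Function | Dim | A1 No. Iter | A1 No. fun Eva. | A1 No. grad Eva. | A1 CPU Time | A2 No. Iter | A2 No. fun Eva. | A2 No. grad Eva. | A2 CPU Time | AZPRP No. Iter | AZPRP No. fun Eva. | AZPRP No. grad Eva. | AZPRP CPU Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AKIVA | 2 | 8 | 20 | 15 | 0.02 | 8 | 20 | 15 | 0.02 | 8 | 20 | 15 | 0.02 |
| ALLINITU | 4 | 9 | 25 | 18 | 0.02 | 9 | 25 | 18 | 0.02 | 9 | 25 | 18 | 0.02 |
| ARGLINA | 200 | 1 | 3 | 2 | 0.02 | 1 | 3 | 2 | 0.02 | 1 | 3 | 2 | 0.02 |
|  | 200 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 |
| BARD | 3 | 12 | 32 | 22 | 0.02 | 12 | 32 | 22 | 0.02 | 12 | 32 | 22 | 0.02 |
| BDQRTIC | 5000 | 161 | 352 | 334 | 0.44 | 25 | 76 | 65 | 0.13 | 157 | 334 | 315 | 0.66 |
| BEALE | 2 | 11 | 33 | 26 | 0.02 | 11 | 33 | 26 | 0.02 | 11 | 33 | 26 | 0.02 |
| BIGGS6 | 6 | 24 | 64 | 44 | 0.02 | 24 | 64 | 44 | 0.02 | 24 | 64 | 44 | 0.02 |
| BOX3 | 3 | 10 | 23 | 14 | 0.02 | 10 | 23 | 14 | 0.02 | 10 | 23 | 14 | 0.02 |
| BOX | 10,000 | 7 | 25 | 21 | 0.14 | 8 | 27 | 23 | 0.08 | 7 | 24 | 20 | 0.08 |
| BRKMCC | 2 | 5 | 11 | 6 | 0.02 | 5 | 11 | 6 | 0.02 | 5 | 11 | 6 | 0.02 |
| BROWNAL | 200 | 5 | 15 | 11 | 0.02 | 11 | 53 | 46 | 0.02 | 10 | 26 | 18 | 0.02 |
| BROWNBS | 2 | 10 | 24 | 18 | 0.02 | 10 | 24 | 18 | 0.02 | 10 | 24 | 18 | 0.02 |
| BROWNDEN | 4 | 16 | 38 | 31 | 0.02 | 16 | 38 | 31 | 0.02 | 16 | 38 | 31 | 0.02 |
| BROYDN7D | 5000 | 84 | 157 | 115 | 0.44 | 103 | 180 | 143 | 0.47 | 11 | 192 | 153 | 0.47 |
| BRYBND | 5000 | 85 | 193 | 117 | 0.27 | 55 | 127 | 82 | 0.2 | 85 | 198 | 124 | 0.28 |
| CHAINWOO | 4000 | 318 | 635 | 393 | 0.7 | 446 | 934 | 589 | 0.91 | 346 | 691 | 418 | 0.67 |
| CHNROSNB | 50 | 372 | 779 | 420 | 0.02 | 47 | 108 | 69 | 0.02 | 358 | 747 | 400 | 0.02 |
| CLIFF | 2 | 10 | 46 | 39 | 0.02 | 10 | 46 | 39 | 0.02 | 10 | 46 | 39 | 0.02 |
| COSINE | 10,000 | 13 | 57 | 47 | 0.17 | 12 | 54 | 48 | 0.19 | 14 | 56 | 49 | 0.22 |
| CRAGGLVY | 5000 | 177 | 363 | 309 | 0.91 | 88 | 179 | 140 | 0.39 | 104 | 221 | 176 | 0.41 |
| CUBE | 2 | 17 | 48 | 34 | 0.02 | 17 | 48 | 34 | 0.02 | 17 | 48 | 34 | 0.02 |
| CURLY10 | 10,000 | 50,576 | 70,093 | 81,672 | 197 | 48,772 | 70,747 | 75,603 | 165.28 | 42,321 | 61,798 | 65,202 | 134 |
| CURLY20 | 10,000 | 74,906 | 97,403 | 1E+05 | 446.39 | 78,246 | 104,075 | 130,745 | 374 | 67,898 | 90,440 | 1E+05 | 390 |
| CURLY30 | 10,000 | 76,869 | 100,202 | 1E+05 | 648 | 73,218 | 96,533 | 123,259 | 639.63 | 73,218 | 96,533 | 1E+05 | 582 |
| DECONVU | 63 | 313 | 637 | 327 | 0.02 | 164 | 392 | 245 | 0.02 | 223 | 453 | 232 | 0.02 |
| DENSCHNA | 2 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 |
| DENSCHNB | 2 | 6 | 18 | 15 | 0.02 | 6 | 18 | 15 | 0.02 | 6 | 18 | 15 | 0.02 |
| DENSCHNC | 2 | 11 | 36 | 31 | 0.02 | 11 | 36 | 31 | 0.02 | 11 | 36 | 31 | 0.02 |
| DENSCHND | 3 | 14 | 46 | 40 | 0.02 | 14 | 46 | 40 | 0.02 | 14 | 46 | 40 | 0.02 |
| DENSCHNE | 3 | 12 | 43 | 38 | 0.02 | 12 | 43 | 38 | 0.02 | 12 | 43 | 38 | 0.02 |
| DENSCHNF | 2 | 9 | 31 | 26 | 0.02 | 9 | 31 | 26 | 0.02 | 9 | 31 | 26 | 0.02 |
| DIXMAANA | 3000 | 6 | 15 | 11 | 0.02 | 6 | 15 | 11 | 0.02 | 5 | 13 | 10 | 0.02 |
| DIXMAANB | 3000 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 | 6 | 16 | 12 | 0.02 |
| DIXMAANC | 3000 | 6 | 14 | 9 | 0.02 | 6 | 14 | 9 | 0.02 | 6 | 14 | 9 | 0.02 |
| DIXMAAND | 3000 | 7 | 17 | 12 | 0.02 | 7 | 17 | 12 | 0.02 | 6 | 15 | 11 | 0.02 |
| DIXMAANE | 3000 | 218 | 245 | 419 | 0.23 | 256 | 283 | 495 | 0.3 | 218 | 242 | 422 | 0.3 |
| DIXMAANF | 3000 | 125 | 255 | 133 | 0.11 | 129 | 263 | 137 | 0.12 | 140 | 285 | 148 | 0.19 |
| DIXMAANG | 3000 | 170 | 345 | 178 | 0.16 | 169 | 343 | 177 | 0.13 | 174 | 353 | 182 | 0.14 |
| DIXMAANH | 3000 | 176 | 358 | 185 | 0.16 | 186 | 377 | 194 | 0.14 | 173 | 353 | 184 | 0.14 |
| DIXMAANI | 3000 | 2994 | 3083 | 5909 | 3.28 | 3174 | 3248 | 6284 | 3.7 | 3264 | 3359 | 6443 | 3.3 |
| DIXMAANJ | 3000 | 363 | 731 | 371 | 0.28 | 345 | 695 | 353 | 0.31 | 384 | 773 | 392 | 0.31 |
| DIXMAANK | 3000 | 304 | 613 | 312 | 0.25 | 398 | 801 | 406 | 0.31 | 401 | 806 | 408 | 0.34 |
| DIXMAANL | 3000 | 342 | 691 | 353 | 0.27 | 379 | 765 | 390 | 0.31 | 430 | 867 | 441 | 0.49 |
| DIXON3DQ | 10,000 | 10,000 | 10,007 | 19 | 0.78 | 10,000 | 10,007 | 19,995 | 19.12 | 10,000 | 10,007 | 19,995 | 19.12 |
| DJTL | 2 | 75 | 1163 | 1148 | 0.02 | 75 | 1163 | 1148 | 0.02 | 75 | 1163 | 1148 | 0.02 |
| DQDRTIC | 5000 | 5 | 11 | 6 | 0.02 | 5 | 11 | 6 | 0.02 | 5 | 11 | 6 | 0.02 |
| DQRTIC | 5000 | 15 | 32 | 18 | 0.03 | 15 | 32 | 18 | 0.02 | 15 | 32 | 18 | 0.02 |
| EDENSCH | 2000 | 31 | 70 | 54 | 0.05 | 34 | 74 | 67 | 0.06 | 30 | 67 | 57 | 0.08 |
| EG2 | 1000 | 3 | 8 | 5 | 0.02 | 3 | 8 | 5 | 0.02 | 3 | 8 | 5 | 0.02 |
| EIGENALS | 2550 | 8785 | 16,438 | 9953 | 141 | 17,061 | 32,415 | 18,795 | 280 | 11,275 | 20,477 | 13,384 | 194.39 |
| EIGENBLS | 2550 | 19,480 | 38,968 | 19,489 | 284 | 234 | 548 | 335 | 5.14 | 34,599 | 69,207 | 34,609 | 589 |
| EIGENCLS | 2652 | 11,704 | 23,434 | 11,740 | 180.05 | 9800 | 19,062 | 10,385 | 162.48 | 9838 | 18,888 | 10,710 | 185 |
| ENGVAL1 | 5000 | 23 | 45 | 40 | 0.05 | 23 | 47 | 41 | 0.06 | 23 | 45 | 40 | 0.06 |
| ENGVAL2 | 3 | 26 | 73 | 55 | 0.02 | 26 | 73 | 55 | 0.02 | 26 | 73 | 55 | 0.02 |
| ERRINROS | 50 | 102,326 | 201,278 | 1E+05 | 2.78 | 104 | 260 | 178 | 0.02 | 82,469 | 2E+05 | 86,569 | 2.08 |
| EXPFIT | 2 | 9 | 29 | 22 | 0.02 | 9 | 29 | 22 | 0.02 | 9 | 29 | 22 | 0.02 |
| EXTROSNB | 1000 | 2359 | 5279 | 3112 | 0.8 | 69 | 193 | 145 | 0.03 | 2205 | 4964 | 2945 | 0.86 |
| FLETCBV2 | 5000 | 1 | 1 | 1 | 0.02 | 1 | 1 | 1 | 0.02 | 1 | 1 | 1 | 0.02 |
| FLETCHCR | 1000 | 71 | 153 | 85 | 0.03 | 29 | 68 | 43 | 0.05 | 88 | 178 | 114 | 0.03 |
| FMINSRF2 | 5625 | 426 | 875 | 453 | 1.31 | 1803 | 3546 | 1849 | 4.59 | 459 | 940 | 486 | 1.58 |
| FMINSURF | 5625 | 562 | 1140 | 584 | 1.67 | 1327 | 2705 | 1384 | 4.08 | 548 | 1118 | 575 | 2.06 |
| FREUROTH | 5000 | 21 | 54 | 47 | 0.09 | 34 | 86 | 85 | 0.17 | 27 | 63 | 58 | 0.19 |
| GENHUMPS | 5000 | 13,718 | 30,411 | 16,957 | 46 | 467 | 1482 | 1090 | 3.33 | 11,853 | 35,531 | 25,448 | 68.94 |
| GENROSE | 500 | 1831 | 3888 | 2134 | 0.55 | 61 | 199 | 156 | 0.03 | 2057 | 4360 | 2378 | 0.47 |
| GROWTHLS | 3 | 109 | 431 | 369 | 0.02 | 109 | 431 | 369 | 0.03 | 109 | 431 | 369 | 0.03 |
| GULF | 3 | 33 | 95 | 72 | 0.02 | 33 | 95 | 72 | 0.02 | 33 | 95 | 72 | 0.02 |
| HAIRY | 2 | 17 | 82 | 68 | 0.02 | 17 | 82 | 68 | 0.02 | 17 | 82 | 68 | 0.02 |
| HATFLDD | 3 | 17 | 49 | 37 | 0.02 | 17 | 49 | 37 | 0.02 | 17 | 49 | 37 | 0.02 |
| HATFLDE | 3 | 13 | 37 | 30 | 0.02 | 13 | 37 | 30 | 0.02 | 13 | 37 | 30 | 0.02 |
| HATFLDFL | 3 | 21 | 68 | 54 | 0.02 | 21 | 68 | 54 | 0.02 | 21 | 68 | 54 | 0.02 |
| HEART6LS | 6 | 375 | 1137 | 876 | 0.02 | 375 | 1137 | 876 | 0.02 | 375 | 1137 | 876 | 0.02 |
| HEART8LS | 8 | 253 | 657 | 440 | 0.02 | 253 | 657 | 440 | 0.02 | 253 | 657 | 440 | 0.02 |
| HELIX | 3 | 23 | 60 | 42 | 0.02 | 23 | 60 | 42 | 0.02 | 23 | 60 | 42 | 0.02 |
| HIELOW | 3 | 13 | 30 | 21 | 0.03 | 13 | 30 | 21 | 0.02 | 13 | 30 | 21 | 0.03 |
| HILBERTA | 2 | 2 | 5 | 3 | 0.02 | 2 | 5 | 3 | 0.02 | 2 | 5 | 3 | 0.02 |
| HILBERTB | 10 | 4 | 9 | 5 | 0.02 | 4 | 9 | 5 | 0.02 | 4 | 9 | 5 | 0.02 |
| HIMMELBB | 2 | 4 | 18 | 18 | 0.02 | 4 | 18 | 18 | 0.02 | 4 | 18 | 18 | 0.02 |
| HIMMELBF | 4 | 23 | 59 | 46 | 0.02 | 23 | 59 | 46 | 0.02 | 23 | 59 | 46 | 0.02 |
| HIMMELBG | 2 | 7 | 22 | 17 | 0.02 | 7 | 22 | 17 | 0.02 | 7 | 22 | 17 | 0.02 |
| HIMMELBH | 2 | 5 | 13 | 9 | 0.02 | 5 | 13 | 9 | 0.02 | 5 | 13 | 9 | 0.02 |
| HUMPS | 2 | 45 | 223 | 202 | 0.02 | 45 | 223 | 202 | 0.02 | 45 | 223 | 202 | 0.02 |
| JENSMP | 2 | 12 | 47 | 41 | 0.02 | 12 | 47 | 41 | 0.02 | 12 | 47 | 41 | 0.02 |
| JIMACK | 35,449 | 8316 | 11,134 | 8318 | 1165 | 8719 | 17,440 | 8721 | 1224 | 7297 | 14,596 | 7299 | 1027 |
| KOWOSB | 4 | 16 | 46 | 32 | 0.02 | 16 | 46 | 32 | 0.02 | 16 | 46 | 32 | 0.02 |
| LIARWHD | 5000 | 16 | 43 | 31 | 0.03 | 16 | 45 | 31 | 0.03 | 15 | 40 | 28 | 0.03 |
| LOGHAIRY | 2 | 26 | 196 | 179 | 0.02 | 26 | 196 | 179 | 0.02 | 26 | 196 | 179 | 0.02 |
| MANCINO | 100 | 11 | 23 | 12 | 0.08 | 11 | 23 | 12 | 0.06 | 11 | 23 | 12 | 0.06 |
| MARATOSB | 2 | 589 | 2885 | 2585 | 0.02 | 589 | 2885 | 2585 | 0.02 | 589 | 2885 | 2585 | 0.02 |
| MEXHAT | 2 | 14 | 59 | 55 | 0.02 | 14 | 59 | 55 | 0.02 | 14 | 59 | 55 | 0.02 |
| MOREBV | 5000 | 161 | 168 | 317 | 0.36 | 161 | 168 | 317 | 0.39 | 161 | 168 | 317 | 0.39 |
| MSQRTALS | 1024 | 2760 | 5529 | 2771 | 8.02 | 2776 | 5562 | 2789 | 8.8 | 2776 | 5562 | 2789 | 8.3 |
| MSQRTBLS | 1024 | 2252 | 4512 | 2262 | 7.53 | 2119 | 4198 | 2182 | 6.6 | 2179 | 4366 | 2189 | 6.77 |
| NCB20B | 500 | 4994 | 7577 | 10,892 | 80.22 | 99 | 216 | 202 | 1.59 | 4912 | 7503 | 10,461 | 79.45 |
| NCB20 | 5010 | 908 | 1965 | 1477 | 11.94 | 216 | 470 | 371 | 3.05 | 1074 | 2459 | 1545 | 13.16 |
| NONCVXU2 | 5000 | 6864 | 13,196 | 7400 | 16.33 | 6580 | 12,710 | 7032 | 15.19 | 7477 | 14,009 | 8428 | 17.9 |
| NONDIA | 5000 | 7 | 25 | 19 | 0.02 | 7 | 25 | 19 | 0.03 | 7 | 25 | 19 | 0.03 |
| NONDQUAR | 5000 | 616 | 1372 | 868 | 0.78 | 1423 | 3001 | 1663 | 1.75 | 2562 | 5235 | 2743 | 2.95 |
| OSBORNEA | 5 | 82 | 230 | 174 | 0.02 | 82 | 230 | 174 | 0.02 | 82 | 230 | 174 | 0.02 |
| OSBORNEB | 11 | 57 | 134 | 84 | 0.02 | 57 | 134 | 84 | 0.02 | 57 | 134 | 84 | 0.02 |
| OSCIPATH | 10 | 295,029 | 781,729 | 5E+05 | 2.16 | 3E+05 | 781,729 | 534,425 | 2.23 | 295,029 | 8E+05 | 5E+05 | 2.23 |
| PALMER1C | 8 | 12 | 27 | 28 | 0.02 | 12 | 27 | 28 | 0.02 | 12 | 27 | 28 | 0.02 |
| PALMER1D | 7 | 10 | 24 | 23 | 0.02 | 10 | 24 | 23 | 0.02 | 10 | 24 | 23 | 0.02 |
| PALMER2C | 8 | 11 | 21 | 22 | 0.02 | 11 | 21 | 22 | 0.02 | 11 | 21 | 22 | 0.02 |
| PALMER3C | 8 | 11 | 21 | 21 | 0.02 | 11 | 21 | 21 | 0.02 | 11 | 21 | 21 | 0.02 |
| PALMER4C | 8 | 11 | 21 | 21 | 0.02 | 11 | 21 | 21 | 0.02 | 11 | 21 | 21 | 0.02 |
| PALMER5C | 6 | 6 | 13 | 7 | 0.02 | 6 | 13 | 7 | 0.02 | 6 | 13 | 7 | 0.02 |
| PALMER6C | 8 | 11 | 24 | 24 | 0.02 | 11 | 24 | 24 | 0.02 | 11 | 24 | 24 | 0.02 |
| PALMER7C | 8 | 11 | 20 | 20 | 0.02 | 11 | 20 | 20 | 0.02 | 11 | 20 | 20 | 0.02 |
| PALMER8C | 8 | 11 | 19 | 19 | 0.02 | 11 | 19 | 19 | 0.02 | 11 | 19 | 19 | 0.02 |
| PARKCH | 15 | 740 | 1513 | 1404 | 35.83 | 15 | 59 | 134 | 2.08 | 726 | 1548 | 1348 | 35 |
| PENALTY1 | 1000 | 15 | 61 | 56 | 0.02 | 41 | 164 | 144 | 0.02 | 43 | 168 | 146 | 0.02 |
| PENALTY2 | 200 | 215 | 263 | 421 | 0.03 | 212 | 247 | 404 | 0.05 | 200 | 243 | 386 | 0.03 |
| PENALTY3 | 200 | 99 | 330 | 275 | 2.06 | 32 | 105 | 86 | 0.64 | 83 | 278 | 236 | 1.86 |
| POWELLSG | 5000 | 20 | 49 | 34 | 0.01 | 34 | 84 | 58 | 0.02 | 28 | 72 | 49 | 0.03 |
| POWER | 10,000 | 456 | 933 | 488 | 0.81 | 325 | 890 | 637 | 0.89 | 544 | 1119 | 592 | 1 |
| QUARTC | 5000 | 15 | 32 | 18 | 0.02 | 15 | 32 | 18 | 0.03 | 15 | 32 | 18 | 0.02 |
| ROSENBR | 2 | 28 | 84 | 65 | 0.02 | 28 | 84 | 65 | 0.02 | 28 | 84 | 65 | 0.02 |
| S308 | 2 | 7 | 21 | 17 | 0.02 | 7 | 21 | 17 | 0.02 | 7 | 21 | 17 | 0.02 |
| SCHMVETT | 5000 | 41 | 71 | 58 | 0.25 | 41 | 71 | 58 | 0.22 | 37 | 67 | 50 | 0.17 |
| SENSORS | 100 | 46 | 116 | 75 | 0.61 | 35 | 97 | 69 | 0.55 | 51 | 131 | 88 | 0.78 |
| SINEVAL | 2 | 46 | 181 | 153 | 0.02 | 46 | 181 | 153 | 0.02 | 46 | 181 | 153 | 0.02 |
|  | 5000 | 13 | 44 | 38 | 0.08 | 14 | 45 | 39 | 0.08 | 14 | 51 | 44 | 0.11 |
| SISSER | 2 | 5 | 19 | 19 | 0.02 | 5 | 19 | 19 | 0.02 | 5 | 19 | 19 | 0.02 |
| SNAIL | 2 | 61 | 251 | 211 | 0.02 | 61 | 251 | 211 | 0.02 | 61 | 251 | 211 | 0.02 |
| SPARSINE | 5000 | 22,466 | 22,744 | 44,664 | 83.92 | 21,468 | 21,760 | 42,654 | 83 | 21,700 | 22,006 | 43,104 | 84.5 |
| SPARSQUR | 10,000 | 37 | 158 | 148 | 0.91 | 52 | 205 | 188 | 0.84 | 34 | 143 | 136 | 0.84 |
| SPMSRTLS | 4999 | 216 | 439 | 229 | 0.47 | 252 | 501 | 275 | 0.61 | 213 | 435 | 224 | 0.47 |
| SROSENBR | 5000 | 9 | 23 | 15 | 0.02 | 9 | 23 | 15 | 0.02 | 9 | 23 | 15 | 0.02 |
| STRATEC | 10 | 170 | 419 | 283 | 6.11 | 170 | 419 | 283 | 6.2 | 170 | 419 | 283 | 6.17 |
|  | 5000 | 1543 | 1550 | 3081 | 1.25 | 1515 | 3025 | 1.25 | 1.52 | 1573 | 1580 | 3141 | 1.34 |
| TOINTGOR | 50 | 118 | 214 | 152 | 0.02 | 123 | 220 | 163 | 0.02 | 120 | 215 | 155 | 0.02 |
| TOINTGSS | 5000 | 4 | 9 | 5 | 0.02 | 4 | 9 | 5 | 0.02 | 4 | 9 | 5 | 0.02 |
| TOINTPSP | 50 | 143 | 336 | 254 | 0.02 | 26 | 101 | 90 | 0.02 | 140 | 326 | 245 | 0.02 |
| TOINTQOR | 50 | 29 | 36 | 53 | 0.02 | 29 | 36 | 53 | 0.02 | 29 | 36 | 53 | 0.02 |
| TQUARTIC | 5000 | 12 | 44 | 36 | 0.03 | 11 | 37 | 29 | 0.03 | 11 | 38 | 30 | 0.03 |
| TRIDIA | 5000 | 783 | 790 | 1561 | 0.89 | 782 | 789 | 1559 | 0.91 | 783 | 790 | 1561 | 0.89 |
| VARDIM |  | 9 | 23 | 18 | 0.02 | 10 | 24 | 17 | 0.02 | 10 | 23 | 18 | 0.02 |
| VAREIGVL | 50 | 24 | 51 | 29 | 0.02 | 24 | 51 | 29 | 0.02 | 23 | 49 | 28 | 0.02 |
| VIBRBEAM | 8 | 98 | 255 | 174 | 0.02 | 98 | 255 | 174 | 0.02 | 98 | 255 | 174 | 0.02 |
| WATSON | 12 | 53 | 124 | 78 | 0.02 | 49 | 127 | 88 | 0.02 | 57 | 130 | 78 | 0.02 |
| WOODS | 4000 | 24 | 60 | 40 | 0.05 | 23 | 59 | 40 | 0.03 | 22 | 57 | 41 | 0.03 |
| YFITU | 3 | 68 | 208 | 167 | 0.02 | 68 | 208 | 167 | 0.02 | 68 | 208 | 167 | 0.02 |
| ZANGWIL2 | 2 | 1 | 3 | 2 | 0.02 | 1 | 3 | 2 | 0.02 | 1 | 3 | 2 | 0.02 |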

Salleh, Z., Almarashi, A. & Alhawarat, A. Two efficient modifications of AZPRP conjugate gradient method with sufficient descent property. J Inequal Appl 2022, 14 (2022). https://doi.org/10.1186/s13660-021-02746-0