Assumption 1
I. \(f(x)\) is bounded from below on the level set \(\Omega = \{ x \in R^{n}:f(x) \le f(x_{1})\}\), where \(x_{1}\) is the starting point.
II. In some neighborhood \(N\) of \(\Omega \), \(f\) is continuously differentiable and its gradient is Lipschitz continuous. That is, there exists a constant \(L > 0\) such that, for all \(x,y \in N\),
$$ \bigl\Vert g(x) - g(y) \bigr\Vert \le L \Vert x - y \Vert . $$
The following lemma is one of the main tools for proving global convergence of CG methods. For more details, the reader can refer to [9].
Lemma 3.1
Suppose Assumption 1 holds. Consider the CG method of the form (1.3), where the search direction satisfies the sufficient descent condition and \(\alpha _{k}\) is obtained by the standard WWP line search. Then
$$ \sum_{k = 0}^{\infty } \frac{(g_{k}^{T}d_{k})^{2}}{ \Vert d_{k} \Vert ^{2}} < \infty , $$
(3.1)
where (3.1) is known as the Zoutendijk condition. Inequality (3.1) also holds for the exact line search, the Armijo-Goldstein line search, and the SWP line search.
Substituting (1.9) into (3.1) yields
$$ \sum_{k = 0}^{\infty } \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} < \infty . $$
(3.2)
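The Zoutendijk condition can be observed numerically on a simple problem. The following is a minimal sketch, not the method analyzed in this paper: it assumes steepest descent (\(d_{k} = - g_{k}\)) with exact line search on a diagonal convex quadratic, for which Lemma 3.1 also applies. The function names, the test matrix, and the starting point are illustrative choices. For this setting the partial sums of (3.1) stay below \(2L(f(x_{0}) - \min f)\), because each exact step decreases \(f\) by at least \((g_{k}^{T}d_{k})^{2}/(2L \Vert d_{k} \Vert ^{2})\).

```python
def zoutendijk_partial_sums(diag, x0, iters):
    """Steepest descent with exact line search on f(x) = 0.5 * sum(a_i * x_i^2).

    Returns the partial sums of (g_k^T d_k)^2 / ||d_k||^2 appearing in (3.1).
    """
    x = list(x0)
    total, sums = 0.0, []
    for _ in range(iters):
        g = [a * xi for a, xi in zip(diag, x)]          # gradient of the quadratic
        d = [-gi for gi in g]                           # steepest descent direction
        gd = sum(gi * di for gi, di in zip(g, d))       # g_k^T d_k
        dd = sum(di * di for di in d)                   # ||d_k||^2
        if dd == 0.0:
            break                                       # stationary point reached
        dAd = sum(a * di * di for a, di in zip(diag, d))
        alpha = -gd / dAd                               # exact minimizer along d
        total += gd * gd / dd
        sums.append(total)
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return sums


def f(diag, x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(diag, x))


diag = [1.0, 4.0, 10.0]                  # Hessian eigenvalues, so L = 10 (assumed test problem)
x0 = [1.0, 1.0, 1.0]
sums = zoutendijk_partial_sums(diag, x0, 200)
bound = 2.0 * max(diag) * f(diag, x0)    # 2 L (f(x_0) - min f), with min f = 0
print(sums[-1] <= bound)
```

Since \(d_{k} = - g_{k}\) here, the sums in (3.1) and (3.2) coincide and both reduce to \(\sum \Vert g_{k} \Vert ^{2}\).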
Gilbert and Nocedal [11] presented an important theorem for establishing the global convergence of the nonnegative PRP method and related nonnegative methods, summarized in Theorem 3.1. Furthermore, they introduced a useful property, called Property*, as follows:
Property*
Consider a method of the form (1.1) and (1.2), and suppose \(0 < \gamma \le \Vert g_{k} \Vert \le \bar{\gamma } \). We say that the method possesses Property* if there exist constants \(b > 1\) and \(\lambda > 0\) such that, for all \(k \ge 1\), \(\vert \beta _{k} \vert \le b\) and, whenever \(\Vert x_{k} - x_{k - 1} \Vert \le \lambda \),
$$ \vert \beta _{k} \vert \le \frac{1}{2b}. $$
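The two inequalities in Property* can be checked on a recorded run. The sketch below is only an illustration: the function name `has_property_star` and its arguments are hypothetical, and a finite record can support but never prove a property stated for all \(k\).

```python
def has_property_star(betas, step_norms, b, lam):
    """Check the two Property* inequalities on recorded iterates.

    betas[i] is beta_k and step_norms[i] is ||x_k - x_{k-1}|| for the same k.
    """
    if not (b > 1 and lam > 0):
        raise ValueError("Property* requires b > 1 and lambda > 0")
    for beta, step in zip(betas, step_norms):
        if abs(beta) > b:
            return False                      # violates |beta_k| <= b
        if step <= lam and abs(beta) > 1.0 / (2.0 * b):
            return False                      # short step but beta_k not small
    return True


# Illustrative data: with b = 2 the small-step bound is 1 / (2b) = 0.25.
ok = has_property_star([1.5, 0.2, 0.1], [1.0, 0.05, 0.01], b=2.0, lam=0.1)
bad = has_property_star([1.5, 0.3, 0.1], [1.0, 0.05, 0.01], b=2.0, lam=0.1)
print(ok, bad)
```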
The following theorem, given in [11], plays a crucial role in the convergence analysis of CG methods.
Theorem 3.1
Consider any CG method of the form (1.2) and (1.3), and suppose the following conditions hold:
I. \(\beta _{k} > 0\).
II. The sufficient descent condition is satisfied.
III. The Zoutendijk condition holds.
IV. Property* holds.
V. Assumption 1 is satisfied.
Then, the iterates are globally convergent, i.e., \(\lim_{k \to \infty } \Vert g_{k} \Vert = 0\).
The global convergence properties of \(\beta _{k}^{A1}\)
Theorem 3.2
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.1), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then the sufficient descent condition holds.
Proof
Multiplying (1.2) by \(g_{k}^{T}\) yields
$$\begin{aligned} g_{k}^{T}d_{k} &= - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} g_{k}^{T}d_{k - 1} \\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \bigl\vert g_{k}^{T}d_{k - 1} \bigr\vert \\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert } \bigl\vert g_{k}^{T}d_{k - 1} \bigr\vert \\ &= \Vert g_{k} \Vert ^{2} \biggl( - 1 + \frac{1}{m} \biggr). \end{aligned}$$
Hence the sufficient descent condition holds with \(c = 1 - \frac{1}{m}\), provided \(m > 1\). □
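The bound \(g_{k}^{T}d_{k} \le \Vert g_{k} \Vert ^{2}( - 1 + \frac{1}{m})\) can be sanity-checked on random data. The sketch below assumes the direction update \(d_{k} = - g_{k} + \beta _{k}d_{k - 1}\) and the formula for \(\beta _{k}^{A1}\) as it appears in the derivation above. The values of \(m\) and \(\mu _{k}\), the dimension, and the random vectors are illustrative assumptions, not the choices made in (2.1). Samples with \(\Vert g_{k} \Vert ^{2} \le \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert \) are skipped, since the derivation relies on a nonnegative numerator.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def beta_a1(g, g_prev, d_prev, mu, m):
    """beta_k^{A1} as written in the sufficient descent derivation."""
    num = dot(g, g) - mu * abs(dot(g, g_prev))
    den = m * abs(dot(g, d_prev)) + dot(g_prev, g_prev)
    return num / den


def descent_gap(g, g_prev, d_prev, mu, m):
    """Return g_k^T d_k - (-1 + 1/m) ||g_k||^2, which should be <= 0."""
    beta = beta_a1(g, g_prev, d_prev, mu, m)
    d = [-gi + beta * di for gi, di in zip(g, d_prev)]   # d_k = -g_k + beta_k d_{k-1}
    return dot(g, d) - (-1.0 + 1.0 / m) * dot(g, g)


random.seed(0)
n, m, mu = 5, 2.0, 0.5            # illustrative values only
worst = float("-inf")
for _ in range(10000):
    g = [random.gauss(0, 1) for _ in range(n)]
    g_prev = [random.gauss(0, 1) for _ in range(n)]
    d_prev = [random.gauss(0, 1) for _ in range(n)]
    if dot(g, g) <= mu * abs(dot(g, g_prev)):
        continue                  # numerator not positive: outside the derivation
    worst = max(worst, descent_gap(g, g_prev, d_prev, mu, m))
print(worst <= 0.0)
```

A random test of this kind cannot replace the proof; it only guards against sign or indexing slips when the formula is implemented.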
Theorem 3.3
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then \(\beta _{k}^{A1}\) satisfies Property*.
Proof
Let \(\lambda = \frac{\gamma ^{2}}{2L(L + 1)\bar{\gamma } b}\) and
$$\beta _{k}^{A1} = \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ^{2} + \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ( \Vert g_{k} \Vert + \Vert g_{k - 1} \Vert )}{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{2\bar{\gamma }^{2}}{\gamma ^{2}} = b > 1. $$
To show that \(\vert \beta _{k}^{A1} \vert \le \frac{1}{2b}\) whenever \(\Vert x_{k} - x_{k - 1} \Vert \le \lambda \), we consider the following two cases:
Case 1:
\(\mu _{k} > 1\)
$$\begin{aligned} \beta _{k}^{A1} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \le \frac{ \Vert g_{k} \Vert ^{2} - \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \\ &\le \frac{ \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}} \le \frac{L\lambda \bar{\gamma }}{\gamma ^{2}}. \end{aligned}$$
Case 2:
\(\mu _{k} < 1\)
To satisfy Property* for \(\beta _{k}^{A1}\) with \(\mu _{k} < 1\), we need the following inequality:
$$ \Vert w_{k} \Vert + \Vert v_{k} \Vert \le L \Vert w_{k} + v_{k} \Vert , $$
(3.3)
where \(w_{k} = g_{k} - \frac{1}{L}g_{k - 1}\), and \(v_{k} = \frac{1}{L}g_{k} - g_{k - 1}\), which yields
$$\begin{aligned} \bigl\vert \beta _{k}^{A1} \bigr\vert &\le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + \Vert g_{k - 1} \Vert ^{2}} \biggr\vert \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \frac{1}{L} \vert g_{k}^{T}g_{k - 1} \vert }{ \Vert g_{k - 1} \Vert ^{2}} \biggr\vert \\ & \le \frac{ \Vert g_{k} \Vert \Vert g_{k} - \frac{1}{L}g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}}. \end{aligned}$$
Using (3.3), we obtain
$$\begin{aligned}& \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} \biggr\Vert \le L \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} + \frac{1}{L}g_{k} - g_{k - 1} \biggr\Vert \le ({L} + 1) \Vert g_{k} - g_{k - 1} \Vert , \\& \bigl\vert \beta _{k}^{A1} \bigr\vert \le \frac{(L + 1) \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{ \Vert g_{k - 1} \Vert ^{2}}\le L\frac{(L + 1)\lambda \bar{\gamma }}{\gamma ^{2}}. \end{aligned}$$
Thus, in all cases
$$\bigl\vert \beta _{k}^{A1} \bigr\vert \le \frac{L\lambda \bar{\gamma }}{\gamma ^{2}} \le \frac{L(L + 1)\lambda \bar{\gamma }}{\gamma ^{2}} \le \frac{1}{2b}. $$
The proof is completed. □
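Two steps of the proof above lend themselves to a quick numerical check: the uniform bound \(\beta _{k}^{A1} \le b = 2\bar{\gamma }^{2}/\gamma ^{2}\), and the inequality \(\Vert g_{k} \Vert ^{2} - \vert g_{k}^{T}g_{k - 1} \vert \le \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert \) used in Case 1. The sketch below draws random gradients with norms in \([\gamma ,\bar{\gamma }]\). The values of \(\gamma \), \(\bar{\gamma }\), \(m\), and \(\mu _{k}\), and the helper names, are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_vector_with_norm(n, lo, hi):
    """Random direction scaled so that its norm lies in [lo, hi]."""
    v = [random.gauss(0, 1) for _ in range(n)]
    scale = random.uniform(lo, hi) / math.sqrt(dot(v, v))
    return [scale * vi for vi in v]


random.seed(1)
n, m, mu = 4, 2.0, 0.5                    # illustrative values only
gamma, gamma_bar = 0.5, 3.0               # assumed bounds on ||g_k||
b = 2.0 * gamma_bar ** 2 / gamma ** 2     # the constant b from the proof
max_beta = float("-inf")
case1_step_ok = True
for _ in range(10000):
    g = random_vector_with_norm(n, gamma, gamma_bar)
    g_prev = random_vector_with_norm(n, gamma, gamma_bar)
    d_prev = [random.gauss(0, 1) for _ in range(n)]
    num = dot(g, g) - mu * abs(dot(g, g_prev))
    den = m * abs(dot(g, d_prev)) + dot(g_prev, g_prev)
    max_beta = max(max_beta, num / den)
    # Case 1 step: ||g_k||^2 - |g_k^T g_{k-1}| <= ||g_k|| * ||g_k - g_{k-1}||
    diff = [a - c for a, c in zip(g, g_prev)]
    lhs = dot(g, g) - abs(dot(g, g_prev))
    rhs = math.sqrt(dot(g, g)) * math.sqrt(dot(diff, diff))
    case1_step_ok = case1_step_ok and lhs <= rhs + 1e-12
print(max_beta <= b, case1_step_ok)
```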
Theorem 3.4
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then \(\lim_{k \to \infty } \Vert g_{k} \Vert = 0\).
Proof
We apply Theorem 3.1. The following properties hold for \(\beta _{k}^{A1}\):
i. \(\beta _{k}^{A1} > 0\).
ii. \(\beta _{k}^{A1}\) satisfies Property* by Theorem 3.3.
iii. \(\beta _{k}^{A1}\) satisfies the sufficient descent condition by Theorem 3.2.
iv. The Zoutendijk condition holds by Lemma 3.1.
v. Assumption 1 holds.
Thus, all conditions of Theorem 3.1 are satisfied, which leads to \(\lim_{k \to \infty } \Vert g_{k} \Vert = 0\). □
The global convergence properties of \(\beta _{k}^{A2}\)
Theorem 3.5
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then the sufficient descent condition holds for \(\beta _{k}^{A2}\).
Proof
Since \(d_{k - 1}^{T}y_{k - 1} \ge 0\), we obtain
$$\begin{aligned} g_{k}^{T}d_{k}& = - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} g_{k}^{T}d_{k - 1}\\ &\le - \Vert g_{k} \Vert ^{2} + \frac{ \Vert g_{k} \Vert ^{2}}{m \vert g_{k}^{T}d_{k - 1} \vert } \bigl\vert g_{k}^{T}d_{k - 1} \bigr\vert \\ &= \Vert g_{k} \Vert ^{2} \biggl( - 1 + \frac{1}{m} \biggr). \end{aligned}$$
Hence the sufficient descent condition holds with \(c = 1 - \frac{1}{m}\), provided \(m > 1\). □
Theorem 3.6
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then \(\beta _{k}^{A2}\) satisfies Property*.
Proof
Let \(\lambda = \frac{(1 - \sigma )c\gamma ^{2}}{2L(L + 1)\bar{\gamma } b}\) and
$$\begin{aligned} \beta _{k}^{A2} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \le \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}}\\ & \le \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}}\le \frac{ \Vert g_{k} \Vert ^{2} + \vert g_{k}^{T}g_{k - 1} \vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}}\\ & \le \frac{ \Vert g_{k} \Vert ( \Vert g_{k} \Vert + \Vert g_{k - 1} \Vert )}{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le \frac{2\bar{\gamma }^{2}}{(1 - \sigma )c\gamma ^{2}} = b > 1. \end{aligned}$$
To show that \(\vert \beta _{k}^{A2} \vert \le \frac{1}{2b}\) whenever \(\Vert x_{k} - x_{k - 1} \Vert \le \lambda \), we consider the following two cases:
Case 1:
\(\mu _{k} > 1\)
$$\begin{aligned} \beta _{k}^{A2} &= \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \le \frac{ \Vert g_{k} \Vert ^{2} - \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}} \\ &\le \frac{ \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le \frac{L\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}}. \end{aligned}$$
Case 2:
\(\mu _{k} < 1\)
To satisfy Property* for \(\beta _{k}^{A2}\) with \(\mu _{k} < 1\), we need inequality (3.3), which gives
$$\begin{aligned} \bigl\vert \beta _{k}^{A2} \bigr\vert & \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert }{m \vert g_{k}^{T}d_{k - 1} \vert + d_{k - 1}^{T}y_{k - 1}} \biggr\vert \\ & \le \biggl\vert \frac{ \Vert g_{k} \Vert ^{2} - \frac{1}{L} \vert g_{k}^{T}g_{k - 1} \vert }{d_{k - 1}^{T}y_{k - 1}} \biggr\vert \\ &\le \biggl\vert \frac{ \Vert g_{k} \Vert \Vert g_{k} - \frac{1}{L}g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \biggr\vert . \end{aligned}$$
Using (3.3), we obtain
$$\begin{aligned}& \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} \biggr\Vert \le L \biggl\Vert g_{k} - \frac{1}{L}g_{k - 1} + \frac{1}{L}g_{k} - g_{k - 1} \biggr\Vert \le ({L} + 1) \Vert g_{k} - g_{k - 1} \Vert , \\& \bigl\vert \beta _{k}^{A2} \bigr\vert \le \frac{(L + 1) \Vert g_{k} \Vert \Vert g_{k} - g_{k - 1} \Vert }{(1 - \sigma )c \Vert g_{k - 1} \Vert ^{2}} \le L\frac{(L + 1)\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}}. \end{aligned}$$
Thus, in all cases
$$ \bigl\vert \beta _{k}^{A2} \bigr\vert \le \frac{L\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}} \le L\frac{(L + 1)\lambda \bar{\gamma }}{(1 - \sigma )c\gamma ^{2}} \le \frac{1}{2b}. $$
□
Theorem 3.7
Suppose that Assumption 1 holds. Consider the CG method of the form (1.2), (1.3), and (2.3), i.e., with \(\beta _{k}^{A2}\), where \(\alpha _{k}\) is computed by (1.5) and (1.6). Then \(\lim_{k \to \infty } \Vert g_{k} \Vert = 0\).
Proof
We will apply Theorem 3.1. Note that the following properties hold for \(\beta _{k}^{A2}\):
i. \(\beta _{k}^{A2} > 0\).
ii. \(\beta _{k}^{A2}\) satisfies Property* by Theorem 3.6.
iii. \(\beta _{k}^{A2}\) satisfies the sufficient descent condition by Theorem 3.5.
iv. The Zoutendijk condition holds by Lemma 3.1.
v. Assumption 1 holds.
Thus, all conditions of Theorem 3.1 are satisfied, which leads to \(\lim_{k \to \infty } \Vert g_{k} \Vert = 0\). □
If the condition \(\Vert g_{k} \Vert ^{2} > \mu _{k} \vert g_{k}^{T}g_{k - 1} \vert \) required by \(\beta _{k}^{A1}\) and \(\beta _{k}^{A2}\) does not hold, then the CG method is restarted using \(\beta _{k}^{D - H} = - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}}\).
The following two theorems show that the CG method with \(\beta _{k}^{D - H}\) has the descent and convergence properties.
Theorem 3.8
Let the sequences \(\{ x_{k}\}\) and \(\{ d_{k}\}\) be generated by (1.2) and (1.3), where \(\alpha _{k}\) is computed by the SWP line search (1.5) and (1.7). Then the sufficient descent condition holds for \(\{ d_{k}\}\) with \(\beta _{k}^{D - H}\).
Proof
Multiplying (1.3) by \(g_{k}^{T}\), substituting \(\beta _{k}^{D - H}\), and using \(s_{k - 1} = \alpha _{k - 1}d_{k - 1}\), we obtain
$$\begin{aligned} g_{k}^{T}d_{k} &= - \Vert g_{k} \Vert ^{2} - \mu _{k}\frac{g_{k}^{T}s_{k - 1}}{d_{k - 1}^{T}y_{k - 1}} g_{k}^{T}d_{k - 1} \\ &= - \Vert g_{k} \Vert ^{2} - \mu _{k}\alpha _{k - 1}\frac{ (g_{k}^{T}d_{k - 1})^{2}}{d_{k - 1}^{T}y_{k - 1}} \le - \Vert g_{k} \Vert ^{2}. \end{aligned}$$
Letting \(c = 1\), we then obtain
$$ g_{k}^{T}d_{k} \le - c \Vert g_{k} \Vert ^{2}, $$
which completes the proof. □
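The descent inequality for the restart direction can also be checked numerically. The sketch below assumes \(d_{k} = - g_{k} + \beta _{k}^{D - H}d_{k - 1}\), \(s_{k - 1} = \alpha _{k - 1}d_{k - 1}\), and \(y_{k - 1} = g_{k} - g_{k - 1}\) (the usual definitions, which are not restated in this section), together with \(\mu _{k} \ge 0\) and a positive step size. Only samples with \(d_{k - 1}^{T}y_{k - 1} > 0\), the curvature condition that a Wolfe line search guarantees, are kept. All numerical values and helper names are illustrative.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dh_direction(g, g_prev, d_prev, alpha_prev, mu):
    """d_k = -g_k + beta_k^{D-H} d_{k-1} with s_{k-1} = alpha_{k-1} d_{k-1}."""
    s_prev = [alpha_prev * di for di in d_prev]
    y_prev = [a - c for a, c in zip(g, g_prev)]          # y_{k-1} = g_k - g_{k-1}
    beta = -mu * dot(g, s_prev) / dot(d_prev, y_prev)
    return [-gi + beta * di for gi, di in zip(g, d_prev)]


random.seed(2)
n, mu = 5, 0.7                     # illustrative values; mu >= 0 is assumed
checked, violations = 0, 0
while checked < 5000:
    g = [random.gauss(0, 1) for _ in range(n)]
    g_prev = [random.gauss(0, 1) for _ in range(n)]
    d_prev = [random.gauss(0, 1) for _ in range(n)]
    alpha_prev = random.uniform(0.01, 2.0)
    y_prev = [a - c for a, c in zip(g, g_prev)]
    if dot(d_prev, y_prev) <= 1e-6:
        continue                   # curvature condition fails: discard the sample
    d = dh_direction(g, g_prev, d_prev, alpha_prev, mu)
    # sufficient descent with c = 1, up to a small rounding tolerance
    if dot(g, d) > -dot(g, g) + 1e-9 * (1.0 + dot(g, g)):
        violations += 1
    checked += 1
print(violations)
```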
Theorem 3.9
Suppose that Assumption 1 holds. Consider the CG method (1.2) and (1.3) with \(\beta _{k}^{D - H}\), where \(d_{k}\) is a descent direction and \(\alpha _{k}\) is obtained by the SWP line search. Then, \(\liminf_{k \to \infty } \Vert g_{k} \Vert = 0\).
Proof
We prove this theorem by contradiction. Suppose the conclusion of Theorem 3.9 is not true. Then, a constant \(\varepsilon > 0\) exists such that
$$ \Vert g_{k} \Vert \ge \varepsilon , \quad \forall k \ge 1. $$
(3.4)
By squaring both sides of (1.2), we obtain
$$\begin{aligned}& \begin{aligned} \Vert d_{k} \Vert ^{2} &= \Vert g_{k} \Vert ^{2} - 2\beta _{k}g_{k}^{T}d_{k - 1} + \beta _{k}^{2} \Vert d_{k - 1} \Vert ^{2} \\ &\le \Vert g_{k} \Vert ^{2} + 2 \vert \beta _{k} \vert \bigl\vert g_{k}^{T}d_{k - 1} \bigr\vert + \beta _{k}^{2} \Vert d_{k - 1} \Vert ^{2} \\ &\le \Vert g_{k} \Vert ^{2} + \frac{2}{L} \frac{ \Vert g_{k} \Vert \Vert s_{k - 1} \Vert }{(1 - \sigma ) \vert g_{k - 1}^{T}d_{k - 1} \vert } \sigma \bigl\vert g_{k - 1}^{T}d_{k - 1} \bigr\vert + \frac{1}{L^{2}}\frac{ ( \sigma g_{k - 1}^{T}d_{k - 1} )^{2} \Vert s_{k - 1} \Vert ^{2}}{ ( (1 - \sigma )g_{k - 1}^{T}d_{k - 1} )^{2}}\\ & \le \Vert g_{k} \Vert ^{2} + \frac{2}{L} \frac{ \Vert g_{k} \Vert \Vert s_{k - 1} \Vert }{(1 - \sigma )}\sigma + \frac{1}{L^{2}}\frac{ \sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2}}, \end{aligned} \\& \begin{aligned} \frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}}& \le \frac{ \Vert g_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} + \frac{2}{L}\frac{ \Vert g_{k} \Vert \Vert s_{k - 1} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{4}}\sigma + \frac{1}{L^{2}} \frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}} \\ &\le \frac{1}{ \Vert g_{k} \Vert ^{2}} + \frac{2}{L}\frac{ \Vert g_{k} \Vert \Vert s_{k - 1} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{4}}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}} \\ &\le \frac{1}{ \Vert g_{k} \Vert ^{2}} + \frac{2}{L}\frac{ \Vert s_{k - 1} \Vert }{(1 - \sigma ) \Vert g_{k} \Vert ^{3}}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2} \Vert s_{k - 1} \Vert ^{2}}{(1 - \sigma )^{2} \Vert g_{k} \Vert ^{4}}. \end{aligned} \end{aligned}$$
Let
$$ \Vert g_{k} \Vert ^{q} = \min \bigl\{ \Vert g_{k} \Vert ^{2}, \Vert g_{k} \Vert ^{3}, \Vert g_{k} \Vert ^{4} \bigr\} , \quad q \in \{2, 3, 4\}, $$
then, bounding the step norms by \(\lambda \),
$$ \frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} \le \frac{1}{ \Vert g_{k} \Vert ^{q}} \biggl( 1 + \frac{2}{L}\frac{\lambda }{(1 - \sigma )}\sigma + \frac{1}{L^{2}} \frac{\sigma ^{2}\lambda ^{2}}{(1 - \sigma )^{2}} \biggr). $$
Also, let
$$ R = \biggl( 1 + \frac{2}{L}\frac{\lambda }{(1 - \sigma )}\sigma + \frac{1}{L^{2}}\frac{\sigma ^{2}\lambda ^{2}}{(1 - \sigma )^{2}} \biggr), $$
then
$$\begin{aligned}& \frac{ \Vert d_{k} \Vert ^{2}}{ \Vert g_{k} \Vert ^{4}} \le \frac{R}{ \Vert g_{k} \Vert ^{q}} \le R\sum _{i = 1}^{k} \frac{1}{ \Vert g_{i} \Vert ^{q}}, \\& \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} \ge \frac{\varepsilon ^{q}}{kR}. \end{aligned}$$
Therefore,
$$ \sum_{k = 0}^{\infty } \frac{ \Vert g_{k} \Vert ^{4}}{ \Vert d_{k} \Vert ^{2}} = \infty . $$
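This contradicts (3.2). Hence (3.4) cannot hold, and \(\liminf_{k \to \infty } \Vert g_{k} \Vert = 0\). □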