The PRP algorithm with the modified WWP line search for nonconvex functions is listed as follows.
Algorithm 1
(The PRP CG algorithm under the YWL line search rule)
- Step 1: Choose an initial point \(x_{1} \in \Re ^{n}\), \(\varepsilon \in (0,1)\), \(\delta \in (0,\frac{1}{2})\), \(\delta _{1}\in (0,\delta )\), \(\sigma \in (\delta ,1)\). Set \(d_{1}=-g_{1}=-\nabla f(x_{1})\), \(k:=1\).
- Step 2: If \(\|g_{k}\| \leq \varepsilon \), stop.
- Step 3: Compute the step size \(\alpha _{k}\) using the YWL line search rule (1.6) and (1.7).
- Step 4: Let \(x_{k+1}=x_{k}+\alpha _{k}d_{k}\).
- Step 5: If \(\|g_{k+1}\|\leq \varepsilon \), stop.
- Step 6: Calculate the search direction
$$ d_{k+1}=-g_{k+1}+\beta _{k}^{\mathrm{PRP}} d_{k}. $$
(2.1)
- Step 7: Set \(k:=k+1\), and go to Step 3.
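The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that (1.6) and (1.7) have the forms used in the proofs of this section, and that \(\beta _{k}^{\mathrm{PRP}}=g_{k+1}^{T}(g_{k+1}-g_{k})/\|g_{k}\|^{2}\) (the standard PRP formula). The helper names `ywl_step` and `prp_cg`, the bracketing strategy, the parameter values, the iteration caps, and the reset to \(-g_{k+1}\) when the PRP direction is not a descent direction are hypothetical choices made only for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ywl_step(f, grad, x, d, delta=0.2, delta1=0.1, sigma=0.4, max_trials=60):
    """Bracketing search for a step satisfying (1.6) and (1.7) (assumed forms)."""
    fx = f(x)
    gd = dot(grad(x), d)   # g_k^T d_k, negative for a descent direction
    dd = dot(d, d)         # ||d_k||^2
    lo, hi, alpha = 0.0, math.inf, 1.0
    for _ in range(max_trials):
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        # (1.6): modified sufficient decrease condition.
        decrease_ok = f(x_new) <= (fx + delta * alpha * gd
                                   + alpha * min(-delta1 * gd, 0.5 * delta * alpha * dd))
        # (1.7): modified curvature condition.
        curvature_ok = dot(grad(x_new), d) >= sigma * gd + min(-delta1 * gd, delta * alpha * dd)
        if not decrease_ok:
            hi = alpha     # step too long: shrink the bracket from above
        elif not curvature_ok:
            lo = alpha     # step too short: raise the lower end
        else:
            return alpha
        alpha = 2.0 * lo if math.isinf(hi) else 0.5 * (lo + hi)
    return alpha           # no acceptable step found within max_trials; last trial returned

def prp_cg(f, grad, x1, eps=1e-8, max_iter=5000):
    x = list(x1)
    g = grad(x)
    d = [-gi for gi in g]                                   # Step 1
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:                     # Steps 2 and 5
            break
        alpha = ywl_step(f, grad, x, d)                     # Step 3
        x = [xi + alpha * di for xi, di in zip(x, d)]       # Step 4
        g_new = grad(x)
        y = [a - b for a, b in zip(g_new, g)]
        beta = dot(g_new, y) / dot(g, g)                    # assumed PRP formula
        d = [-gn + beta * di for gn, di in zip(g_new, d)]   # Step 6, direction (2.1)
        if dot(g_new, d) >= 0.0:
            # Safeguard added for this sketch only; it is not part of Algorithm 1.
            d = [-gn for gn in g_new]
        g = g_new                                           # Step 7
    return x

# Illustrative test problem (an assumption): a convex quadratic with minimizer (1, -2).
def quad_f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def quad_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

x_star = prp_cg(quad_f, quad_grad, [0.0, 0.0])
```

On this quadratic the first trial steps are 1, 0.5, 0.25, 0.125, and 0.0625, and the last one is the first to satisfy both conditions. The descent safeguard is triggered at the next direction update, which is why it is included in the sketch.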
The following standard assumptions on nonconvex functions are needed.
Assumption i
- (A) The level set \(L_{0}=\{x\mid f(x) \leq f(x_{0})\}\) is bounded.
- (B) The function \(f(x)\) is twice continuously differentiable and bounded below, and its gradient \(g(x)\) is Lipschitz continuous; namely, there exists a constant \(L>0\) satisfying
$$ \bigl\Vert g(x)-g(y) \bigr\Vert \leq L \Vert x-y \Vert ,\quad x, y \in \Re ^{n}. $$
(2.2)
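As an illustration of the Lipschitz condition (2.2), the following Python sketch samples pairs of points and records the largest observed ratio \(\|g(x)-g(y)\|/\|x-y\|\), which is a lower bound on any valid \(L\). The quadratic test function, the sampling box, and the helper name `max_gradient_ratio` are assumptions chosen for this example. For this quadratic the Hessian is \(\operatorname{diag}(2,20)\), so (2.2) holds with \(L=20\).

```python
import math
import random

def grad(x):
    # Gradient of the illustrative quadratic f(x) = (x0 - 1)^2 + 10 (x1 + 2)^2.
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def max_gradient_ratio(grad, dim, samples=1000, seed=0):
    """Largest sampled value of ||g(x) - g(y)|| / ||x - y||."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(samples):
        x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        y = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        gap = norm([a - b for a, b in zip(x, y)])
        if gap == 0.0:
            continue  # skip coincident points
        diff = norm([a - b for a, b in zip(grad(x), grad(y))])
        best = max(best, diff / gap)
    return best

ratio = max_gradient_ratio(grad, dim=2)
```

The sampled ratio never exceeds 20 (up to rounding), in agreement with (2.2) for this example.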
Remark
- (1) Define Case i as \(\min [-\delta _{1}g(x_{k})^{T}d_{k},\delta \alpha _{k}\|d_{k}\|^{2}]=\delta \alpha _{k}\|d_{k}\|^{2}\). In this case,
$$ -\delta _{1}g(x_{k})^{T}d_{k} \geq \delta \alpha _{k} \Vert d_{k} \Vert ^{2}\geq 0, $$
which ensures that the modified WWP line search (1.6) and (1.7) is reasonable (see Theorem 2.1 in [28]). Hence Algorithm 1 is well defined.
- (2) In [28], the global convergence of Algorithm 1 is established for Case i, but the proof requires not only Assumption i but also
and
$$ g_{k+1}^{T}d_{k}\leq -\sigma _{1}g_{k}^{T}d_{k}. $$
In this paper, we give an alternative proof that needs only Assumption i.
- (3) Assumptions i(A) and i(B) imply that there exists a constant \(G^{*}>0\) such that
$$ \bigl\Vert g(x) \bigr\Vert \leq G^{*},\quad x \in L_{0}. $$
(2.3)
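Case i of Remark (1) can be tested numerically for a given iterate. The Python sketch below is a small illustration; the function name `is_case_i`, the parameter values, and the sample vectors are assumptions made for this example.

```python
def is_case_i(g, d, alpha, delta=0.2, delta1=0.1):
    """Return True when -delta1 * g^T d >= delta * alpha * ||d||^2 (Case i)."""
    gd = sum(a * b for a, b in zip(g, d))   # g_k^T d_k
    dd = sum(a * a for a in d)              # ||d_k||^2
    return -delta1 * gd >= delta * alpha * dd

# Steepest descent direction d = -g, so g^T d = -||g||^2 = -1604.
g = [-2.0, 40.0]
d = [2.0, -40.0]

small_step = is_case_i(g, d, alpha=0.0625)  # 160.4 >= 20.05, so Case i holds
large_step = is_case_i(g, d, alpha=1.0)     # 160.4 >= 320.8 fails, so Case i does not hold
```

With \(d_{k}=-g_{k}\) the test reduces to \(\alpha _{k}\leq \delta _{1}/\delta \), which is why the short step passes and the unit step does not.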
Lemma 2.1
Let Assumption i hold. If there exists a positive constant \(\epsilon _{*}\) such that
$$ \Vert g_{k} \Vert \geq \epsilon _{*},\quad \forall k, $$
(2.4)
then there exists a constant \(\omega ^{*}>0\) satisfying
$$ \Vert d_{k} \Vert \leq \omega ^{*},\quad \forall k. $$
(2.5)
Proof
By (1.6), we get
$$\begin{aligned} f(x_{k}+\alpha _{k}d_{k}) \le & f_{k}+\delta \alpha _{k}g_{k}^{T}d_{k}+ \alpha _{k}\min \biggl[-\delta _{1}g_{k}^{T}d_{k}, \delta \frac{\alpha _{k}}{2} \Vert d_{k} \Vert ^{2} \biggr] \\ \leq & f_{k}+\delta \alpha _{k}g_{k}^{T}d_{k}- \alpha _{k}\delta _{1}g _{k}^{T}d_{k} \\ =&f_{k}+(\delta -\delta _{1})\alpha _{k}g_{k}^{T}d_{k}, \end{aligned}$$
then the following inequality
$$ -(\delta -\delta _{1})\alpha _{k} g_{k}^{T}d_{k} \leq f(x_{k})-f(x_{k+1}) $$
holds. Using Assumption i(A) and summing these inequalities from \(k=0\) to \(\infty \), we have
$$ \sum_{k=0}^{\infty } \bigl[-(\delta -\delta _{1})\alpha _{k}g_{k}^{T}d_{k}\bigr] < \infty . $$
(2.6)
Using Step 6 of Algorithm 1 and setting \(s_{k}=x_{k+1}-x_{k}=\alpha _{k}d_{k}\), we have
$$\begin{aligned} \Vert d_{k+1} \Vert \leq & \Vert g_{k+1} \Vert + \bigl\vert \beta _{k}^{\mathrm{PRP}} \bigr\vert \Vert d_{k} \Vert \\ \leq & \Vert g_{k+1} \Vert +\frac{ \Vert g_{k+1} \Vert \Vert g_{k+1}-g_{k} \Vert }{ \Vert g_{k} \Vert } \Vert d_{k} \Vert \\ \leq &G^{*}+\frac{G^{*}L}{ \Vert g_{k} \Vert } \Vert s_{k} \Vert \Vert d_{k} \Vert \\ \leq &G^{*}+\frac{G^{*}L}{\epsilon _{*}} \Vert s_{k} \Vert \Vert d_{k} \Vert , \end{aligned}$$
(2.7)
where the third inequality follows from (2.2) and (2.3), and the last inequality follows from (2.4). By the definition of Case i, we get
$$ d_{k}^{T}g_{k}\leq -\frac{\delta }{\delta _{1}} \alpha _{k} \Vert d_{k} \Vert ^{2}. $$
Thus, by (2.6), we get
$$ \sum_{k=0}^{\infty } \Vert s_{k} \Vert ^{2}=\sum_{k=0}^{\infty } \alpha _{k} \bigl( \alpha _{k} \Vert d_{k} \Vert ^{2}\bigr)\leq \frac{\delta _{1}}{\delta (\delta -\delta _{1})} \Biggl[(\delta -\delta _{1})\sum_{k=0}^{\infty } \bigl(-\alpha _{k} g_{k} ^{T}d_{k} \bigr)\Biggr]< \infty . $$
Then we have
$$ \Vert s_{k} \Vert \rightarrow 0,\quad k \rightarrow \infty . $$
This implies that there exist a constant \(\varepsilon \in (0,1)\) and an integer \(k_{0}\geq 0\) satisfying
$$ \frac{G^{*}L \Vert s_{k} \Vert }{\epsilon _{*}}\leq \varepsilon ,\quad \forall k\geq k_{0}. $$
(2.8)
So, by (2.7), for all \(k\geq k_{0}\), we obtain
$$\begin{aligned} \Vert d_{k+1} \Vert \leq &G^{*}+\varepsilon \Vert d_{k} \Vert \\ \leq & G^{*}\bigl(1+\varepsilon +\varepsilon ^{2}+ \cdots + \varepsilon ^{k-k_{0}}\bigr)+\varepsilon ^{k-k_{0}+1} \Vert d_{k_{0}} \Vert \\ \leq & \frac{G^{*}}{1-\varepsilon }+ \Vert d_{k_{0}} \Vert . \end{aligned}$$
Let \(\omega ^{*}=\max \{\|d_{1}\|,\|d_{2}\|,\ldots ,\|d_{k_{0}}\|,\frac{G^{*}}{1-\varepsilon }+\|d_{k_{0}}\|\}\). Therefore, we get
$$ \Vert d_{k} \Vert \leq \omega ^{*},\quad \forall k \geq 0. $$
The proof is complete. □
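The geometric-series argument at the end of the proof can be checked numerically. The Python sketch below iterates the worst case \(t_{j+1}=G^{*}+\varepsilon t_{j}\) of the recursion \(\|d_{k+1}\|\leq G^{*}+\varepsilon \|d_{k}\|\) and compares every iterate with the bound \(G^{*}/(1-\varepsilon )+\|d_{k_{0}}\|\). The function name `worst_case_norms` and the numerical values of \(G^{*}\), \(\varepsilon \), and \(\|d_{k_{0}}\|\) are assumptions chosen for illustration.

```python
def worst_case_norms(G, eps, d0, steps):
    """Iterate t_{j+1} = G + eps * t_j starting from t_0 = d0 and return all values."""
    values = [d0]
    for _ in range(steps):
        values.append(G + eps * values[-1])
    return values

G, eps, d0 = 5.0, 0.5, 100.0
values = worst_case_norms(G, eps, d0, steps=50)
cap = G / (1.0 - eps) + d0      # the bound G*/(1 - eps) + ||d_{k0}||

# Starting from zero, the iterates increase toward G / (1 - eps) without exceeding it.
values_from_zero = worst_case_norms(G, eps, 0.0, steps=50)
```

In both runs the sequence stays below the cap, and it approaches the fixed point \(G^{*}/(1-\varepsilon )\) of the recursion.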
Theorem 2.1
Let the conditions of the above lemma hold. Then the following relation
$$ \liminf_{k\rightarrow \infty } \Vert g_{k} \Vert =0 $$
(2.9)
holds.
Proof
Suppose that (2.9) does not hold. Then there exists a constant \(\epsilon _{*}>0\) such that
$$ \Vert g_{k} \Vert \geq \epsilon _{*},\quad \forall k. $$
Using Lemma 2.1, we get (2.5). In a way similar to (2.6), and using the case \(-\delta _{1}g(x_{k})^{T}d_{k} \geq \delta \alpha _{k}\|d_{k}\|^{2}\), we have
$$\begin{aligned} \frac{\delta -\delta _{1}}{\delta _{1}} \delta \Vert \alpha _{k} d_{k} \Vert ^{2} \leq & - \frac{\delta -\delta _{1}}{\delta _{1}} \delta _{1}\alpha _{k}d _{k}^{T}g_{k} \\ =& -(\delta -\delta _{1})\alpha _{k}g_{k}^{T}d_{k} \\ \rightarrow & 0,\quad k\rightarrow \infty , \end{aligned}$$
which generates
$$ \Vert \alpha _{k} d_{k} \Vert ^{2}\rightarrow 0,\quad k\rightarrow \infty . $$
(2.10)
Then we discuss the above relation by the following cases.
Case 1: \(\|d_{k}\| \rightarrow 0\), \(k\rightarrow \infty \). By (2.1), (2.3), (2.2), and (2.10), we have
$$\begin{aligned} 0 \leq & \Vert g_{k+1} \Vert \\ =& \bigl\Vert -d_{k+1}+\beta _{k}^{\mathrm{PRP}}d_{k} \bigr\Vert \\ \leq & \Vert d_{k+1} \Vert +\frac{ \Vert g_{k+1} \Vert \Vert g_{k+1}-g_{k} \Vert }{ \Vert g_{k} \Vert } \Vert d _{k} \Vert \\ \leq & \Vert d_{k+1} \Vert +\frac{G^{*} L \Vert \alpha _{k}d_{k} \Vert }{\epsilon _{*}} \Vert d_{k} \Vert \\ \rightarrow & 0,\quad k\rightarrow \infty . \end{aligned}$$
This contradicts (2.4), so we get (2.9).
Case 2: \(\alpha _{k} \rightarrow 0\), \(k\rightarrow \infty \). By (1.7), Remark (1), and the Taylor formula, we get
$$\begin{aligned} g_{k}^{T}d_{k}+O\bigl( \Vert \alpha _{k}d_{k} \Vert ^{2}\bigr) =&g(x_{k}+\alpha _{k}d_{k})^{T}d _{k} \\ \geq & \sigma d_{k}^{T}g_{k}+\min \bigl[- \delta _{1}g_{k}^{T}d_{k}, \delta \alpha _{k} \Vert d_{k} \Vert ^{2}\bigr] \\ \geq & \sigma d_{k}^{T}g_{k}. \end{aligned}$$
Combining with the case \(-\delta _{1}g(x_{k})^{T}d_{k}\geq \delta \alpha _{k}\|d_{k}\|^{2}\) leads to
$$\begin{aligned} O\bigl( \Vert \alpha _{k}d_{k} \Vert ^{2}\bigr) \geq & -(1-\sigma )d_{k}^{T}g_{k} \\ \geq & \frac{\delta (1-\sigma )}{\delta _{1}}\alpha _{k} \Vert d_{k} \Vert ^{2}. \end{aligned}$$
So we have
$$ O(\alpha _{k}) \geq \frac{\delta (1-\sigma )}{\delta _{1}}. $$
This contradicts the case \(\alpha _{k} \rightarrow 0\) (\(k\rightarrow \infty \)). Then we also obtain (2.9). In both cases, (2.9) holds. The proof is complete. □