Let Ω be the level set with

$$ \Omega= \bigl\{ x \mid \bigl\Vert e(x) \bigr\Vert \le \bigl\Vert e(x_{0}) \bigr\Vert \bigr\} . $$

(3.1)

As in [31, 32, 50], the following assumptions are needed to prove the global convergence of Algorithm 1.

### Assumption A

(i) *e* is continuously differentiable on an open convex set \(\Omega_{1}\) containing Ω.

(ii) The Jacobian of *e* is symmetric, bounded, and positive definite on \(\Omega_{1}\), *i.e.*, there exist positive constants \(M^{*}\geq m_{*}>0\) such that

$$ \bigl\Vert \nabla e(x) \bigr\Vert \le M^{*} \quad \forall x \in \Omega_{1} $$

(3.2)

and

$$ m_{*}\Vert d\Vert ^{2}\le d^{T}\nabla e(x)d\quad \forall x \in\Omega_{1},d\in \Re^{n}. $$

(3.3)

### Assumption B

\(B_{k}\) is a good approximation to \(\nabla e_{k}\), *i.e.*,

$$ \bigl\Vert (\nabla e_{k}-B_{k})d_{k} \bigr\Vert \leq\epsilon_{*}\Vert e_{k}\Vert , $$

(3.4)

where \(\epsilon_{*}\in(0,1)\) is a small positive constant.

Considering Assumption B and using the von Neumann lemma, we deduce that \(B_{k}\) is also bounded (see [31]).

### Lemma 3.1

*Let Assumption* B *hold*. *Then* \(d_{k}\) *is a descent direction of* \(p(x)\) *at* \(x_{k}\), *i.e.*,

$$ \nabla p(x_{k})^{T}d_{k}\leq-(1- \epsilon_{*}) \bigl\Vert e(x_{k}) \bigr\Vert ^{2}. $$

(3.5)

### Proof

By using (1.10), we get

$$\begin{aligned} \nabla p(x_{k})^{T}d_{k} =& e(x_{k})^{T} \nabla e(x_{k}) d_{k} \\ =&e(x_{k})^{T} \bigl[ \bigl(\nabla e(x_{k})-B_{k} \bigr)d_{k}-e(x_{k}) \bigr] \\ =&e(x_{k})^{T} \bigl(\nabla e(x_{k})-B_{k} \bigr)d_{k} - e(x_{k})^{T}e(x_{k}). \end{aligned}$$

(3.6)

Thus, we have

$$\begin{aligned} \nabla p(x_{k})^{T}d_{k}+ \bigl\Vert e(x_{k}) \bigr\Vert ^{2} =&e(x_{k})^{T} \bigl(\nabla e(x_{k})-B_{k} \bigr)d_{k} \\ \leq& \bigl\Vert e(x_{k}) \bigr\Vert \bigl\Vert \bigl(\nabla e(x_{k})-B_{k} \bigr)d_{k} \bigr\Vert . \end{aligned}$$

It follows from (3.4) that

$$\begin{aligned} \nabla p(x_{k})^{T}d_{k} \leq& \bigl\Vert e(x_{k}) \bigr\Vert \bigl\Vert \bigl(\nabla e(x_{k})-B_{k} \bigr)d_{k} \bigr\Vert - \bigl\Vert e(x_{k}) \bigr\Vert ^{2} \\ \leq& -(1-\epsilon_{*}) \bigl\Vert e(x_{k}) \bigr\Vert ^{2}. \end{aligned}$$

(3.7)

The proof is complete. □
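The descent inequality (3.5) is easy to verify numerically. The sketch below uses purely illustrative assumptions (a linear monotone system \(e(x)=Ax\) with \(A\) symmetric positive definite, and a hypothetical \(B_{k}\) obtained by perturbing the Jacobian); it computes the \(\epsilon_{*}\) realized by that \(B_{k}\) and checks (3.5):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy monotone system: e(x) = A x with A symmetric positive definite,
# so p(x) = (1/2)||e(x)||^2 and grad p(x_k)^T d_k = e_k^T A d_k.
n = 5
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)          # symmetric positive definite Jacobian

x_k = rng.standard_normal(n)
e_k = A @ x_k

# B_k: a slightly perturbed Jacobian playing the role of Assumption B.
B_k = A + 1e-3 * np.eye(n)
d_k = np.linalg.solve(B_k, -e_k)     # direction from B_k d_k = -e_k

# Effective epsilon_* realized by this B_k (must be < 1 for the lemma).
eps_star = np.linalg.norm((A - B_k) @ d_k) / np.linalg.norm(e_k)

# Descent inequality (3.5): grad p(x_k)^T d_k <= -(1 - eps_star) ||e_k||^2.
lhs = e_k @ (A @ d_k)
rhs = -(1 - eps_star) * np.linalg.norm(e_k) ** 2
print(eps_star < 1, lhs <= rhs + 1e-10)
```

Here the perturbation is diagonal, so (3.4) holds with a tiny \(\epsilon_{*}\); any \(B_{k}\) with \(\epsilon_{*}<1\) would serve equally well.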

The following lemma shows that the line search technique (1.13) is reasonable, and hence that Algorithm 1 is well defined.

### Lemma 3.2

*Let Assumptions* A *and* B *hold*. *Then Algorithm* 1 *produces the next iterate* \(x_{k+1}=x_{k}+\alpha_{k}d_{k}\) *in a finite number of backtracking steps*.

### Proof

From Lemma 3.5 in [32], in a finite number of backtracking steps we obtain

$$p(x_{k}+\alpha_{k}d_{k})\leq p(x_{k}) +\alpha_{k} \sigma e(x_{k})^{T}d_{k}, $$

from which, in view of the definition \(p(x_{l(k)})=\max_{0\leq j\leq m(k)}\{p(x_{k-j})\}\geq p(x_{k})\), we obtain (1.13). The proof is complete. □
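As a sketch only, the backtracking loop behind the nonmonotone rule (1.13) can be written as follows; the function name, parameter values, and the toy example are our own illustrative assumptions, not part of Algorithm 1's specification:

```python
import numpy as np

def nonmonotone_backtrack(p, x_k, d_k, e_k, history, sigma=1e-4, r=0.5, max_back=50):
    """Backtracking for a nonmonotone Armijo rule in the spirit of (1.13):
    find alpha with p(x_k + alpha d_k) <= max recent p + sigma*alpha*e_k^T d_k.
    `history` holds the last few values p(x_{k-j})."""
    p_max = max(history)                 # plays the role of p(x_{l(k)})
    slope = e_k @ d_k                    # negative by Lemma 3.1 / (3.10)
    alpha = 1.0
    for _ in range(max_back):
        if p(x_k + alpha * d_k) <= p_max + sigma * alpha * slope:
            return alpha                 # accepted in finitely many steps
        alpha *= r                       # shrink the trial step
    raise RuntimeError("line search failed")

# Tiny demo on e(x) = A x (zero at the origin), with B_k = A exactly.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
p = lambda z: 0.5 * np.linalg.norm(A @ z) ** 2
x = np.array([1.0, -2.0])
e = A @ x
d = np.linalg.solve(A, -e)
alpha = nonmonotone_backtrack(p, x, d, e, history=[p(x)])
print(alpha)    # the full step alpha = 1.0 is accepted here
```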

Now we establish the global convergence theorem of Algorithm 1.

### Theorem 3.1

*Let Assumptions* A *and* B *hold*, *and let* \(\{\alpha_{k}, d_{k}, x_{k+1}, e_{k+1}\}\) *be generated by Algorithm* 1. *Then*

$$ \lim_{k\rightarrow\infty} \Vert e_{k}\Vert =0. $$

(3.8)

### Proof

By the acceptance rule (1.13), we have

$$ p(x_{k+1})-p(x_{l(k)}) \leq\sigma \alpha_{k} e_{k}^{T}d_{k} < 0. $$

(3.9)

Using \(m(k+1)\leq m(k)+1\) and \(p(x_{k+1})\leq p(x_{l(k)})\), we obtain

$$p(x_{l(k+1)})\leq\max \bigl\{ p(x_{l(k)}),p(x_{k+1}) \bigr\} =p(x_{l(k)}). $$

This means that the sequence \(\{p(x_{l(k)})\}\) is nonincreasing in *k*; since it is bounded below by zero, it is convergent. Based on Assumptions A and B, arguing as in Lemma 3.4 of [32], it is not difficult to deduce that there exist constants \(b_{1}\geq b_{2}>0\) such that

$$ b_{2}\Vert d_{k}\Vert ^{2}\leq d_{k}^{T}B_{k}d_{k}=-e_{k}^{T}d_{k} \leq b_{1}\Vert d_{k}\Vert ^{2}. $$

(3.10)

By (1.13) and (3.10), for all \(k>M\), we get

$$\begin{aligned} p(x_{l(k)}) =&p(x_{l(k)-1}+\alpha_{l(k)-1}d_{l(k)-1}) \\ \leq&\max_{0\leq j\leq m(l(k)-1)} \bigl\{ p(x_{l(k)-j-1}) \bigr\} +\sigma \alpha_{l(k)-1}e_{l(k)-1}^{T}d_{l(k)-1} \\ \leq& \max_{0\leq j\leq m(l(k)-1)} \bigl\{ p(x_{l(k)-j-1}) \bigr\} -\sigma b_{2} \alpha_{l(k)-1}\Vert d_{l(k)-1}\Vert ^{2}. \end{aligned}$$

(3.11)

Since \(\{p(x_{l(k)})\}\) is convergent, from the above inequality, we have

$$\lim_{k\rightarrow\infty}\alpha_{l(k)-1}\Vert d_{l(k)-1}\Vert ^{2}=0. $$

This implies that either

$$ \liminf_{k\rightarrow\infty} \Vert d_{l(k)-1}\Vert =0 $$

(3.12)

or

$$ \liminf_{k\rightarrow\infty} \alpha_{l(k)-1}=0. $$

(3.13)

If (3.12) holds, following [40], by induction we can prove that

$$ \lim_{k\rightarrow\infty} \Vert d_{l(k)-j}\Vert =0 $$

(3.14)

and

$$\lim_{k\rightarrow\infty} p(x_{l(k)-j})=\lim_{k\rightarrow\infty} p(x_{l(k)}) $$

for any positive integer *j*. As \(k\geq l(k)\geq k-M\) and *M* is a positive constant, by

$$x_{k}=x_{k-M-1}+\alpha_{k-M-1}d_{k-M-1}+ \cdots+ \alpha_{l(k)-1}d_{l(k)-1} $$

and (3.14), it can be derived that

$$ \lim_{k\rightarrow\infty} p(x_{l(k)})=\lim _{k\rightarrow \infty} p(x_{k}). $$

(3.15)

According to (3.10) and the rule for accepting the step \(\alpha_{k}d_{k}\),

$$ p(x_{k+1})-p(x_{l(k)})\leq\alpha_{k} \sigma e_{k}^{T}d_{k}\leq- \alpha_{k} \sigma b_{2}\Vert d_{k}\Vert ^{2}. $$

(3.16)

Together with (3.15), this means

$$\lim_{k\rightarrow\infty}\alpha_{k}\Vert d_{k}\Vert ^{2}=0, $$

which implies that

$$ \lim_{k\rightarrow\infty}\alpha_{k}=0 $$

(3.17)

or

$$ \lim_{k\rightarrow\infty} \Vert d_{k}\Vert =0. $$

(3.18)

If (3.18) holds, then, since \(B_{k}\) is bounded, \(\Vert e_{k}\Vert =\Vert B_{k}d_{k}\Vert \leq \Vert B_{k}\Vert \Vert d_{k}\Vert \rightarrow0\), and the conclusion of the theorem follows. If (3.17) holds, then the acceptance rule (1.13) implies that, for all sufficiently large *k*, the rejected trial step size \(\alpha_{k}'=\frac{\alpha_{k}}{r}\) satisfies

$$\begin{aligned} p \bigl(x_{k}+\alpha_{k}'d_{k} \bigr)-p(x_{k}) \geq& p \bigl(x_{k}+\alpha_{k}'d_{k} \bigr)-p(x_{l(k)}) >\sigma\alpha_{k}'e_{k}^{T}d_{k}. \end{aligned}$$

(3.19)

By a Taylor expansion,

$$ p \bigl(x_{k}+\alpha_{k}'d_{k} \bigr)-p(x_{k})= \alpha_{k}' \nabla p(x_{k})^{T}d_{k}+o \bigl(\alpha_{k}' \Vert d_{k}\Vert \bigr). $$

(3.20)

Combining (3.19) and (3.20), as in [32], we have

$$\nabla p(x_{k})^{T}d_{k}=e_{k}^{T} \nabla e(x_{k})d_{k}\leq\delta^{*} e_{k}^{T}d_{k}, $$

where \(\delta^{*} >0\) is a constant and \(\sigma< \delta^{*}\). So we get

$$ \bigl[\delta^{*}-\sigma \bigr] \alpha_{k}' e_{k}^{T}d_{k}+o \bigl(\alpha_{k}' \Vert d_{k}\Vert \bigr)\geq0. $$

(3.21)

Noting that \(\delta^{*}-\sigma>0\) and \(e_{k}^{T}d_{k}<0\), and dividing (3.21) by \(\alpha_{k}'\Vert d_{k}\Vert \), we obtain

$$ \lim_{k\rightarrow\infty}\frac{e_{k}^{T}d_{k}}{\Vert d_{k}\Vert }=0. $$

(3.22)

Since (3.10) gives \(-e_{k}^{T}d_{k}\geq b_{2}\Vert d_{k}\Vert ^{2}\), (3.22) implies

$$ \lim_{k\rightarrow\infty} \Vert d_{k}\Vert =0. $$

(3.23)

Combining \(\Vert e_{k}\Vert =\Vert B_{k}d_{k}\Vert \leq \Vert B_{k}\Vert \Vert d_{k}\Vert \) with the boundedness of \(B_{k}\) again, we complete the proof. □
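To illustrate Theorem 3.1, the following self-contained sketch runs the whole iteration on a toy monotone system and observes \(\Vert e_{k}\Vert \rightarrow0\); the choice of system, of \(B_{k}\) (here simply the exact Jacobian), and of all constants is purely illustrative:

```python
import numpy as np

# Toy monotone system e(x) = 2x + sin(x) (componentwise): its Jacobian
# 2I + diag(cos x) is symmetric positive definite, as in Assumption A.
def e(x):
    return 2 * x + np.sin(x)

def jac(x):
    return 2 * np.eye(x.size) + np.diag(np.cos(x))

p = lambda z: 0.5 * np.linalg.norm(e(z)) ** 2
sigma, r, M = 1e-4, 0.5, 5               # illustrative line search / memory parameters
x = np.array([2.0, -1.5, 0.7])
history = [p(x)]
norms = []
for k in range(50):
    e_k = e(x)
    norms.append(np.linalg.norm(e_k))
    if norms[-1] < 1e-10:
        break
    d = np.linalg.solve(jac(x), -e_k)          # B_k d_k = -e_k with B_k = Jacobian
    alpha, p_max = 1.0, max(history[-M:])      # p_max plays the role of p(x_{l(k)})
    while p(x + alpha * d) > p_max + sigma * alpha * (e_k @ d):
        alpha *= r                             # backtrack as in (1.13)
    x = x + alpha * d
    history.append(p(x))

print(norms[-1] < 1e-10)   # ||e_k|| -> 0, as Theorem 3.1 asserts
```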

### Lemma 3.3

(*See Lemma* 4.1 *in* [31].)

*Let e be continuously differentiable*, *and let* \(\nabla e(x)\) *be nonsingular at a point* \(x^{*}\) *satisfying* \(e(x^{*})=0\). *Let*

$$ a\equiv\max \biggl\{ \bigl\Vert \nabla e \bigl(x^{*} \bigr) \bigr\Vert + \frac{1}{2c},2c \biggr\} ,\quad c= \bigl\Vert \nabla e \bigl(x^{*} \bigr)^{-1} \bigr\Vert . $$

(3.24)

*If* \(\Vert x_{k}-x^{*}\Vert \) *is sufficiently small*, *then the inequality*

$$ \frac{1}{a} \bigl\Vert x_{k}-x^{*} \bigr\Vert \leq \bigl\Vert e(x_{k}) \bigr\Vert \leq a \bigl\Vert x_{k}-x^{*} \bigr\Vert $$

(3.25)

*holds*.
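Inequality (3.25) can be checked numerically on a simple example. In the sketch below, \(e(x)=Ax\) with \(x^{*}=0\) is an illustrative choice (so \(\nabla e(x^{*})=A\)), and \(a\) is formed by taking the larger of the two quantities appearing in (3.24):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[4.0, 1.0], [1.0, 3.0]])     # Jacobian at x* = 0 for e(x) = A x
c = np.linalg.norm(np.linalg.inv(A), 2)    # c = ||grad e(x*)^{-1}||
a = max(np.linalg.norm(A, 2) + 1 / (2 * c), 2 * c)

# Check (1/a)||x - x*|| <= ||e(x)|| <= a ||x - x*|| near x* = 0.
ok = True
for _ in range(100):
    x = 1e-3 * rng.standard_normal(2)      # points near x*
    r, d = np.linalg.norm(A @ x), np.linalg.norm(x)
    ok &= d / a <= r <= a * d
print(ok)   # True
```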

### Theorem 3.2

*Let the assumptions in Lemma*
3.3
*hold*. *Assume that there exists a sufficiently small*
\(\varepsilon_{0}>0\)
*such that*
\(\Vert B_{k}-\nabla e(x_{k})\Vert \leq\varepsilon_{0}\)
*for each*
*k*. *Then the sequence*
\(\{x_{k}\}\)
*converges to*
\(x^{*}\)
*superlinearly for*
\(\alpha_{k}=1\). *Moreover*, *if*
*e*
*is q*-*order smooth at*
\(x^{*}\)
*and there is a neighborhood*
*U*
*of*
\(x^{*}\)
*satisfying for any*
\(x_{k}\in U\),

$$ \bigl\Vert \bigl[B_{k}-\nabla e \bigl(x^{*} \bigr) \bigr] \bigl(x_{k}-x^{*} \bigr) \bigr\Vert \leq\eta \bigl\Vert x_{k}-x^{*} \bigr\Vert ^{1+q}, $$

(3.26)

*then* \(x_{k}\rightarrow x^{*}\) *with order at least* \(1+q\), *where* *η* *is a constant*.

### Proof

Since *e* is continuously differentiable and \(\nabla e(x)\) is nonsingular at \(x^{*}\), there exist a constant \(\gamma>0\) and a neighborhood *U* of \(x^{*}\) satisfying

$$\max \bigl\{ \bigl\Vert \nabla e(y) \bigr\Vert , \bigl\Vert \nabla e(y)^{-1} \bigr\Vert \bigr\} \leq\gamma\quad \mbox{for all } y\in U, $$

with \(\nabla e(y)\) nonsingular for every \(y\in U\). Consider the following equality when \(\alpha_{k}=1\):

$$\begin{aligned} &B_{k} \bigl(x_{k+1}-x^{*} \bigr)+ \bigl[ \nabla e(x_{k}) \bigl(x_{k}-x^{*} \bigr)-B_{k} \bigl(x_{k}-x^{*} \bigr) \bigr] \\ &\qquad{}+ \bigl[e(x_{k})-e \bigl(x^{*} \bigr)-\nabla e(x_{k}) \bigl(x_{k}-x^{*} \bigr) \bigr] \\ &\quad=e(x_{k})+B_{k}d_{k}=0, \end{aligned}$$

(3.27)

where the second and the third terms are \(o(\Vert x_{k}-x^{*}\Vert )\). By the von Neumann lemma, since \(\nabla e(x_{k})\) is nonsingular, \(B_{k}\) is also nonsingular. Because, for any \(y\in U\), \(\nabla e(y)\) is nonsingular with \(\max\{\Vert \nabla e(y)\Vert ,\Vert \nabla e(y)^{-1}\Vert \}\leq\gamma\), we obtain from Lemma 3.3

$$\bigl\Vert x_{k+1}-x^{*} \bigr\Vert =o \bigl( \bigl\Vert x_{k}-x^{*} \bigr\Vert \bigr)=o \bigl( \bigl\Vert e(x_{k}) \bigr\Vert \bigr),\quad \mbox{as }k\rightarrow \infty, $$

this means that the sequence \(\{x_{k}\}\) converges to \(x^{*}\) superlinearly for \(\alpha_{k}=1\).

If *e* is *q*-order smooth at \(x^{*}\), then we get

$$e(x_{k})-e \bigl(x^{*} \bigr)-\nabla e(x_{k}) \bigl(x_{k}-x^{*} \bigr)=O \bigl( \bigl\Vert x_{k}-x^{*} \bigr\Vert ^{q+1} \bigr). $$

Considering the second term of (3.27) as \(x_{k}\rightarrow x^{*}\) and using (3.26), we deduce that it is also \(O(\Vert x_{k}-x^{*}\Vert ^{q+1})\). Therefore, we have

$$\bigl\Vert x_{k+1}-x^{*} \bigr\Vert =O \bigl( \bigl\Vert x_{k}-x^{*} \bigr\Vert ^{q+1} \bigr), \quad\mbox{as }x_{k} \rightarrow x^{*}. $$

The proof is complete. □
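The rate in Theorem 3.2 can be observed numerically. In the sketch below (an illustrative scalar system \(e(x)=2x+\sin x\) with root \(x^{*}=0\); all names and constants are ours), taking \(B_{k}=\nabla e(x_{k})\) and \(\alpha_{k}=1\) makes the error collapse with order at least 2:

```python
import numpy as np

# Scalar toy system: e(x) = 2x + sin(x), x* = 0, grad e(x) = 2 + cos(x) > 0.
e = lambda x: 2 * x + np.sin(x)
de = lambda x: 2 + np.cos(x)

x = 0.5
errs = []
for _ in range(4):
    errs.append(abs(x))
    x = x - e(x) / de(x)             # alpha_k = 1, B_k = grad e(x_k)

# At-least-quadratic convergence: err_{k+1} / err_k^2 stays bounded
# while the errors themselves collapse to zero.
ratios = [errs[i + 1] / errs[i] ** 2 for i in range(len(errs) - 1)]
print(errs[-1] < 1e-10, max(ratios) < 1.0)   # True True
```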