Hybrid Hu–Storey type methods for large-scale nonlinear monotone systems and signal recovery
Journal of Inequalities and Applications volume 2024, Article number: 110 (2024)
Abstract
We propose two hybrid methods for solving large-scale monotone systems, based on a derivative-free conjugate gradient approach and the hyperplane projection technique. The conjugate gradient approach is efficient for large-scale systems due to its low memory requirements, while the projection strategy is suitable for monotone equations because it enables simple globalization. The derivative-free, function-value-based line search is combined with Hu–Storey type search directions and the projection procedure in order to construct globally convergent methods. Furthermore, the proposed methods are applied to solving a number of large-scale monotone nonlinear systems and to the reconstruction of sparse signals. Numerical experiments indicate the robustness of the proposed methods.
1 Introduction
Nonlinear systems arise in many mathematical models from applied disciplines. Reformulations of different problems lead to systems of nonlinear equations, so various iterative methods for solving nonlinear systems have been developed. In this paper, we consider the nonlinear system
$$ F(x)=0, $$(1)
where the function \(F: \mathbb{R}^{n}\rightarrow \mathbb{R}^{n}\) is continuous and monotone. The monotonicity means that
$$ \bigl(F(x)-F(y)\bigr)^{T}(x-y)\geq 0 \quad \text{for all } x,y\in \mathbb{R}^{n}. $$
A large group of numerical methods for solving monotone systems has been developed during the last decade (see [1, 20–22, 24, 29, 33, 35, 43]). The pioneering work on solving nonlinear monotone systems is given in [41]. In that paper, the authors introduced a projection technique that fully exploits the monotonicity of the function F. The whole sequence of iterates generated by the proposed method globally converges to the solution of system (1) without any regularity assumptions. This projection strategy is applied in [49], where a limited-memory BFGS method, suitable for large-scale monotone systems, is proposed. Furthermore, a spectral projection method for nonlinear monotone equations is introduced in [47]. It is well known that spectral gradient and conjugate gradient (CG) methods can successfully cope with large-scale optimization problems. CG methods and their modifications for unconstrained optimization problems are presented in many papers [3, 4, 7, 18, 23, 26, 42, 44]. The low memory requirements of CG methods motivated authors to adopt this technique for solving large-scale monotone systems.
The hyperplane projection procedure guarantees simple globalization and is therefore useful for monotone systems, while the low per-iteration cost of CG methods makes them suitable for large-scale systems. Because of that, some methods presented in [20–22, 37] combine CG directions with the projection technique in order to solve large-scale nonlinear monotone systems.
The globalization strategy for solving (1) is usually based on a line search with a merit function. Many methods that use a merit function have practical importance, but they also have some disadvantages. Each of these algorithms generates a sequence whose accumulation point is only a stationary point of the merit function, so regularity conditions are necessary to guarantee that this stationary point is a solution of (1). Regularity is also required in order to prove that the whole sequence of iterates converges to the solution. In addition, the assumption of boundedness of the level set of the merit function is important to ensure the existence of an accumulation point. These disadvantages, together with the idea of exploiting the monotonicity of F, motivated the authors in [41] to introduce a projection technique and to propose a globally convergent inexact Newton method for solving monotone systems. Inspired by that, many methods without a merit function, which use a derivative-free line search, have been developed (see [1, 20–22, 24, 29, 33, 35, 43]).
In this paper, we propose two hybrid methods based on a derivative-free CG approach and the hyperplane projection technique. A function-value-based line search without derivatives, combined with Hu–Storey type search directions and the projection procedure, is used as the globalization strategy. Both methods use the projection technique proposed in [41], which generates a sequence \(\{z_{k}\}\) defined by \(z_{k}=x_{k}+\alpha _{k} d_{k}\), where \(x_{k}\) is the current iterate and the step length \(\alpha _{k}>0\) is determined by some line search procedure along the given direction \(d_{k}\) such that
$$ F(z_{k})^{T}(x_{k}-z_{k})>0. $$
Note that by monotonicity of F, we have
$$ F(z_{k})^{T}\bigl(x^{*}-z_{k}\bigr)\leq 0 $$
for every solution \(x^{*}\) of the system (1). Thus the hyperplane
$$ \mathcal{H}_{k}=\bigl\{x\in \mathbb{R}^{n} : F(z_{k})^{T}(x-z_{k})=0\bigr\} $$
strictly separates the current iterate \(x_{k}\) from the solution set of equation (1). Once the hyperplane \(\mathcal{H}_{k}\) is obtained, the next iterate \(x_{k+1}\) is computed by projecting \(x_{k}\) onto it:
$$ x_{k+1}=x_{k}-\frac{F(z_{k})^{T}(x_{k}-z_{k})}{\|F(z_{k})\|^{2}}\,F(z_{k}). $$
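In code, the projection step admits a compact form. The following NumPy sketch is our own illustration (function and variable names are not from the paper); `F` is the user-supplied map:

```python
import numpy as np

def project_step(F, x_k, z_k):
    """One hyperplane projection step in the style of [41]:
    project x_k onto H_k = {x : F(z_k)^T (x - z_k) = 0}."""
    Fz = F(z_k)
    # coefficient F(z_k)^T (x_k - z_k) / ||F(z_k)||^2
    lam = Fz @ (x_k - z_k) / (Fz @ Fz)
    return x_k - lam * Fz
```

By construction, \(F(z_{k})^{T}(x_{k+1}-z_{k})=0\), i.e., the new iterate lies exactly on the separating hyperplane.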
The line search procedure affects the computational complexity of the algorithm. Because of that, we use a derivative-free line search rule (7), without a merit function, based only on evaluations of the function F, where \(d_{k}\in \mathbb{R}^{n}\) is a hybrid search direction, which should satisfy a sufficient descent condition.
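Such a rule is cheap to implement by backtracking. The sketch below is a hypothetical stand-in for rule (7), assuming the acceptance test \(-F(x_{k}+\alpha d_{k})^{T}d_{k}\geq \sigma \alpha \|d_{k}\|^{2}\), which is one common variant in this literature; the paper's exact rule may differ:

```python
import numpy as np

def backtracking(F, x, d, sigma=0.3, rho=0.7, max_back=60):
    """Derivative-free backtracking: shrink alpha by factor rho until the
    (assumed) acceptance test -F(x + alpha*d)^T d >= sigma*alpha*||d||^2 holds."""
    alpha = 1.0
    for _ in range(max_back):
        z = x + alpha * d
        if -F(z) @ d >= sigma * alpha * (d @ d):
            break
        alpha *= rho
    return alpha, z
```

Only values of F are required, which is why rules of this type also apply to nonsmooth monotone systems.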
The methods are proposed in Sect. 2. The convergence analysis is presented in Sect. 3. The numerical performance of the methods and their application to signal recovery are analyzed and compared with other known methods in Sect. 4 and Sect. 5, respectively. Final conclusions are given in the last section. Throughout the paper, the Euclidean norm \(\|\cdot \|\) is used.
2 Algorithm
As mentioned before, the CG approach for solving unconstrained optimization problems and nonlinear systems is characterized by low memory requirements and is therefore suitable for large-scale problems. An excellent review of CG methods for unconstrained optimization problems is given in [17]. Different CG algorithms correspond to different choices of the update parameter \(\beta _{k}\), so a crucial element in any CG method is the proper choice of this parameter. Fletcher–Reeves (FR) type methods for unconstrained optimization problems [6, 10, 11, 15, 31, 40] have strong convergence properties but may have modest practical performance due to jamming: these algorithms can make many short steps without significant progress towards the solution. On the other hand, Polak–Ribière–Polyak (PRP) type methods [38, 39] may not be convergent in general, but often have better computational performance. Therefore, combinations of these methods have been proposed with the aim of exploiting the advantages of both FR and PRP methods. One of the first hybrid methods for unconstrained optimization problems was proposed by Hu and Storey [19], where
$$ \beta _{k}=\max \bigl\{0,\min \bigl\{\beta _{k}^{FR},\beta _{k}^{PRP}\bigr\}\bigr\}. $$
Motivated by CG methods for unconstrained optimization problems, some authors have adapted the CG approach to monotone systems [9, 30, 37, 46]. In this paper, we extend the Hu–Storey idea from [19] to nonlinear monotone systems. First, we define the Hu–Storey (HuS) search direction of the form
$$ d_{k}^{HuS}= \textstyle\begin{cases} -F_{k}, & k=0, \\ -F_{k}+\beta _{k}^{HuS}w_{k-1}, & k\geq 1, \end{cases} $$(9)
where the update parameter \(\beta _{k}^{HuS}\) is defined by (8), \(F_{k}=F(x_{k})\), \(\beta _{k}^{FR}=\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\), \(\beta _{k}^{PRP}=\frac{F_{k}^{T} y_{k-1}}{\|F_{k-1}\|^{2}}\), \(y_{k-1}=F_{k}-F_{k-1}\) and \(w_{k}=z_{k}-x_{k}=\alpha _{k} d_{k}\).
The globalization procedure with the derivative-free, function-value-based line search (7) requires that the search direction satisfies the sufficient descent condition
$$ F_{k}^{T}d_{k}\leq -\tau \|F_{k}\|^{2}, $$(10)
where \(\tau >0\). However, the HuS search direction (9) does not satisfy (10) in general. So, if this condition is not satisfied, we take the steepest descent direction, i.e., \(d_{k}=-F_{k}\).
The fact that the HuS direction (9) may fail to be a sufficient descent direction inspired us to introduce another Hu–Storey type search direction which always satisfies (10). In [8], Cheng proposed a two-term PRP method for unconstrained optimization problems. The authors in [30] used this idea to construct a two-term PRP method for solving nonlinear monotone systems. Motivated by this and by the Hu–Storey update parameter, we define the following two-term Hu–Storey search direction
$$ d_{k}^{THuS}= \textstyle\begin{cases} -F_{k}, & k=0, \\ -F_{k}+\beta _{k}^{HuS}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}, & k\geq 1, \end{cases} $$(11)
which always satisfies the sufficient descent property (10), since
$$ F_{k}^{T}d_{k}^{THuS}=-\|F_{k}\|^{2}. $$(12)
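The identity (12) holds for any value of the update parameter, because the matrix \(I-F_{k}F_{k}^{T}/(F_{k}^{T}F_{k})\) annihilates the component of \(w_{k-1}\) along \(F_{k}\). A small NumPy check of this (our own sketch; `beta` is left as a free input precisely because (12) does not depend on it):

```python
import numpy as np

def d_two_term(F_k, w_km1, beta):
    """Two-term direction of the form (11):
    d = -F_k + beta * (I - F_k F_k^T / (F_k^T F_k)) w_{k-1}."""
    # subtract from w its component along F_k (orthogonal projection)
    proj_w = w_km1 - F_k * (F_k @ w_km1) / (F_k @ F_k)
    return -F_k + beta * proj_w
```

For any \(\beta\), \(F_{k}^{T}d_{k}=-\|F_{k}\|^{2}\), so the sufficient descent condition (10) holds automatically with \(\tau =1\).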
Both directions (9) and (11) are combinations of FR- and PRP-type directions, with the aim of exploiting their attractive properties. So, when the iteration jams, the PRP-type direction is used if \(\beta _{k}^{PRP}>0\), while the steepest descent direction is used if \(\beta _{k}^{PRP}\leq 0\); otherwise, the FR-type direction is used. Now, we can state the algorithm as follows.
Algorithm 1 with the HuS search direction \(d_{k}=d_{k}^{HuS}\) defined by (9) will be named the HuS method, while Algorithm 1 with the two-term HuS search direction \(d_{k}=d_{k}^{THuS}\), defined by (11), will be named the two-term HuS method.
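Putting the pieces together, the overall scheme can be sketched as follows. This is our own illustrative Python sketch, not the paper's reference implementation: it assumes the hybrid choice \(\beta _{k}^{HuS}=\max \{0,\min \{\beta _{k}^{FR},\beta _{k}^{PRP}\}\}\) in the spirit of [19], uses a simple backtracking stand-in for line search (7), and takes the two-term direction (11):

```python
import numpy as np

def two_term_hus(F, x0, sigma=0.3, rho=0.7, eps=1e-4, max_iter=10_000):
    """Sketch of Algorithm 1 with the two-term HuS direction (11)."""
    x = np.asarray(x0, dtype=float)
    Fx = F(x)
    d = -Fx                                    # d_0 = -F_0
    for _ in range(max_iter):
        if np.linalg.norm(Fx) <= eps:
            return x
        # backtracking stand-in for line search (7)
        alpha = 1.0
        for _ in range(60):
            z = x + alpha * d
            if -F(z) @ d >= sigma * alpha * (d @ d):
                break
            alpha *= rho
        Fz = F(z)
        if np.linalg.norm(Fz) <= eps:
            return z
        # hyperplane projection step of [41]
        x_new = x - (Fz @ (x - z)) / (Fz @ Fz) * Fz
        F_new = F(x_new)
        if np.linalg.norm(F_new) <= eps:
            return x_new
        # hybrid parameter: assumed form max{0, min{FR, PRP}}
        beta_fr = (F_new @ F_new) / (Fx @ Fx)
        beta_prp = F_new @ (F_new - Fx) / (Fx @ Fx)
        beta = max(0.0, min(beta_fr, beta_prp))
        w = z - x                              # w_k = alpha_k * d_k
        # two-term HuS direction (11); satisfies (10) by (12)
        d = -F_new + beta * (w - F_new * (F_new @ w) / (F_new @ F_new))
        x, Fx = x_new, F_new
    return x
```

On the test problems of Sect. 4 one would combine this skeleton with the starting points and parameters listed there; here it serves only to make the control flow of Algorithm 1 concrete.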
3 Convergence analysis
The convergence analysis of the methods is performed under standard assumptions, using the same idea as in [37]. First, for both methods, it is proved that the directions \(d_{k}\) generated by Algorithm 1 satisfy the sufficient descent property (10); then that the sequences \(\{x_{k}\}\) and \(\{d_{k}\}\) are bounded; next that the algorithm is well defined; and finally that the whole sequence generated by Algorithm 1 globally converges to the solution of the monotone system (1).
From now on, we assume that the following conditions hold:
 A1: the function \(F(x)\) is monotone on \(\mathbb{R}^{n}\),
 A2: the function \(F(x)\) is Lipschitz continuous on \(\mathbb{R}^{n}\),
 A3: the solution set of system (1) is nonempty.
The construction of Algorithm 1 ensures that the generated search directions satisfy the sufficient descent condition (10).
Lemma 3.1
The directions \(d_{k}\) generated by Algorithm 1 satisfy condition (10) for every \(k\in \mathbb{N}\cup \{0\}\).
Proof
For the HuS method, it is not difficult to see that all directions \(d_{k}\) satisfy (10): if \(d_{k}^{HuS}\) does not satisfy it, then \(d_{k}=-F_{k}\), which does. Considering the two-term HuS method, it is clear that all directions satisfy (10) because of (12). □
The next lemma indicates that the sequence of distances between the iterates generated by Algorithm 1 and the solution set is decreasing, which essentially ensures global convergence. This lemma can be proved in the same way as in [41]; it also guarantees that the sequence \(\{x_{k}\}\) is bounded.
Lemma 3.2
[41] Suppose that assumptions A1–A3 are satisfied and the sequence \(\{x_{k}\}\) is generated by Algorithm 1. For any solution \(x^{*}\) of system (1),
$$ \|x_{k+1}-x^{*}\|^{2}\leq \|x_{k}-x^{*}\|^{2}-\|x_{k+1}-x_{k}\|^{2} $$
holds and the sequence \(\{x_{k}\}\) is bounded. Furthermore, either \(\{x_{k}\}\) is finite and the last iterate is a solution of (1), or \(\{x_{k}\}\) is infinite and
$$ \lim_{k\rightarrow \infty}\|x_{k+1}-x_{k}\|=0. $$(14)
The boundedness of every search direction generated by Algorithm 1 is necessary for further analysis, and it is proved in the next theorem.
Theorem 3.3
Suppose that assumptions A1–A3 are satisfied and the sequence of directions \(\{d_{k}\}\) is generated by the HuS method or the two-term HuS method. Let \(\varepsilon _{0}>0\) be a constant such that
$$ \|F(x_{k})\|\geq \varepsilon _{0} $$(15)
holds for every \(k\in \mathbb{N}\cup \{0\}\). Then the directions \(\{d_{k}\}\) are bounded, i.e., there exists a constant \(M>0\) such that
$$ \|d_{k}\|\leq M $$(16)
holds for every \(k\in \mathbb{N}\cup \{0\}\).
Proof
From Lemma 3.2 and assumption A2, it follows that \(\|F(x_{k})\|\leq L\|x_{0}-x^{*}\|\) for every \(k\in \mathbb{N}\cup \{0\}\), so taking \(\kappa =L\|x_{0}-x^{*}\|\), we have
$$ \|F_{k}\|\leq \kappa $$(17)
for every \(k\in \mathbb{N}\cup \{0\}\). The Cauchy–Schwarz inequality applied to (6) implies
$$ \|x_{k+1}-x_{k}\|\leq \|x_{k}-z_{k}\|=\alpha _{k}\|d_{k}\|, $$(18)
while (6) and line search condition (7) indicate
$$ \|x_{k+1}-x_{k}\|\geq \frac{\sigma \alpha _{k}^{2}\|d_{k}\|^{2}}{\|F(z_{k})\|}. $$(19)
Finally, (14) and (19) guarantee
$$ \lim_{k\rightarrow \infty}\alpha _{k}\|d_{k}\|=0. $$(20)
Based on the proposed search directions \(d_{k}^{HuS}\) (9) and \(d_{k}^{THuS}\) (11), we distinguish the proofs of boundedness of the directions generated by the HuS and two-term HuS methods.
Boundedness of directions generated by the HuS method: In the HuS method, the direction \(d_{k}\) in Step 2 of Algorithm 1 is defined by \(d_{k}^{HuS}\) (9).

If \(k=0\) or \(\beta _{k}^{HuS}=0\) or the direction \(d_{k}=d_{k}^{HuS}\) does not satisfy the sufficient descent property (10), then \(d_{k}=-F_{k}\). Therefore, by (17),
$$ \|d_{k}\|=\|F_{k}\|\leq \kappa . $$(21)
If the direction \(d_{k}=d_{k}^{HuS}\) satisfies the sufficient descent property (10) and \(\beta _{k}^{HuS}=\beta _{k}^{PRP}\), then \(d_{k}=-F_{k}+\beta _{k}^{PRP}w_{k-1}\). Assumption A2, (15), (17) and (18) imply
$$\begin{aligned} \|d_{k}\| =& \|-F_{k}+\beta _{k}^{PRP}w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{|F_{k}^{T}y_{k-1}|}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\| \|y_{k-1}\|}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ =& \|F_{k}\|+\frac{\|F_{k}\| \|F_{k}-F_{k-1}\|}{\|F_{k-1}\|^{2}}\alpha _{k-1}\|d_{k-1}\| \\ \leq & \kappa +\frac{\kappa L\|x_{k}-x_{k-1}\|}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\| \\ \leq & \kappa +\frac{\kappa L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|d_{k-1}\|^{2} \\ =&\kappa \left (1+\frac{L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|d_{k-1}\|^{2}\right ) \end{aligned}$$(22)
for all \(k\in \mathbb{N}\). From (20) it follows that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|d_{k-1}\|<\varepsilon _{1}\) for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}\) and \(M_{1}=\max \{\|d_{0}\|,\|d_{1}\|,\ldots ,\|d_{k_{0}}\|,\widetilde{M}_{1}\}\), where \(\widetilde{M}_{1}=\kappa (1+L)\), it holds that
$$ \|d_{k}\|\leq M_{1} $$(23)
for every \(k\in \mathbb{N}\).

If the direction \(d_{k}=d_{k}^{HuS}\) satisfies the sufficient descent property (10) and \(\beta _{k}^{HuS}=\beta _{k}^{FR}\), then \(d_{k}=-F_{k}+\beta _{k}^{FR}w_{k-1}\). From (15), (17) and (18), we have
$$\begin{aligned} \|d_{k}\| =& \|-F_{k}+\beta _{k}^{FR}w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ \leq & \kappa +\frac{\kappa ^{2}}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\| \\ =& \kappa \left (1+\frac{\kappa}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\|\right ) \end{aligned}$$(24)
for all \(k\in \mathbb{N}\), and (20) implies that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|d_{k-1}\|<\varepsilon _{1}\) for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}^{2}\) and \(M_{2}=\max \{\|d_{0}\|,\|d_{1}\|,\ldots ,\|d_{k_{0}}\|,\widetilde{M}_{2}\}\), where \(\widetilde{M}_{2}=\kappa (1+\kappa )\), there holds
$$ \|d_{k}\|\leq M_{2} $$(25)
for every \(k\in \mathbb{N}\).
Relations (21), (23) and (25) imply that the search directions \(d_{k}\) generated by the HuS method are bounded, i.e., \(\|d_{k}\|\leq M\) for every \(k\in \mathbb{N}\cup \{0\}\), where \(M=\max \{\kappa ,M_{1},M_{2}\}\).
Boundedness of directions generated by the two-term HuS method: In the two-term HuS method, the direction \(d_{k}\) in Step 2 of Algorithm 1 is defined by \(d_{k}^{THuS}\) (11).

If \(k=0\) or \(\beta _{k}^{HuS}=0\), then \(d_{k}=-F_{k}\) and therefore (21) holds by (17).

If \(\beta _{k}^{HuS}=\beta _{k}^{PRP}\), then \(d_{k}=-F_{k}+\beta _{k}^{PRP}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\). Assumption A2, (15), (17) and (18) imply
$$\begin{aligned} \|d_{k}\| =& \left \|-F_{k}+\beta _{k}^{PRP}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\right \| \\ \leq & \|F_{k}\|+\frac{|F_{k}^{T}y_{k-1}|}{\|F_{k-1}\|^{2}}\left \|\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\right \| \\ \leq & \|F_{k}\|+\frac{|F_{k}^{T}y_{k-1}|}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\| \|F_{k}-F_{k-1}\|}{\|F_{k-1}\|^{2}}\alpha _{k-1}\|d_{k-1}\| \\ \leq & \kappa +\frac{\kappa L\|x_{k}-x_{k-1}\|}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\| \\ \leq & \kappa +\frac{\kappa L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|d_{k-1}\|^{2} \\ =&\kappa \left (1+\frac{L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|d_{k-1}\|^{2}\right ) \end{aligned}$$(26)
for all \(k\in \mathbb{N}\). From (20) it follows that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|d_{k-1}\|<\varepsilon _{1}\) for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}\) and \(M_{3}=\max \{\|d_{0}\|,\|d_{1}\|,\ldots ,\|d_{k_{0}}\|,\widetilde{M}_{3}\}\), where \(\widetilde{M}_{3}=\kappa (1+L)\), it holds that
$$ \|d_{k}\|\leq M_{3} $$(27)
for every \(k\in \mathbb{N}\).

If \(\beta _{k}^{HuS}=\beta _{k}^{FR}\), then \(d_{k}=-F_{k}+\beta _{k}^{FR}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\). From (15), (17) and (18), we have
$$\begin{aligned} \|d_{k}\| =& \left \|-F_{k}+\beta _{k}^{FR}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\right \| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\left \|\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\right \| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ \leq & \kappa +\frac{\kappa ^{2}}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\| \\ =& \kappa \left (1+\frac{\kappa}{\varepsilon _{0}^{2}}\alpha _{k-1}\|d_{k-1}\|\right ) \end{aligned}$$(28)
for all \(k\in \mathbb{N}\), and (20) implies that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|d_{k-1}\|<\varepsilon _{1}\) for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}^{2}\) and \(M_{4}=\max \{\|d_{0}\|,\|d_{1}\|,\ldots ,\|d_{k_{0}}\|,\widetilde{M}_{4}\}\), where \(\widetilde{M}_{4}=\kappa (1+\kappa )\), there holds
$$ \|d_{k}\|\leq M_{4} $$(29)
for every \(k\in \mathbb{N}\).
Relations (21), (27) and (29) imply that the search directions \(d_{k}\) generated by the two-term HuS method are bounded, i.e., \(\|d_{k}\|\leq M\) for every \(k\in \mathbb{N}\cup \{0\}\), where \(M=\max \{\kappa ,M_{3},M_{4}\}\). □
The line search rule (7) used in Algorithm 1 is a derivative-free, function-value-based line search, because it uses neither derivatives nor a merit function. Since all directions \(d_{k}\) generated by Algorithm 1 are bounded and satisfy condition (10), it is easy to see that this line search condition necessarily holds for a sufficiently small step length \(\alpha _{k}>0\), which can be obtained by finite backtracking. This means that the line search, and therefore Algorithm 1, is well defined.
Lemma 3.4
[5] Suppose that all conditions of Theorem 3.3 are satisfied. Then the line search procedure in Algorithm 1 is well defined.
The global convergence of Algorithm 1, with a bounded sequence of directions \(\{d_{k}\}\) satisfying condition (10), is given in the next theorem. It can be proved in a similar way as in [5], so the proof is omitted.
Theorem 3.5
Suppose that assumptions A1–A3 are satisfied and the sequence \(\{x_{k}\}\) is generated by Algorithm 1. Then
$$ \liminf_{k\rightarrow \infty}\|F(x_{k})\|=0. $$(30)
An important property of Algorithm 1 is that the whole sequence of iterates converges to a solution of the system without any regularity assumptions. Since F is continuous and \(\{x_{k}\}\) is bounded, it is clear from (30) that the sequence \(\{x_{k}\}\) generated by Algorithm 1 has an accumulation point \(x^{*}\) which is a solution of system (1). Besides, Lemma 3.2 implies that the sequence \(\{\|x_{k}-x^{*}\|\}\) is convergent, which shows that the whole sequence \(\{x_{k}\}\) generated by Algorithm 1 globally converges to the solution \(x^{*}\) of system (1).
4 Numerical experiments
Some numerical experiments are presented in this section. The efficiency and robustness of both methods are tested on a collection of problems taken from [9, 36, 46, 49] and compared with the numerical performance of the methods proposed in [2, 5, 9, 37]. As in [37], we tested Problems 1–3 from [49] with dimensions 1000, 20000 and 50000, Problem 1 from [9] with dimensions 1000, 20000 and 50000, Problem 2 from [9] with dimensions 1000 and 5000, Problem 2 from [46] with dimension 1000, Problem 3 from [46] with dimensions 1000 and 3000, Problem 4 from [46] with dimensions 1000, 20000 and 50000, and a monotone nonlinear system with dimension 20164, given in [36] and obtained by discretization of a Dirichlet problem. The algorithms are implemented in the Matlab R2013b environment on a computer with a 2.80 GHz Intel Core i7 860 processor and 8 GB of RAM.
The main stopping criterion, as in [5, 37], is
$$ \|F(x_{k})\|\leq \varepsilon , $$
where \(\varepsilon =10^{-4}\); if it is not satisfied, the algorithms are stopped after the maximum number of iterations \(k_{max}=500\,000\). All methods are implemented with parameters \(\sigma = 0.3\) and \(\rho = 0.7\) and tested with the following 8 starting points: \(x_{0}^{1}=10\cdot 1^{n\times 1}\), \(x_{0}^{2}=-10\cdot 1^{n\times 1}\), \(x_{0}^{3}=1^{n\times 1}\), \(x_{0}^{4}=-1^{n\times 1}\), \(x_{0}^{5}=0.1\cdot 1^{n\times 1}\), \(x_{0}^{6}=[1,\frac{1}{2},\frac{1}{3},\ldots ,\frac{1}{n}]^{T}\), \(x_{0}^{7}=[\frac{1}{n},\frac{2}{n},\ldots ,\frac{n-1}{n},1]^{T}\), \(x_{0}^{8}=[1-\frac{1}{n},1-\frac{2}{n},\ldots ,1-\frac{n-1}{n},0]^{T}\), where n is the dimension of the system.
In the same way as in [5, 30, 37, 46], the initial step length α in the kth iteration is defined by
where \(t=10^{-8}\).
The performance profile proposed in [13] is used for comparing the efficiency and robustness of all methods. Let \(\mathcal{S}\) be the set of solvers, \(\mathcal{P}\) the set of problems, and \(m_{s,p}\) the performance measure required to solve problem \(p\in \mathcal{P}\) by solver \(s\in \mathcal{S}\). The performance ratio compares the performance of solver s on problem p with the best performance of any solver on this problem. It is defined by \(r_{s,p}=\frac{m_{s,p}}{\min \{m_{s,p}:s\in \mathcal{S}\}}\) if problem p is solved by solver s, and by \(r_{s,p}=r_{M}\) otherwise, where \(r_{M}\) is a fixed parameter. The probability for solver s that a performance ratio is within a factor \(t\in \mathbb{R}\) of the best possible ratio is defined by \(\rho _{s}(t)=\frac{1}{n_{p}}\,\text{size}\{p\in \mathcal{P}:r_{s,p}\leq t\}\), where \(n_{p}\) is the number of all problems.
The function \(\rho _{s}:\mathbb{R}\rightarrow [0,1]\) is the cumulative distribution function of the performance ratio \(r_{s,p}\). It represents the performance of solver s and provides the following information. The efficiency of solver s, i.e., the fraction of problems on which it performs best, is given by the value \(\rho _{s}(1)\); the solver \(\bar{s}\) that maximizes \(\rho _{s}(1)\) is the most efficient, solving the largest number of problems at the lowest value of the performance measure m. The robustness of solver s is represented by the value \(\bar{t}\in [1,r_{M}]\) for which \(\rho _{s}(\bar{t})=1\); the solver \(\hat{s}\) for which \(\bar{t}_{\hat{s}}=\min \{\bar{t}_{s}:s\in \mathcal{S}\}\) is the most robust.
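The profile \(\rho _{s}(t)\) is straightforward to compute from a table of measurements. The following NumPy sketch is our own helper (assuming each problem is solved by at least one solver and that failures are recorded as `np.inf`), following the construction of [13]:

```python
import numpy as np

def performance_profile(m, r_M=1000.0):
    """m[p, s] = performance measure (e.g. CPU time) of solver s on problem p;
    np.inf marks a failure. Returns the ratio matrix r and the profile rho(s, t)."""
    m = np.asarray(m, dtype=float)
    best = np.min(m, axis=1, keepdims=True)        # best solver on each problem
    r = np.where(np.isfinite(m), m / best, r_M)    # ratios r_{s,p}; failures get r_M
    def rho(s, t):
        return np.mean(r[:, s] <= t)               # fraction of problems within factor t
    return r, rho
```

Here `rho(s, 1.0)` is the efficiency of solver s, and the smallest t with `rho(s, t) == 1` measures its robustness.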
We considered the number of iterations, the number of function evaluations and the CPU time as performance measures. In the numerical experiments, we set \(r_{M}=1000\). First, we compared the HuS method with the method proposed in [9], denoted by PRP. The efficiency and robustness of these algorithms are shown in Fig. 1, which presents the performance profiles of the methods. Figure 1 indicates that the HuS method is more robust than the PRP method with respect to the number of iterations, the number of function evaluations and the CPU time, because its cumulative distribution function \(\rho (t)\) reaches the value 1 for smaller t. Besides, this figure reveals that both methods have similar efficiency in terms of the number of iterations and the number of function evaluations.
Secondly, we compared the HuS method with the DLPM method from [2] and two three-term methods: the DFPB1 method from [5] and the M3TFR2 method proposed in [37]. The performance profiles of these algorithms are given in Fig. 2. This figure demonstrates again that the HuS method is the most robust with respect to the number of iterations, the number of function evaluations and the CPU time. However, DLPM is the most efficient considering the number of iterations and the number of function evaluations, while in terms of CPU time, the most efficient is M3TFR2, followed by the HuS method.
Then, we compared the two-term HuS method with the method proposed in [30], denoted LiLi. The performance profiles of these methods are given in Fig. 3, which shows that the two-term HuS method is more robust and more efficient than the LiLi method in all cases. The two-term HuS method solves about 70% of all test problems with fewer iterations and fewer function evaluations, and about 55% of all problems in less CPU time.
The efficiency and robustness of the two-term HuS method are also compared to the DFPB1, M3TFR2 and DLPM methods. Figure 4 points out that the two-term HuS method is the most robust in all cases. DLPM is the most efficient considering the number of iterations and the number of function evaluations, followed by the two-term HuS method, while the M3TFR2 method is the most efficient in terms of CPU time, again followed by the two-term HuS method.
Next, we compared the HuS method and the two-term HuS method with EDYM1 and EDYM2 from [43] and the DFCG method from [27]. The performance profiles of these algorithms are given in Fig. 5. This figure demonstrates that the two-term HuS method is the most efficient and the most robust with respect to the number of iterations, the number of function evaluations and the CPU time.
Finally, we compared the HuS method and the two-term HuS method, whose performance profiles are given in Fig. 6. According to this figure, the two-term HuS method is more robust in all cases and more efficient in terms of the number of iterations and the CPU time, while both methods have very similar efficiency in terms of the number of function evaluations. All numerical experiments indicate that both methods proposed in this paper have good computational performance. The presented figures show their great robustness and point out that these hybrid methods are successful and competitive with the other methods given in [2, 5, 9, 37].
5 Application
In this section, we present an application of our new methods to signal recovery in compressed sensing. Compressed sensing can be modeled as the underdetermined linear system
$$ b=Ax+\omega , $$(31)
where \(x \in \mathbb{R}^{n}\) is the sparse vector to be recovered, \(b \in \mathbb{R}^{m}\) is the observed or measured data contaminated with noise ω, and \(A\in \mathbb{R}^{m\times n}\) is a linear mapping. It is well known that solving model (31) can be approached by solving the nonsmooth convex unconstrained optimization problem
$$ \min_{x\in \mathbb{R}^{n}} \frac{1}{2}\|Ax-b\|_{2}^{2}+\tau \|x\|_{1}, $$(32)
where \(\tau >0\). From [45], (32) can be written as a quadratic program with box constraints, which is equivalent to
where F is a vector-valued function,
\(u\in \mathbb{R}^{n}\), \(v \in \mathbb{R}^{n}\), and \(e_{n}\) is an n-dimensional vector with all elements equal to one. Thus, (32) is equivalent to the nonlinear equation (33) and can therefore be solved effectively by Algorithm 1.
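Under the standard reformulation from [45] as we read it (an assumption of this sketch: \(z=(u;v)\) with \(x=u-v\), \(H\) built from \(A^{T}A\) blocks, \(c\) from \(\tau e_{2n}\) and \(A^{T}b\), and \(F(z)=\min \{z,Hz+c\}\) taken componentwise), the monotone map F can be assembled as follows:

```python
import numpy as np

def make_F(A, b, tau):
    """Monotone map F(z) = min(z, Hz + c) for the reformulation of (32):
    z = (u; v), x = u - v, H = [[A^T A, -A^T A], [-A^T A, A^T A]],
    c = tau * e_{2n} + (-A^T b; A^T b). F(z) = 0 iff z solves the
    bound-constrained quadratic program equivalent to (32)."""
    n = A.shape[1]
    AtA = A.T @ A
    Atb = A.T @ b
    H = np.block([[AtA, -AtA], [-AtA, AtA]])
    c = tau * np.ones(2 * n)
    c[:n] -= Atb
    c[n:] += Atb
    return lambda z: np.minimum(z, H @ z + c)
```

Since H is positive semidefinite, F is monotone and Lipschitz continuous, so Algorithm 1 applies; the recovered signal is \(x^{*}=u^{*}-v^{*}\).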
Our aim in this section is to use the proposed HuS and two-term HuS (THuS) methods to reconstruct a length-n sparse signal from m observations, where \(m\ll n\). Due to the storage limitations of the PC, we test a small-size signal with \(n = 2048\) and \(m = 512\), where the original signal contains 128 randomly placed nonzero elements. The random matrix A is a Gaussian matrix, generated by the command \(randn(m,n)\) in Matlab. In this experiment, the measurement b is disturbed by noise, where ω is Gaussian noise distributed as \(N(0, 10^{-4})\).
We compare the performance of the proposed methods with the CGD [45] and PCG [32] methods. The control parameters for the HuS and THuS methods were both set as \(\tau =10^{-8}\), \(\sigma =10^{-4}\), \(\rho =0.5\). For the compared methods, the control parameters were set as reported in their respective papers. The quality of the reconstructed signal is measured by the mean squared error (MSE)
$$ \mathrm{MSE}=\frac{1}{n}\|x_{0}-x^{*}\|^{2}, $$
where \(x_{0}\) is the original signal and \(x^{*}\) is the restored signal.
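The MSE itself is a one-line computation; a small NumPy helper, assuming the usual definition \(\mathrm{MSE}=\frac{1}{n}\|x_{0}-x^{*}\|^{2}\):

```python
import numpy as np

def mse(x_orig, x_rest):
    """Mean squared error between the original and restored signals."""
    x_orig = np.asarray(x_orig, dtype=float)
    return np.sum((x_orig - np.asarray(x_rest, dtype=float)) ** 2) / x_orig.size
```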
The iterative process is initiated at \(x_{0} = A^{T}b\) and terminated when the relative change of the objective between successive iterations falls below \(10^{-4}\), that is,
$$ \frac{|f(x_{k})-f(x_{k-1})|}{|f(x_{k-1})|}< 10^{-4}, $$
where \(f(x) = \frac{1}{2} \|Ax-b\|_{2}^{2} + \tau \|x\|_{1}\) is the objective function. In this experiment, the parameter τ is chosen as suggested in [28], i.e., \(\tau = 0.008\|A^{T} b\|_{\infty}\).
The experiment is performed ten times to demonstrate the stability of the proposed methods. The results are presented in Table 1. It is clear from the obtained results that the original signal was recovered by all four methods. However, THuS performed better than the other three methods in terms of the number of iterations, CPU time and MSE. The plots of the numerical results, consisting of the original sparse signal, the measurement, and the signal restored by each method, can be seen in Fig. 7. Furthermore, Fig. 8 shows the performance of each method in terms of its convergence behavior, from the viewpoint of relative errors and merit function values, as the iteration number and computing time increase.
6 Conclusions
In this paper, we present two hybrid methods for solving large-scale monotone systems of nonlinear equations, which use the derivative-free CG approach and the hyperplane projection technique. The CG approach is efficient for large-scale systems due to its low memory requirements, while the projection procedure guarantees simple globalization for monotone systems. The methods incorporate hybrid Hu–Storey-type search directions, which combine PRP- and FR-type directions with the aim of exploiting their advantages, and this hybrid scheme leads to excellent performance in practice. The methods are based only on evaluations of the function F, because they use neither derivatives nor a merit function. The derivative-free, function-value-based line search is combined with the Hu–Storey-type search directions and the projection technique in order to construct globally convergent methods. Thus, with no regularity or differentiability assumptions, global convergence of the whole generated sequence is obtained. Preliminary numerical results highlight the great robustness of the new methods compared to other algorithms applicable to large-scale monotone systems. In addition, these methods can also be applied to nonsmooth monotone systems, and their application to the reconstruction of sparse signals is also very promising.
Data Availability
No datasets were generated or analysed during the current study.
References
Abubakar, A., Ibrahim, A., Abdullahi, M., Aphane, M., Chen, J.: A sufficient descent LS-PRP-BFGS-like method for solving nonlinear monotone equations with application to image restoration. Numer. Algorithms 1–42 (2023)
Abubakar, A.B., Kumam, P.: A descent Dai–Liao conjugate gradient method for nonlinear equations. Numer. Algorithms (2018)
Abubakar, A.B., Kumam, P., Malik, M., Chaipunya, P., Ibrahim, A.H.: A hybrid FR–DY conjugate gradient algorithm for unconstrained optimization with application in portfolio selection. AIMS Math. 6(6), 6506–6527 (2021)
Abubakar, A.B., Kumam, P., Malik, M., Ibrahim, A.H.: A hybrid conjugate gradient based approach for solving unconstrained optimization and motion control problems. Math. Comput. Simul. 201, 640–657 (2022)
Ahookhosh, M., Amini, K., Bahrami, S.: Two derivativefree projection approaches for systems of largescale nonlinear monotone equations. Numer. Algorithms 64, 21–42 (2013)
Al-Baali, M.: Descent property and global convergence of the Fletcher–Reeves method with inexact line search. IMA J. Numer. Anal. 5, 121–124 (1985)
Amini, K., Faramarzi, P.: Global convergence of a modified spectral threeterm CG algorithm for nonconvex unconstrained optimization problems. J. Comput. Appl. Math. 417, 114630 (2023)
Cheng, W.: A two-term PRP-based descent method. Numer. Funct. Anal. Optim. 28, 1217–1230 (2007)
Cheng, W.: A PRP type method for systems of monotone equations. Math. Comput. Model. 50, 15–20 (2009)
Dai, Y.H., Yuan, Y.: Convergence properties of the Fletcher–Reeves method. IMA J. Numer. Anal. 16, 155–164 (1996)
Dai, Y.H., Yuan, Y.: Convergence of the Fletcher–Reeves method under a generalized Wolfe search. J. Comput. Math. 2, 142–148 (1996)
Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10, 177–182 (1999)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program., Ser. A 91, 201–213 (2002)
Fletcher, R.: Practical Methods of Optimization. Wiley, Chichester (1987)
Fletcher, R., Reeves, C.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)
Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16, 170–192 (2005)
Hager, W.W., Zhang, H.: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2, 35–58 (2006)
Hu, Q., Zhang, H., Zhou, Z., Chen, Y.: A class of improved conjugate gradient methods for nonconvex unconstrained optimization. Numer. Linear Algebra Appl. 30(4), e2482 (2023)
Hu, Y.F., Storey, C.: Global convergence result for conjugate gradient methods. J. Optim. Theory Appl. 71, 399–405 (1991)
Ibrahim, A.H., Alshahrani, M., Al-Homidan, S.: Two classes of spectral three-term derivative-free method for solving nonlinear equations with application. Numer. Algorithms 1–21 (2023)
Ibrahim, A.H., Kimiaei, M., Kumam, P.: A new black box method for monotone nonlinear equations. Optimization 72(5), 1119–1137 (2023)
Ibrahim, A.H., Kumam, P., Abubakar, A.B., Abubakar, J.: A derivative-free projection method for nonlinear equations with non-Lipschitz operator: application to LASSO problem. Math. Methods Appl. Sci. 46(8), 9006–9027 (2023)
Ibrahim, A.H., Kumam, P., Kamandi, A., Abubakar, A.B.: An efficient hybrid conjugate gradient method for unconstrained optimization. Optim. Methods Softw. 37(4), 1370–1383 (2022)
Ivanov, B., Milovanović, G.V., Stanimirović, P.S.: Accelerated Dai–Liao projection method for solving systems of monotone nonlinear equations with application to image deblurring. J. Glob. Optim. 85(2), 377–420 (2023)
Jiang, X., Yang, H., Jian, J., Wu, X.: Two families of hybrid conjugate gradient methods with restart procedures and their applications. Optim. Methods Softw. 1–28 (2023)
Jiang, X.Z., Zhu, Y.H., Jian, J.B.: Two efficient nonlinear conjugate gradient methods with restart procedures and their applications in image restoration. Nonlinear Dyn. 111(6), 5469–5498 (2023)
Kaelo, P., Koorapetse, M., Sam, C.R.: A globally convergent derivative-free projection method for nonlinear monotone equations with applications. Bull. Malays. Math. Sci. Soc. 44, 4335–4356 (2021)
Kim, S.J., Koh, K., Lustig, M., Boyd, S., Gorinevsky, D.: An interior-point method for large-scale l1-regularized least squares. IEEE J. Sel. Top. Signal Process. 1(4), 606–617 (2007)
Kimiaei, M., Hassan Ibrahim, A., Ghaderi, S.: A subspace inertial method for derivative-free nonlinear monotone equations. Optimization 1–28 (2023)
Li, Q., Li, D.H.: A class of derivative-free methods for large-scale nonlinear monotone equations. IMA J. Numer. Anal. 31, 1625–1635 (2011)
Liu, G.H., Han, J.Y., Yin, H.X.: Global convergence of the Fletcher–Reeves algorithm with an inexact line search. Appl. Math. J. Chin. Univ. Ser. B 10, 75–82 (1995)
Liu, J., Li, S.J.: A projection method for convex constrained monotone nonlinear equations with applications. Comput. Math. Appl. 70(10), 2442–2453 (2015)
Liu, P., Wu, X., Shao, H., Zhang, Y., Cao, S.: Three adaptive hybrid derivative-free projection methods for constrained monotone nonlinear equations and their applications. Numer. Linear Algebra Appl. 30(2), e2471 (2023)
Liu, Y., Storey, C.: Efficient generalized conjugate gradient algorithms. Part 1: Theory. J. Optim. Theory Appl. 69, 129–137 (1991)
Ma, G., Liu, L., Jian, J., Yan, X.: A new hybrid CGPM-based algorithm for constrained nonlinear monotone equations with applications. J. Appl. Math. Comput. 70(1), 103–147 (2024)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)
Papp, Z., Rapajić, S.: FR type methods for systems of large-scale nonlinear monotone equations. Appl. Math. Comput. 269, 816–823 (2015)
Polak, E., Ribière, G.: Note sur la convergence de méthodes de directions conjuguées. Rev. Française Informat. Recherche Opérationnelle 16, 35–43 (1969)
Polyak, B.T.: The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 9, 94–112 (1969)
Powell, M.J.D.: Restart procedures for the conjugate gradient method. Math. Program. 12, 241–254 (1977)
Solodov, M.V., Svaiter, B.F.: A globally convergent inexact Newton method for systems of monotone equations. In: Fukushima, M., Qi, L. (eds.) Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, pp. 355–369. Kluwer Academic, Dordrecht (1998)
Wang, X.: A class of spectral three-term descent Hestenes–Stiefel conjugate gradient algorithms for large-scale unconstrained optimization and image restoration problems. Appl. Numer. Math. (2023)
Waziri, M.Y., Ahmed, K.: Two descent Dai–Yuan conjugate gradient methods for systems of monotone nonlinear equations. J. Sci. Comput. 90, 1–53 (2022)
Wu, X., Shao, H., Liu, P., Zhang, Y., Zhuo, Y.: An efficient conjugate gradient-based algorithm for unconstrained optimization and its projection extension to large-scale constrained nonlinear equations with applications in signal recovery and image denoising problems. J. Comput. Appl. Math. 422, 114879 (2023)
Xiao, Y., Zhu, H.: A conjugate gradient method to solve convex constrained monotone equations with applications in compressive sensing. J. Math. Anal. Appl. 405(1), 310–319 (2013)
Yan, Q.R., Peng, X.Z., Li, D.H.: A globally convergent derivative-free method for solving large-scale nonlinear monotone equations. J. Comput. Appl. Math. 234, 649–657 (2010)
Zhang, L., Zhou, W.: Spectral gradient projection method for solving nonlinear monotone equations. J. Comput. Appl. Math. 196, 478–484 (2006)
Zhao, Y.B., Li, D.: Monotonicity of fixed point and normal mapping associated with variational inequality and its application. SIAM J. Optim. 11(4), 962–973 (2001)
Zhou, W.J., Li, D.H.: Limited memory BFGS method for nonlinear monotone equations. J. Comput. Math. 25, 89–96 (2007)
Funding
This research is supported by the Science Fund of the Republic of Serbia, Grant No. 7359, Project title: LASCADO. Supak Phiangsungnoen acknowledges Rajamangala University of Technology Rattanakosin for the financial support provided by Thailand Science Research and Innovation (TSRI) and the Fundamental Fund of Rajamangala University of Technology Rattanakosin under contract No. FRB6719/2567, and the NSRF via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation [grant number B41G670027].
Author information
Authors and Affiliations
Contributions
Writing the manuscript was jointly done by Z.P., S.R., A.H.I. and S.P. More precisely, Z.P., S.R. and S.P. formulated the problem and carried out the convergence analysis. The main manuscript text was written by Z.P., S.R. and S.P. A.H.I. implemented the numerical experiments for the large-scale systems. S.P. implemented the signal reconstruction experiment. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
This article does not contain any studies with human participants or animals performed by any of the authors.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Papp, Z., Rapajić, S., Ibrahim, A.H. et al. Hybrid Hu–Storey type methods for large-scale nonlinear monotone systems and signal recovery. J Inequal Appl 2024, 110 (2024). https://doi.org/10.1186/s13660-024-03187-1
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13660-024-03187-1