Hybrid Hu-Storey type methods for large-scale nonlinear monotone systems and signal recovery
Journal of Inequalities and Applications volume 2024, Article number: 110 (2024)
Abstract
We propose two hybrid methods for solving large-scale monotone systems, based on a derivative-free conjugate gradient approach and a hyperplane projection technique. The conjugate gradient approach is efficient for large-scale systems because of its low memory requirements, while the projection strategy is well suited to monotone equations because it enables simple globalization. A derivative-free, function-value-based line search is combined with Hu-Storey type search directions and the projection procedure in order to construct globally convergent methods. Furthermore, the proposed methods are applied to a number of large-scale monotone nonlinear systems and to the reconstruction of sparse signals. Numerical experiments indicate the robustness of the proposed methods.
1 Introduction
Nonlinear systems arise in many mathematical models from applied disciplines. Reformulations of different problems lead to systems of nonlinear equations, so various iterative methods for solving nonlinear systems have been developed. In this paper, we consider the nonlinear system
$$ F(x)=0, $$(1)
where the function \(F: \mathbb{R}^{n}\rightarrow \mathbb{R}^{n}\) is continuous and monotone. Monotonicity means that
$$ (F(x)-F(y))^{T}(x-y)\geq 0 \quad \text{for all } x,y\in \mathbb{R}^{n}. $$(2)
A large group of numerical methods for solving monotone systems has been developed during the last decade (see [1, 20–22, 24, 29, 33, 35, 43]). The pioneering work on nonlinear monotone systems is [41], where the authors introduced a projection technique that fully exploits the monotonicity of F. The whole sequence of iterations generated by the proposed method converges globally to a solution of system (1) without any regularity assumptions. This projection strategy is applied in [49], where a limited memory BFGS method suitable for large-scale monotone systems is proposed. Furthermore, a spectral projection method for nonlinear monotone equations is introduced in [47]. It is well known that spectral gradient and conjugate gradient (CG) methods can successfully cope with large-scale optimization problems. CG methods and their modifications for unconstrained optimization problems are presented in many papers [3, 4, 7, 18, 23, 26, 42, 44]. The low memory requirements of CG methods motivated authors to adopt this technique for solving large-scale monotone systems.
The hyperplane projection procedure guarantees simple globalization and is therefore useful for monotone systems. Since CG methods are computationally inexpensive, they are suitable for large-scale systems. For this reason, some methods presented in [20–22, 37] combine CG directions with the projection technique in order to solve large-scale nonlinear monotone systems.
The globalization strategy for solving (1) is usually based on a line search with a merit function. Many methods that use a merit function are of practical importance, but they also have some disadvantages. Each of these algorithms generates a sequence whose accumulation point is only a stationary point of the merit function, so regularity conditions are necessary to guarantee that this stationary point is a solution of (1). Regularity is also required in order to prove that the whole sequence of iterations converges to the solution. In addition, boundedness of the level set of the merit function is needed to ensure the existence of an accumulation point. These disadvantages and the idea of exploiting the monotonicity of F motivated the authors of [41] to introduce a projection technique and to propose a globally convergent inexact Newton method for solving monotone systems. Inspired by that, many methods without a merit function, which use a derivative-free line search, have been developed (see [1, 20–22, 24, 29, 33, 35, 43]).
In this paper, we propose two hybrid methods based on a derivative-free CG approach and the hyperplane projection technique. A function-value-based line search without derivatives, combined with Hu-Storey type search directions and the projection procedure, is used as the globalization strategy. Both methods use the projection technique proposed in [41], which generates a sequence \(\{{z}_{k}\}\) defined by \({z}_{k}={x}_{k}+\alpha _{k} {d}_{k}\), where \({x}_{k}\) is the current iterate and the step length \(\alpha _{k}>0\) is determined by a line search procedure along the given direction \({d}_{k}\) such that
$$ F(z_{k})^{T}(x_{k}-z_{k})>0. $$(3)
Note that by monotonicity of F, we have
$$ F(z_{k})^{T}(x^{*}-z_{k})\leq 0 $$(4)
for every solution \(x^{*}\) of the system (1). Thus the hyperplane
$$ \mathcal{H}_{k}=\{x\in \mathbb{R}^{n} : F(z_{k})^{T}(x-z_{k})=0\} $$(5)
strictly separates the current iterate \(x_{k}\) from the solution set of equation (1). Once the hyperplane \(\mathcal{H}_{k}\) is obtained, the next iterate \(x_{k+1}\) is computed by projecting the current iterate \(x_{k}\) onto it:
$$ x_{k+1}=x_{k}-\frac{F(z_{k})^{T}(x_{k}-z_{k})}{\|F(z_{k})\|^{2}}F(z_{k}). $$(6)
The line search procedure affects the computational complexity of algorithm. Because of that, we use derivative-free line search rule without merit function
based only on evaluation of function F, where \(d_{k}\in \mathbb{R}^{n}\) is a hybrid search direction, which should satisfy sufficient descent condition.
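The projection step described above can be illustrated with a minimal Python sketch. It assumes the standard formula for the orthogonal projection of \(x_{k}\) onto the hyperplane through \(z_{k}\) with normal \(F(z_{k})\); the function names are illustrative and do not come from the paper.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_hyperplane(x, z, F_z):
    """Project x onto the hyperplane {y : F_z^T (y - z) = 0}.

    x   : current iterate x_k
    z   : trial point z_k = x_k + alpha_k * d_k
    F_z : value F(z_k), used as the normal vector of the hyperplane
    """
    norm_sq = dot(F_z, F_z)
    if norm_sq == 0.0:
        return list(z)  # F(z_k) = 0, so z_k already solves the system
    coeff = dot(F_z, [xi - zi for xi, zi in zip(x, z)]) / norm_sq
    return [xi - coeff * fi for xi, fi in zip(x, F_z)]


# Toy illustration with the monotone map F(x) = x, whose only zero is x* = 0.
x_k = [2.0, 1.0]
z_k = [1.0, 0.5]
x_next = project_onto_hyperplane(x_k, z_k, z_k)  # F(z_k) = z_k for this F
print(x_next, math.sqrt(dot(x_next, x_next)))
```

In this toy case the new point lies on the hyperplane and is closer to the solution than \(x_{k}\), which is the property the separation argument relies on.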
Methods are proposed in Sect. 2. Convergence analysis is presented in Sect. 3. Numerical performances of methods and their application for signal recovery are analyzed and compared with other known methods in Sect. 4 and Sect. 5, respectively. Final conclusions are given in the last section. Throughout the paper, the Euclidean norm \(\|\cdot \|\) is used.
2 Algorithm
As mentioned before, the CG approach for solving unconstrained optimization problems and nonlinear systems is characterized by low memory requirements and is therefore suitable for large-scale problems. An excellent review of CG methods for unconstrained optimization problems is given in [17]. Different CG algorithms correspond to different choices of the update parameter \(\beta _{k}\), so a crucial element in any CG method is the proper choice of this parameter. Fletcher–Reeves (FR) type methods for unconstrained optimization problems [6, 10, 11, 15, 31, 40] have strong convergence properties, but may have modest practical performance due to jamming: these algorithms can take many short steps without making significant progress towards the solution of the optimization problem. On the other hand, Polak–Ribière–Polyak (PRP) type methods [38, 39] may not be convergent in general, but often have better computational performance. Therefore, combinations of these methods have been proposed with the aim of exploiting the advantages of both FR and PRP methods. One of the first hybrid methods for unconstrained optimization problems was proposed by Hu and Storey [19], where
$$ \beta _{k}^{HuS}=\max \{0,\min \{\beta _{k}^{PRP},\beta _{k}^{FR}\}\}. $$(8)
Motivated by CG methods for unconstrained optimization problems, some authors have adapted the CG approach to monotone systems [9, 30, 37, 46]. In this paper, we extend the Hu-Storey idea given in [19] to nonlinear monotone systems. First, we define the Hu-Storey (HuS) search direction of the form
$$ d_{k}^{HuS}= \begin{cases} -F_{k}, & k=0, \\ -F_{k}+\beta _{k}^{HuS}w_{k-1}, & k\geq 1, \end{cases} $$(9)
where the update parameter \(\beta _{k}^{HuS}\) is defined by (8), \(F_{k}=F(x_{k})\), \(\beta _{k}^{FR}=\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\), \(\beta _{k}^{PRP}=\frac{F_{k}^{T} y_{k-1}}{\|F_{k-1}\|^{2}}\), \(y_{k-1}=F_{k}-F_{k-1}\) and \(w_{k}=z_{k}-x_{k}=\alpha _{k} d_{k}\).
The globalization procedure with the derivative-free, function-value-based line search (7) requires that the search direction satisfy the sufficient descent condition
$$ F_{k}^{T}d_{k}\leq -\tau \|F_{k}\|^{2}, $$(10)
where \(\tau >0\). However, the HuS search direction (9) does not satisfy (10) in general. So, if this condition is not satisfied, we take the steepest descent direction, i.e. \(d_{k}=-F_{k}\).
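The fact that the HuS direction (9) may fail to be a sufficient descent direction inspired us to introduce another Hu–Storey type search direction which always satisfies (10). In [8], Cheng proposed a two-term PRP method for unconstrained optimization problems. The authors of [30] used this idea to construct a two-term PRP method for solving nonlinear monotone systems. Motivated by this and by the Hu-Storey update parameter, we define the following two-term Hu–Storey search direction
$$ d_{k}^{THuS}= \begin{cases} -F_{k}, & k=0, \\ -F_{k}+\beta _{k}^{HuS}\left (I-\frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}, & k\geq 1, \end{cases} $$(11)
which always satisfies the sufficient descent property (10) for any \(\tau \in (0,1]\), since
$$ F_{k}^{T}d_{k}^{THuS}=-\|F_{k}\|^{2}+\beta _{k}^{HuS}\left (F_{k}^{T}w_{k-1}-\frac{F_{k}^{T}F_{k}}{F_{k}^{T}F_{k}}F_{k}^{T}w_{k-1}\right )=-\|F_{k}\|^{2}. $$(12)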
Both directions (9) and (11) are combinations of FR and PRP type directions with the aim to exploit their attractive properties. So, when the iteration jams, the PRP direction is used if \(\beta _{k}^{PRP}>0\), while the steepest descent direction is used if \(\beta _{k}^{PRP}\leq 0\). Otherwise, FR type direction is used. Now, we can state the algorithm as follows.
Algorithm 1 with HuS search direction \(d_{k}=d_{k}^{HuS}\) defined by (9) will be named HuS method, while Algorithm 1 with two-term HuS search direction \(d_{k}=d_{k}^{THuS}\), defined by (11), will be named two-term HuS method.
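The overall scheme can be sketched in Python as follows. This is a hedged illustration rather than the authors' Algorithm 1, whose statement is not reproduced here. The assumptions are: the Hu-Storey parameter is taken as \(\max \{0,\min \{\beta _{k}^{PRP},\beta _{k}^{FR}\}\}\); the line search accepts \(\alpha \) when \(-F(x_{k}+\alpha d_{k})^{T}d_{k}\geq \sigma \alpha \|d_{k}\|^{2}\), a commonly used derivative-free condition that may differ from rule (7); backtracking starts from \(\alpha =1\) instead of the paper's initial step length; and the default value of `tau` and the names `hus_beta`, `direction` and `solve` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hus_beta(F_k, F_prev):
    """Assumed Hu-Storey parameter max{0, min{beta_PRP, beta_FR}}, with F in place of a gradient."""
    denom = dot(F_prev, F_prev)
    beta_fr = dot(F_k, F_k) / denom
    beta_prp = dot(F_k, [a - b for a, b in zip(F_k, F_prev)]) / denom
    return max(0.0, min(beta_prp, beta_fr))


def direction(F_k, F_prev, w_prev, two_term, tau):
    """HuS direction with steepest-descent fallback, or the two-term HuS direction."""
    if F_prev is None:  # first iteration
        return [-f for f in F_k]
    beta = hus_beta(F_k, F_prev)
    if two_term:
        # Remove from w_{k-1} its component along F_k, so that F_k^T d_k = -||F_k||^2.
        c = dot(F_k, w_prev) / dot(F_k, F_k)
        w_perp = [wi - c * fi for wi, fi in zip(w_prev, F_k)]
        return [-fi + beta * wi for fi, wi in zip(F_k, w_perp)]
    d = [-fi + beta * wi for fi, wi in zip(F_k, w_prev)]
    if dot(F_k, d) > -tau * dot(F_k, F_k):  # sufficient descent fails
        d = [-f for f in F_k]
    return d


def solve(F, x0, two_term=False, sigma=0.3, rho=0.7, tau=1e-8, eps=1e-4, max_iter=10000):
    """Derivative-free projection method; returns (approximate solution, iterations used)."""
    x = list(x0)
    F_prev = w_prev = None
    for k in range(max_iter):
        F_k = F(x)
        if math.sqrt(dot(F_k, F_k)) <= eps:
            return x, k
        d = direction(F_k, F_prev, w_prev, two_term, tau)
        alpha = 1.0  # assumed initial trial step
        for _ in range(60):  # capped backtracking
            z = [xi + alpha * di for xi, di in zip(x, d)]
            F_z = F(z)
            if -dot(F_z, d) >= sigma * alpha * dot(d, d):
                break
            alpha *= rho
        else:
            raise RuntimeError("line search did not terminate")
        if dot(F_z, F_z) == 0.0:
            return z, k + 1  # the trial point is an exact solution
        w = [zi - xi for zi, xi in zip(z, x)]  # w_k = z_k - x_k
        coeff = -dot(F_z, w) / dot(F_z, F_z)   # F(z_k)^T (x_k - z_k) / ||F(z_k)||^2
        x = [xi - coeff * fi for xi, fi in zip(x, F_z)]  # hyperplane projection
        F_prev, w_prev = F_k, w
    return x, max_iter


# Illustrative monotone test map (not one of the paper's test problems).
F = lambda x: [2.0 * xi - math.sin(xi) for xi in x]
print(solve(F, [1.0, -2.0, 0.5, 3.0, -1.0], two_term=True))
```

Only evaluations of F are needed per iteration, and the stored vectors are \(x_{k}\), \(F_{k}\), \(F_{k-1}\) and \(w_{k-1}\), which is what makes this family of methods attractive for large n.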
3 Convergence analysis
The convergence analysis of methods is performed under standard assumptions, using the same idea as in [37]. First, for both methods it will be proved that directions \({ d}_{k}\) generated by Algorithm 1 satisfy sufficient descent property (10), then that sequences \(\{{ x}_{k}\}\) and \(\{ { d}_{k} \}\) are bounded, later that algorithm is well defined and finally that the whole sequence generated by Algorithm 1 globally converges to the solution of monotone system (1).
From now on, we assume that the following conditions hold:
- A1: the function \(F( x)\) is monotone on \(\mathbb{R}^{n}\),
- A2: the function \(F( x)\) is Lipschitz continuous on \(\mathbb{R}^{n}\),
- A3: the solution set of system (1) is nonempty.
The construction of Algorithm 1 ensures that generated search directions satisfy sufficient descent condition (10).
Lemma 3.1
The directions \({ d}_{k}\) generated by Algorithm 1 satisfy condition (10) for every \(k\in \mathbb{N}\cup \{0\}\).
Proof
For HuS method, it is not difficult to see that all directions \({ d}_{k}\) satisfy (10), because if \({ d}_{k}^{HuS}\) does not satisfy it, then \({ d}_{k}=-{ F}_{k}\) which satisfies it. Considering two-term HuS method, it is clear that all directions satisfy (10) because of (12). □
The next lemma indicates that the sequence of distances between iterations generated by Algorithm 1 and the solution set is decreasing, which essentially ensures global convergence. This lemma can be proved in the same way as in [41] and also guarantees that the sequence \(\{ x_{k}\}\) is bounded.
Lemma 3.2
[41] Suppose that assumptions A1–A3 are satisfied and the sequence \(\{{ x}_{k}\}\) is generated by Algorithm 1. For any solution \({ x}^{*}\) of system (1),
$$ \|x_{k+1}-x^{*}\|^{2}\leq \|x_{k}-x^{*}\|^{2}-\|x_{k+1}-x_{k}\|^{2} $$(13)
holds and the sequence \(\{{ x}_{k}\}\) is bounded. Furthermore, either \(\{{ x}_{k}\}\) is finite and the last iterate is a solution of (1), or \(\{{ x}_{k}\}\) is infinite and
$$ \lim _{k\rightarrow \infty}\|x_{k+1}-x_{k}\|=0. $$(14)
The boundedness of every search direction generated by Algorithm 1 is necessary for further analysis and it is proved in the next theorem.
Theorem 3.3
Suppose that assumptions A1–A3 are satisfied and the sequence of directions \(\{{ d}_{k}\}\) is generated by the HuS method or the two-term HuS method. Let \(\varepsilon _{0}>0\) be a constant such that
$$ \|F_{k}\|\geq \varepsilon _{0} $$(15)
holds for every \(k\in \mathbb{N}\cup \{0\}\). Then the directions \(\{{ d}_{k}\}\) are bounded, i.e., there exists a constant \(M>0\) such that
$$ \|d_{k}\|\leq M $$(16)
holds for every \(k\in \mathbb{N}\cup \{0\}\).
Proof
From Lemma 3.2 and assumption A2, it follows that \(\|{ F}({ x}_{k})\|=\|{ F}({ x}_{k})-{ F}({ x}^{*})\|\leq L\|{ x}_{k}-{ x}^{*}\|\leq L\|{ x}_{0}-{ x}^{*}\|\) for every \(k\in \mathbb{N}\cup \{0\}\), so taking \(\kappa =L\|{ x}_{0}-{ x}^{*}\|\), we have
$$ \|F_{k}\|\leq \kappa $$(17)
for every \(k\in \mathbb{N}\cup \{0\}\). The Cauchy–Schwarz inequality applied to (6) implies
while (6) and line search condition (7) indicate
Finally, (14) and (19) guarantee
$$ \lim _{k\rightarrow \infty}\alpha _{k}\|d_{k}\|=0. $$(20)
Based on proposed search directions \({ d}_{k}^{HuS}\) (9) and \({ d}_{k}^{THuS}\) (11), we should distinguish the proofs of boundedness of directions generated by HuS and two-term HuS methods.
Boundedness of directions generated by HuS method: In HuS method, the direction \(d_{k}\) in Step 2 of Algorithm 1 is defined by \({ d}_{k}^{HuS}\) (9).
-
If \(k=0\) or \(\beta _{k}^{HuS}=0\) or direction \(d_{k}={ d}_{k}^{HuS}\) does not satisfy the sufficient descent property (10), then \({ d}_{k}=-{ F}_{k}\). Therefore, by (17)
$$ \|{ d}_{k}\|=\|-{ F}_{k}\|\leq \kappa . $$(21)
-
If direction \(d_{k}={ d}_{k}^{HuS}\) satisfies the sufficient descent property (10) and \(\beta _{k}^{HuS}=\beta _{k}^{PRP}\), then \({ d}_{k}=-{ F}_{k}+\beta _{k}^{PRP}{ w}_{k-1}\). Assumption A2, (15), (17) and (18) imply
$$\begin{aligned} \|{ d}_{k}\| = & \|-{ F}_{k}+\beta _{k}^{PRP}{ w}_{k-1}\| \\ \leq & \|{ F}_{k}\|+ \frac{|{ F}_{k}^{T} { y}_{k-1}|}{\|{ F}_{k-1}\|^{2}}\|{ w}_{k-1}\| \\ \leq & \|{ F}_{k}\|+ \frac{\| F_{k}\| \| { y}_{k-1}\|}{\|{ F}_{k-1}\|^{2}} \|w_{k-1} \| \\ =&\|{ F}_{k}\|+ \frac{\|{ F}_{k}\| \| { F}_{k}-{ F}_{k-1}\|}{\|{ F}_{k-1}\|^{2}} \alpha _{k-1}\|{ d}_{k-1}\| \\ \leq & \kappa + \frac{\kappa L\| { x}_{k}-{ x}_{k-1}\|}{\varepsilon _{0}^{2}}\alpha _{k-1} \|{ d}_{k-1}\| \\ \leq & \kappa +\frac{\kappa L}{\varepsilon _{0}^{2}} \alpha _{k-1}^{2} \|{ d}_{k-1}\|^{2} \\ =&\kappa \left (1+\frac{L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|{ d}_{k-1} \|^{2}\right ) \end{aligned}$$(22) for all \(k\in \mathbb{N}\). From (20) it follows that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|{ d}_{k-1}\|<\varepsilon _{1}\) holds for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}\) and \(M_{1}=\max \{\|{ d}_{0}\|,\|{ d}_{1}\|,\ldots ,\|{ d}_{k_{0}}\|, \widetilde{M}_{1}\}\), where \(\widetilde{M}_{1}=\kappa (1+L)\), it holds that
$$ \|{ d}_{k}\|\leq M_{1} $$(23) for every \(k\in \mathbb{N}\).
-
If direction \(d_{k}={ d}_{k}^{HuS}\) satisfies the sufficient descent property (10) and \(\beta _{k}^{HuS}=\beta _{k}^{FR}\), then \({ d}_{k}=-{ F}_{k}+\beta _{k}^{FR}{ w}_{k-1}\). From (15), (17) and (18), we have
$$\begin{aligned} \|{ d}_{k}\| = & \|-{ F}_{k}+\beta _{k}^{FR}{ w}_{k-1}\| \\ \leq & \|{ F}_{k}\|+\frac{\|{ F}_{k}\|^{2}}{\|{ F}_{k-1}\|^{2}}\|{ w}_{k-1} \| \\ \leq & \kappa +\frac{\kappa ^{2}}{\varepsilon _{0}^{2}}\alpha _{k-1} \|{ d}_{k-1}\| \\ =& \kappa \left (1+\frac{\kappa}{\varepsilon _{0}^{2}}\alpha _{k-1} \|{ d}_{k-1}\|\right ) \end{aligned}$$(24) for all \(k\in \mathbb{N}\), and (20) implies that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|{ d}_{k-1}\|<\varepsilon _{1}\) holds for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}^{2}\) and \(M_{2}=\max \{\|{ d}_{0}\|,\|{ d}_{1}\|,\ldots ,\|{ d}_{k_{0}}\|, \widetilde{M}_{2}\}\), where \(\widetilde{M}_{2}=\kappa (1+\kappa )\), it holds that
$$ \|{ d}_{k}\|\leq M_{2} $$(25) for every \(k\in \mathbb{N}\).
Relations (21), (23) and (25) imply that search directions \({ d}_{k}\) from HuS method are bounded, i.e., \(\|{ d}_{k}\|\leq M\) for every \(k\in \mathbb{N}\cup \{0\}\), where \(M=\max \{\kappa , M_{1},M_{2}\}\).
Boundedness of directions generated by two-term HuS method: In two-term HuS method, the direction \(d_{k}\) in Step 2 of Algorithm 1 is defined by \({ d}_{k}^{THuS}\) (11).
-
If \(k=0\) or \(\beta _{k}^{HuS}=0\) then \({ d}_{k}=-{ F}_{k}\) and therefore (21) holds by (17).
-
If \(\beta _{k}^{HuS}=\beta _{k}^{PRP}\) then \(d_{k}=-F_{k}+\beta _{k}^{PRP}\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\). Assumption A2, (15), (17) and (18) imply
$$\begin{aligned} \|d_{k}\| = & \|-F_{k}+\beta _{k}^{PRP}\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{|F_{k}^{T} y_{k-1}|}{\|F_{k-1}\|^{2}}\|\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{|F_{k}^{T} y_{k-1}|}{\|F_{k-1}\|^{2}}\|w_{k-1} \| \\ \leq & \|F_{k}\|+ \frac{\| F_{k}\| \| y_{k-1}\|}{\|F_{k-1}\|^{2}} \alpha _{k-1}\|d_{k-1}\| \\ =&\|F_{k}\|+\frac{\|F_{k}\| \| F_{k}-F_{k-1}\|}{\|F_{k-1}\|^{2}} \alpha _{k-1}\|d_{k-1}\| \\ \leq & \kappa + \frac{\kappa L\| x_{k}-x_{k-1}\|}{\varepsilon _{0}^{2}}\alpha _{k-1} \|d_{k-1}\| \\ \leq & \kappa +\frac{\kappa L}{\varepsilon _{0}^{2}} \alpha _{k-1}^{2} \|d_{k-1}\|^{2} \\ =&\kappa \left (1+\frac{L}{\varepsilon _{0}^{2}}\alpha _{k-1}^{2}\|d_{k-1} \|^{2}\right ) \end{aligned}$$(26) for all \(k\in \mathbb{N}\).
From (20) it follows that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|{ d}_{k-1}\|<\varepsilon _{1}\) holds for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}\) and \(M_{3}=\max \{\|{ d}_{0}\|,\|{ d}_{1}\|,\ldots ,\|{ d}_{k_{0}}\|, \widetilde{M}_{3}\}\), where \(\widetilde{M}_{3}=\kappa (1+L)\), it holds that
$$ \|{ d}_{k}\|\leq M_{3} $$(27) for every \(k\in \mathbb{N}\).
-
If \(\beta _{k}^{HuS}=\beta _{k}^{FR}\), then \({ d}_{k}=-F_{k}+\beta _{k}^{FR}\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\). From (15), (17) and (18), we have
$$\begin{aligned} \|d_{k}\| =& \|-F_{k}+\beta _{k}^{FR}\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\|\left (I- \frac{F_{k}F_{k}^{T}}{F_{k}^{T}F_{k}}\right )w_{k-1}\| \\ \leq & \|F_{k}\|+\frac{\|F_{k}\|^{2}}{\|F_{k-1}\|^{2}}\|w_{k-1}\| \\ \leq & \kappa +\frac{\kappa ^{2}}{\varepsilon _{0}^{2}}\alpha _{k-1} \|d_{k-1}\| \\ =& \kappa \left (1+\frac{\kappa}{\varepsilon _{0}^{2}}\alpha _{k-1} \|d_{k-1}\|\right ) \end{aligned}$$(28) for all \(k\in \mathbb{N}\), and (20) implies that for every \(\varepsilon _{1}>0\) there exists \(k_{0}\) such that \(\alpha _{k-1}\|{ d}_{k-1}\|<\varepsilon _{1}\) holds for every \(k>k_{0}\). Choosing \(\varepsilon _{1}=\varepsilon _{0}^{2}\) and \(M_{4}=\max \{\|{ d}_{0}\|,\|{ d}_{1}\|,\ldots ,\|{ d}_{k_{0}}\|, \widetilde{M}_{4}\}\), where \(\widetilde{M}_{4}=\kappa (1+\kappa )\), it holds that
$$ \|{ d}_{k}\|\leq M_{4} $$(29) for every \(k\in \mathbb{N}\).
Relations (21), (27) and (29) imply that search directions \({ d}_{k}\) from two-term HuS method are bounded, i.e., \(\|{ d}_{k}\|\leq M\) for every \(k\in \mathbb{N}\cup \{0\}\), where \(M=\max \{\kappa , M_{3},M_{4}\}\). □
The line search rule (7) used in Algorithm 1 is a derivative-free, function-value-based line search, because it uses neither derivatives nor a merit function. Since all directions \({ d}_{k}\) generated by Algorithm 1 are bounded and satisfy condition (10), it is easy to see that the line search condition is satisfied for every sufficiently small step length \(\alpha _{k}>0\), which can be obtained by finite backtracking. This means that the line search is well defined and, therefore, Algorithm 1 is well defined.
Lemma 3.4
[5] Suppose that all conditions of Theorem 3.3 are satisfied. Then the line search procedure in Algorithm 1 is well defined.
The global convergence of Algorithm 1, with bounded sequence of directions \(\{{ d}_{k}\}\) which satisfy condition (10), is given in the next theorem. It can be proved in a similar way as in [5], so the proof is omitted.
Theorem 3.5
Suppose that assumptions A1–A3 are satisfied and the sequence \(\{{ x}_{k}\}\) is generated by Algorithm 1. Then
$$ \liminf _{k\rightarrow \infty}\|F(x_{k})\|=0. $$(30)
An important property of Algorithm 1 is the fact that the whole sequence of iterations converges to a solution of the system with no regularity assumptions. Since F is continuous and \(\{{ x}_{k}\}\) is bounded, from (30) it is clear that the sequence \(\{{ x}_{k}\}\) generated by Algorithm 1 has an accumulation point \({ x}^{*}\) which is a solution of system (1). Besides, Lemma 3.2 implies that the sequence \(\{\|{ x}_{k}-{ x}^{*}\|\}\) is convergent, which shows that the whole sequence \(\{{ x}_{k}\}\) generated by Algorithm 1 converges globally to the solution \({ x}^{*}\) of system (1).
4 Numerical experiments
Some numerical experiments are presented in this section. The efficiency and robustness of both methods are tested on a collection of problems taken from [9, 36, 46, 49] and compared with numerical performances of methods proposed in [2, 5, 9, 37]. As in [37] we tested Problems 1–3 from [49] with dimensions \(1000,20000\) and 50000, Problem 1 from [9] with dimensions \(1000,20000\) and 50000, Problem 2 from [9] with dimensions 1000 and 5000, Problem 2 from [46] with dimension 1000, Problem 3 from [46] with dimensions 1000 and 3000, Problem 4 from [46] with dimensions \(1000, 20000\) and 50000 and monotone nonlinear system with dimension 20164, given in [36] and obtained by discretization of Dirichlet problem. Algorithms are implemented in Matlab R2013b environment on a 2.80 GHz Intel Core i7 860 processor computer with 8 GB of RAM.
The main stopping criterion, as in [5, 37], is
where \(\varepsilon =10^{-4}\); if it is not satisfied, the algorithms are stopped after the maximum number of iterations \(k_{max}=500000\). All methods are implemented with parameters \(\sigma = 0.3\) and \(\rho = 0.7\) and tested with the following 8 starting points: \({ x}_{0}^{1}=10\cdot 1^{n\times 1}\), \({ x}_{0}^{2}=-10\cdot 1^{n\times 1}\), \({ x}_{0}^{3}=1^{n\times 1}\), \({ x}_{0}^{4}=-1^{n\times 1}\), \({ x}_{0}^{5}=0.1\cdot 1^{n\times 1}\), \({ x}_{0}^{6}=[1,\frac{1}{2},\frac{1}{3},\ldots ,\frac{1}{n}]^{T}\), \({ x}_{0}^{7}=[\frac{1}{n},\frac{2}{n},\ldots ,\frac{n-1}{n},1]^{T}\), \({ x}_{0}^{8}=[1-\frac{1}{n},1-\frac{2}{n},\ldots ,1-\frac{n-1}{n},0]^{T}\), where n is the dimension of the system.
In the same way as in [5, 30, 37, 46], the initial step length α in kth iteration is defined by
where \(t=10^{-8}\).
The performance profile, proposed in [13], is used for comparing the efficiency and robustness of all methods. Let us denote the solver \(s\in \mathcal{S}\) from the set of solvers \(\mathcal{S}\), the problem \(p\in \mathcal{P}\) from the set of problems \(\mathcal{P}\) and \(m_{s,p}\) the performance measurement required to solve problem p by solver s. The performance ratio compares the performance of solver s on problem p with the best performance of any solver on this problem. It is defined by \(r_{s,p}=\frac{m_{s,p}}{\min \{m_{s,p}:s\in \mathcal{S}\}}\) if problem p is solved by solver s, or with \(r_{s,p}=r_{M}\) otherwise, where \(r_{M}\) is a fixed parameter. The probability for solver s that a performance ratio is within a factor \(t\in \mathbb{R}\) of the best possible ratio is defined by \(\rho _{s}(t)=\frac{1}{n_{p}}\text{size}\{p\in \mathcal{P}:r_{s,p} \leq t\}\), where \(n_{p}\) is the number of all problems.
The function \(\rho _{s}:\mathbb{R}\rightarrow [0,1]\) is the cumulative distribution function of the performance ratio \(r_{s,p}\). It represents the performance of solver s and provides the following information. The efficiency of solver s, which is the percentage of problems it solves most quickly, is given by the value \(\rho _{s}(1)\). The solver s̄ which maximizes \(\rho _{s}(1)\) is the most efficient: it solves the largest number of problems at the lowest value of the performance measure m. The robustness of solver s is represented by the value \(\bar{t}\in [1,r_{M}]\) for which \(\rho _{s}(\bar{t})=1\). The solver ŝ for which \(\bar{t}_{\hat{s}}=\min \{\bar{t}_{s},\forall s\in \mathcal{S}\}\) is the most robust solver.
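The performance profile defined above can be computed with a few lines of Python. The sketch below follows the definitions of \(r_{s,p}\) and \(\rho _{s}(t)\) given in the text; the data layout (a dictionary of per-solver lists, with `None` marking a failure) and the function name are assumptions made for illustration, and all measurements are assumed to be strictly positive.

```python
def performance_profile(measures, t, r_max=1000.0):
    """Return rho_s(t) for every solver s.

    measures : dict mapping solver name -> list of costs m_{s,p}, one per problem,
               with None where the solver failed on that problem
    t        : factor at which the profile is evaluated
    r_max    : ratio r_M assigned to failures
    """
    solvers = list(measures)
    n_p = len(measures[solvers[0]])
    ratios = {s: [] for s in solvers}
    for p in range(n_p):
        solved = [measures[s][p] for s in solvers if measures[s][p] is not None]
        best = min(solved) if solved else None
        for s in solvers:
            m = measures[s][p]
            # Failed runs receive the fixed ratio r_M.
            ratios[s].append(r_max if m is None else m / best)
    return {s: sum(r <= t for r in ratios[s]) / n_p for s in solvers}


# Made-up iteration counts for two hypothetical solvers on four problems.
measures = {"A": [10.0, 20.0, None, 8.0], "B": [12.0, 10.0, 30.0, 8.0]}
print(performance_profile(measures, 1.0))  # {'A': 0.5, 'B': 0.75}
print(performance_profile(measures, 2.0))  # {'A': 0.75, 'B': 1.0}
```

In this made-up example solver B is both more efficient (larger \(\rho (1)\)) and more robust (its profile reaches 1 for a smaller t), which is how the figures in this section are read.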
We considered the number of iterations, the number of function evaluations and CPU time as measures of performance profile. In numerical experiments, we set \(r_{M}=1000\). Firstly, we compared HuS method with method proposed in [9], denoted by PRP. The efficiency and robustness of these algorithms are shown in Fig. 1, which presents performance profiles of methods. Figure 1 indicates that HuS method is more robust than PRP method taking into account the number of iterations, the number of function evaluations and CPU time, because its cumulative distribution function \(\rho (t)\) reaches the value 1 for smaller t. Besides, this figure reveals that both methods have similar efficiency in the sense of number of iterations and number of function evaluations.
Secondly, we compared HuS method with DLPM method from [2] and two three-term methods: DFPB1 method from [5] and M3TFR2 method proposed in [37]. The performance profiles of these algorithms are given in Fig. 2. This figure demonstrates again that HuS method is the most robust in the sense of number of iterations, number of function evaluations and CPU time. However, DLPM is the most efficient considering the number of iterations and number of function evaluations, while in case of CPU time, the most efficient is M3TFR2 which is followed by HuS method.
Then, we compared the two-term HuS method with the method proposed in [30], denoted by Li-Li. The performance profiles of these methods are given in Fig. 3, which shows that the two-term HuS method is more robust and more efficient than the Li-Li method in all cases. The two-term HuS method solves about 70% of all test problems with fewer iterations and fewer function evaluations, and about 55% of all problems in less CPU time.
The efficiency and robustness of the two-term HuS method are also compared with those of the DFPB1, M3TFR2 and DLPM methods. Figure 4 shows that the two-term HuS method is the most robust in all cases. DLPM is the most efficient in terms of the number of iterations and the number of function evaluations, followed by the two-term HuS method, while M3TFR2 is the most efficient in terms of CPU time, again followed by the two-term HuS method.
Next, we compared HuS method and two-term HuS method with EDYM1, EDYM2 from [43] and DFCG method from [27]. The performance profiles of these algorithms are given in Fig. 5. This figure demonstrates again that two-term HuS method is the most efficient and the most robust in the sense of number of iterations, number of function evaluations and CPU time.
Finally, we compared HuS method and two-term HuS method, whose performance profiles are given in Fig. 6. According to this figure, two-term HuS method is more robust in all cases and more efficient, if we consider the number of iterations and CPU time, while both methods have very similar efficiency taking into account the number of function evaluations. All numerical experiments indicate that both methods proposed in this paper have good computational performances. Presented figures show their great robustness and point out that these hybrid methods are successful and competitive with other methods given in [2, 5, 9, 37].
5 Application
In this section, we present an application of our new methods to signal recovery in compressed sensing. Compressed sensing can be modeled as the underdetermined linear equation
$$ b=Ax+\omega , $$(31)
where \(x \in \mathbb{R}^{n}\) is a vector with n nonzero components to be recovered, \(b \in \mathbb{R}^{m}\) is the observed or measured data with noise ω and \(A\in \mathbb{R}^{m\times n}\) is a linear mapping. It is well known that solving model (31) can be seen as solving the non-smooth convex unconstrained optimization problem
$$ \min _{x\in \mathbb{R}^{n}} \frac{1}{2}\|Ax-b\|_{2}^{2}+\tau \|x\|_{1}, $$(32)
where \(\tau >0\). From [45], we have seen that (32) can be written as the quadratic program problem with box constraints, which is equivalent to
where F is a vector valued function,
\(u\in \mathbb{R}^{n}\), \(v \in \mathbb{R}^{n}\), \(e_{n}\) is an n-dimensional vector with all elements one. Thus, (32) is equivalent to the nonlinear equation (33). Hence, it can be solved effectively by Algorithm 1.
Our interest in this section is to use the proposed methods HuS and two-term HuS (THuS) to reconstruct a length-n sparse signal from m observations, where \(m\ll n\). Due to the storage limitations of the PC, we test a small signal with \(n = 2048\) and \(m = 512\), where the original signal contains 128 randomly placed nonzero elements. The random matrix A is Gaussian and is generated by the command \(randn(m,n)\) in Matlab. In this experiment, the measurement b is disturbed by noise, where ω is Gaussian noise distributed as \(N(0, 10^{-4})\).
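A test instance of this kind can be generated as in the following Python sketch, which uses only the standard library instead of Matlab's `randn`. Several choices are assumptions not stated in the text: the nonzero entries of the signal are drawn from a standard normal distribution, \(N(0,10^{-4})\) is read as variance \(10^{-4}\) (standard deviation \(10^{-2}\)), and the function name and seed are illustrative.

```python
import random


def make_sparse_recovery_problem(n=2048, m=512, k=128, noise_std=1e-2, seed=0):
    """Build (A, x_true, b) with b = A x_true + noise.

    n, m      : signal length and number of measurements (m << n)
    k         : number of nonzero entries in the sparse signal
    noise_std : standard deviation of the Gaussian noise (assumed 1e-2)
    """
    rng = random.Random(seed)
    x_true = [0.0] * n
    for i in rng.sample(range(n), k):      # random support of size k
        x_true[i] = rng.gauss(0.0, 1.0)    # assumed amplitude distribution
    # Gaussian measurement matrix, the analogue of randn(m, n).
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    b = [sum(a * xj for a, xj in zip(row, x_true)) + rng.gauss(0.0, noise_std)
         for row in A]
    return A, x_true, b


# A small instance for quick experimentation; the sizes in the text are the defaults.
A, x_true, b = make_sparse_recovery_problem(n=64, m=16, k=4)
print(len(A), len(A[0]), len(b), sum(1 for v in x_true if v != 0.0))
```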
We compare the performance of the proposed methods with the CGD [45] and PCG [32] methods. The control parameters for the HuS and THuS methods were both set to \(\tau =10^{-8}\), \(\sigma =10^{-4}\), \(\rho =0.5\). For the compared methods, the control parameters were set as reported in their respective papers. The quality of the reconstructed signal is measured by the mean squared error (MSE)
$$ \text{MSE}=\frac{1}{n}\|x_{0}-x^{*}\|^{2}, $$
where \(x_{0}\) is the original signal and \(x^{*}\) is the restored signal.
The iterative process is initiated at the measurement image, that is, at \(x_{0} = A^{T}b\), and it is terminated when the relative change between successive iterations falls below \(10^{-4}\), that is,
$$ \frac{|f(x_{k})-f(x_{k-1})|}{|f(x_{k-1})|}< 10^{-4}, $$
where \(f(x) = \frac{1}{2} \|Ax -b\|_{2}^{2} + \tau \|x\|_{1}\) is the objective function. In this experiment, the parameter τ is chosen as suggested in [28], i.e., \(\tau = 0.008\|A^{T} b\|_{\infty}\).
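The quantities used in this experiment are simple to compute. The Python sketch below evaluates the objective f, the rule \(\tau = 0.008\|A^{T} b\|_{\infty}\), a relative-change stopping test on successive objective values, and the mean squared error taken as \(\frac{1}{n}\|x_{0}-x^{*}\|^{2}\). The exact forms of the stopping test and of the MSE are assumptions based on the verbal description, and all function names are illustrative.

```python
def objective(A, x, b, tau):
    """f(x) = 0.5 * ||A x - b||_2^2 + tau * ||x||_1, with A given as a list of rows."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + tau * sum(abs(xj) for xj in x)


def tau_from_data(A, b, factor=0.008):
    """tau = factor * ||A^T b||_inf."""
    n = len(A[0])
    atb = [sum(A[i][j] * b[i] for i in range(len(A))) for j in range(n)]
    return factor * max(abs(v) for v in atb)


def relative_change_small(f_new, f_old, tol=1e-4):
    """Assumed stopping test: |f_new - f_old| < tol * |f_old|."""
    return abs(f_new - f_old) < tol * abs(f_old)


def mse(x_true, x_rec):
    """Assumed mean squared error (1/n) * ||x_true - x_rec||^2."""
    return sum((a - c) ** 2 for a, c in zip(x_true, x_rec)) / len(x_true)


# Tiny hand-checkable example.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]
print(objective(A, [1.0, 1.0], b, 0.5))   # 1.5
print(mse([1.0, 2.0], [1.0, 0.0]))        # 2.0
```

Writing the stopping test as a product rather than a quotient avoids a division by zero when the previous objective value is exactly zero.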
The experiment is performed ten times to demonstrate the stability of the proposed methods. The results are presented in Table 1. It is clear from the obtained results that the original signal was recovered by all four methods. However, THuS performed better than the other three methods in terms of number of iterations, CPU time and MSE. The plot of the numerical results, consisting of the original sparse signal, the measurement, and the signal restored by each method, can be seen in Fig. 7. Furthermore, Fig. 8 shows the convergence behavior of each method in terms of relative errors and merit function values as the iteration number and computing time increase.
6 Conclusions
In this paper, we present two hybrid methods for solving large-scale monotone systems of nonlinear equations, which use the derivative-free CG approach and the hyperplane projection technique. The CG approach is efficient for large-scale systems because of its low memory requirements, while the projection procedure guarantees simple globalization for monotone systems. The methods incorporate hybrid Hu–Storey-type search directions, which combine PRP and FR-type directions with the aim of exploiting their advantages, and the hybrid scheme performs well in practice. The methods are based only on evaluations of the function F, because they use neither derivatives nor a merit function. The derivative-free, function-value-based line search is combined with Hu–Storey-type search directions and the projection technique in order to construct globally convergent methods. So, with no regularity and differentiability assumptions, the global convergence of the whole generated sequence is obtained. Preliminary numerical results highlight the robustness of the new methods compared to other algorithms applicable to large-scale monotone systems. In addition, these methods can also be applied to non-smooth monotone systems, and their application to the reconstruction of sparse signals is also very promising.
Data Availability
No datasets were generated or analysed during the current study.
References
Abubakar, A., Ibrahim, A., Abdullahi, M., Aphane, M., Chen, J.: A sufficient descent LS-PRP-BFGS-like method for solving nonlinear monotone equations with application to image restoration. Numer. Algorithms 1–42 (2023)
Abubakar, A.B., Kumam, P.: A descent Dai-Liao conjugate gradient method for nonlinear equations. Numer. Algorithms (2018)
Abubakar, A.B., Kumam, P., Malik, M., Chaipunya, P., Ibrahim, A.H.: A hybrid FR-DY conjugate gradient algorithm for unconstrained optimization with application in portfolio selection. AIMS Math. 6(6), 6506–6527 (2021)
Abubakar, A.B., Kumam, P., Malik, M., Ibrahim, A.H.: A hybrid conjugate gradient based approach for solving unconstrained optimization and motion control problems. Math. Comput. Simul. 201, 640–657 (2022)
Ahookhosh, M., Amini, K., Bahrami, S.: Two derivative-free projection approaches for systems of large-scale nonlinear monotone equations. Numer. Algorithms 64, 21–42 (2013)
Al-Baali, M.: Descent property and global convergence of the Fletcher-Reeves method with inexact line search. IMA J. Numer. Anal. 5, 121–124 (1985)
Amini, K., Faramarzi, P.: Global convergence of a modified spectral three-term CG algorithm for nonconvex unconstrained optimization problems. J. Comput. Appl. Math. 417, 114630 (2023)
Cheng, W.: A two-term PRP-based descent method. Numer. Funct. Anal. Optim. 28, 1217–1230 (2007)
Cheng, W.: A PRP type method for systems of monotone equations. Math. Comput. Model. 50, 15–20 (2009)
Dai, Y.H., Yuan, Y.: Convergence properties of the Fletcher-Reeves method. IMA J. Numer. Anal. 16, 155–164 (1996)
Dai, Y.H., Yuan, Y.: Convergence of the Fletcher-Reeves method under a generalized Wolfe search. J. Comput. Math. 2, 142–148 (1996)
Dai, Y.H., Yuan, Y.: A nonlinear conjugate gradient method with a strong global convergence property. SIAM J. Optim. 10, 177–182 (1999)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program., Ser. A 91, 201–213 (2002)
Fletcher, R.: Practical Methods of Optimization. Wiley, Chichester (1987)
Fletcher, R., Reeves, C.: Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964)
Hager, W.W., Zhang, H.: A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM J. Optim. 16, 170–192 (2005)
Hager, W.W., Zhang, H.: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2, 35–58 (2006)
Hu, Q., Zhang, H., Zhou, Z., Chen, Y.: A class of improved conjugate gradient methods for nonconvex unconstrained optimization. Numer. Linear Algebra Appl. 30(4), e2482 (2023)
Hu, Y.F., Storey, C.: Global convergence result for conjugate gradient methods. J. Optim. Theory Appl. 71, 399–405 (1991)
Ibrahim, A.H., Alshahrani, M., Al-Homidan, S.: Two classes of spectral three-term derivative-free method for solving nonlinear equations with application. Numer. Algorithms 1–21 (2023)
Ibrahim, A.H., Kimiaei, M., Kumam, P.: A new black box method for monotone nonlinear equations. Optimization 72(5), 1119–1137 (2023)
Ibrahim, A.H., Kumam, P., Abubakar, A.B., Abubakar, J.: A derivative-free projection method for nonlinear equations with non-Lipschitz operator: application to LASSO problem. Math. Methods Appl. Sci. 46(8), 9006–9027 (2023)
Ibrahim, A.H., Kumam, P., Kamandi, A., Abubakar, A.B.: An efficient hybrid conjugate gradient method for unconstrained optimization. Optim. Methods Softw. 37(4), 1370–1383 (2022)
Ivanov, B., Milovanović, G.V., Stanimirović, P.S.: Accelerated Dai-Liao projection method for solving systems of monotone nonlinear equations with application to image deblurring. J. Glob. Optim. 85(2), 377–420 (2023)
Jiang, X., Yang, H., Jian, J., Wu, X.: Two families of hybrid conjugate gradient methods with restart procedures and their applications. Optim. Methods Softw. 1–28 (2023)
Jiang, X.Z., Zhu, Y.H., Jian, J.B.: Two efficient nonlinear conjugate gradient methods with restart procedures and their applications in image restoration. Nonlinear Dyn. 111(6), 5469–5498 (2023)
Kaelo, P., Koorapetse, M., Sam, C.R.: A globally convergent derivative-free projection method for nonlinear monotone equations with applications. Bull. Malays. Math. Sci. Soc. 44, 4335–4356 (2021)
Kim, S.J., Koh, K., Lustig, M., Boyd, S., Gorinevsky, D., et al.: A method for large-scale l1-regularized least squares. IEEE J. Sel. Top. Signal Process. 1(4), 606–617 (2007)
Kimiaei, M., Hassan Ibrahim, A., Ghaderi, S.: A subspace inertial method for derivative-free nonlinear monotone equations. Optimization 1–28 (2023)
Li, Q., Li, D.H.: A class of derivative-free methods for large-scale nonlinear monotone equations. IMA J. Numer. Anal. 31, 1625–1635 (2011)
Liu, G.H., Han, J.Y., Yin, H.X.: Global convergence of the Fletcher-Reeves algorithm with an inexact line search. Appl. Math. J. Chin. Univ. Ser. B 10, 75–82 (1995)
Liu, J., Li, S.J.: A projection method for convex constrained monotone nonlinear equations with applications. Comput. Math. Appl. 70(10), 2442–2453 (2015)
Liu, P., Wu, X., Shao, H., Zhang, Y., Cao, S.: Three adaptive hybrid derivative-free projection methods for constrained monotone nonlinear equations and their applications. Numer. Linear Algebra Appl. 30(2), e2471 (2023)
Liu, Y., Storey, C.: Efficient generalized conjugate gradient algorithms. Part 1: Theory. J. Optim. Theory Appl. 69, 129–137 (1991)
Ma, G., Liu, L., Jian, J., Yan, X.: A new hybrid CGPM-based algorithm for constrained nonlinear monotone equations with applications. J. Appl. Math. Comput. 70(1), 103–147 (2024)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)
Papp, Z., Rapajić, S.: FR type methods for systems of large-scale nonlinear monotone equations. Appl. Math. Comput. 269, 816–823 (2015)
Polak, E., Ribière, G.: Note sur la convergence de directions conjuguées. Rev. Française Informat. Recherche Opérationnelle 16, 35–43 (1969)
Polyak, B.T.: The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 9, 94–112 (1969)
Powell, M.J.D.: Restart procedures for the conjugate gradient method. Math. Program. 12, 241–254 (1977)
Solodov, M.V., Svaiter, B.F.: A globally convergent inexact Newton method for systems of monotone equations. In: Fukushima, M., Qi, L. (eds.) Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, pp. 355–369. Kluwer Academic, Dordrecht (1998)
Wang, X.: A class of spectral three-term descent Hestenes-Stiefel conjugate gradient algorithms for large-scale unconstrained optimization and image restoration problems. Appl. Numer. Math. (2023)
Waziri, M.Y., Ahmed, K.: Two descent Dai-Yuan conjugate gradient methods for systems of monotone nonlinear equations. J. Sci. Comput. 90, 1–53 (2022)
Wu, X., Shao, H., Liu, P., Zhang, Y., Zhuo, Y.: An efficient conjugate gradient-based algorithm for unconstrained optimization and its projection extension to large-scale constrained nonlinear equations with applications in signal recovery and image denoising problems. J. Comput. Appl. Math. 422, 114879 (2023)
Xiao, Y., Zhu, H.: A conjugate gradient method to solve convex constrained monotone equations with applications in compressive sensing. J. Math. Anal. Appl. 405(1), 310–319 (2013)
Yan, Q.R., Peng, X.Z., Li, D.H.: A globally convergent derivative-free method for solving large-scale nonlinear monotone equations. J. Comput. Appl. Math. 234, 649–657 (2010)
Zhang, L., Zhou, W.: Spectral gradient projection method for solving nonlinear monotone equations. J. Comput. Appl. Math. 196, 478–484 (2006)
Zhao, Y.B., Li, D.: Monotonicity of fixed point and normal mapping associated with variational inequality and its application. SIAM J. Optim. 4, 962–973 (2001)
Zhou, W.J., Li, D.H.: Limited memory BFGS method for nonlinear monotone equations. J. Comput. Math. 25, 89–96 (2007)
Funding
This research is supported by the Science Fund of the Republic of Serbia, Grant No. 7359, project title LASCADO. Supak Phiangsungnoen acknowledges Rajamangala University of Technology Rattanakosin and the financial support provided by Thailand Science Research and Innovation (TSRI) and the Fundamental Fund of Rajamangala University of Technology Rattanakosin under contract No. FRB6719/2567, and by the NSRF via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation [grant number B41G670027].
Author information
Contributions
The manuscript was written jointly by Z.P., S.R., A.H.I. and S.P. More precisely, Z.P., S.R. and S.P. formulated the problem and carried out the convergence analysis. The main manuscript text was written by Z.P., S.R. and S.P. A.H.I. implemented the numerical experiments for the large-scale systems. S.P. implemented the signal reconstruction experiment. All authors reviewed the manuscript.
Ethics declarations
Ethics approval and consent to participate
This article does not contain any studies with human participants or animals performed by any of the authors.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Papp, Z., Rapajić, S., Ibrahim, A.H. et al. Hybrid Hu-Storey type methods for large-scale nonlinear monotone systems and signal recovery. J Inequal Appl 2024, 110 (2024). https://doi.org/10.1186/s13660-024-03187-1