Exact recovery of sparse multiple measurement vectors by \(l_{2,p}\)-minimization
- Changlong Wang^{1} and
- Jigen Peng^{1, 2}
https://doi.org/10.1186/s13660-017-1601-y
© The Author(s) 2018
- Received: 11 December 2017
- Accepted: 28 December 2017
- Published: 10 January 2018
Abstract
The main contribution of this paper is two theoretical results about this technique. The first proves that for every multiple system of linear equations there exists a constant \(p^{\ast}\) such that the original unique sparse solution can also be recovered by minimization in the \(l_{p}\) quasi-norm over matrices whenever \(0< p<p^{\ast}\). The second gives an analytic expression for such a \(p^{\ast}\). Finally, we display the results of one example to confirm the validity of our conclusions, and we use numerical experiments to show that our results increase the efficiency of algorithms designed for \(l_{2,p}\)-minimization.
Keywords
- sparse recovery
- multiple measurement vectors
- joint sparse recovery
- \(l_{2,p}\)-minimization
MSC
- 15A06
- 15A15
- 15A48
1 Introduction
In this paper, we define the support of X by \(\operatorname {support}(X)=S=\{i: \Vert X_{\text{row } i} \Vert _{2} \neq 0\}\) and say that a solution X is k-sparse when \(\vert S \vert \leq k\). We say that X can be recovered by \(l_{2,0}\)-minimization if X is the unique solution of the \(l_{2,0}\)-minimization problem.
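The row support and the \(l_{2,0}\) "norm" defined above can be computed directly; the following Python sketch (function names are illustrative, not from the paper) makes the definitions concrete:

```python
import numpy as np

def row_support(X):
    """support(X) = indices of rows of X with nonzero Euclidean norm."""
    return {i for i in range(X.shape[0]) if np.linalg.norm(X[i]) != 0}

def l20_norm(X):
    """||X||_{2,0}: the number of rows with nonzero l_2 norm."""
    return len(row_support(X))

X = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [3.0, 0.0],
              [0.0, 0.0]])
print(row_support(X))  # {0, 2}
print(l20_norm(X))     # 2, so X is k-sparse for any k >= 2
```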
It needs to be emphasized that we cannot regard the solution of a multiple measurement vector (MMV) problem as a combination of several solutions of single measurement vector problems; i.e., the solution matrix X of \(l_{2,0}\)-minimization is not always composed of solution vectors of \(l_{0}\)-minimization.
Example 1
In [7], \(l_{0}\)-minimization was proved to be NP-hard because of the discrete and discontinuous nature of \(\Vert x \Vert _{0}\); hence \(l_{2,0}\)-minimization is NP-hard as well. Since \(\Vert X \Vert _{2,0}=\lim_{p \to0} \Vert X \Vert _{2,p}^{p}\), it is natural to consider \(l_{2,p}\)-minimization as a surrogate for the NP-hard \(l_{2,0}\)-minimization.
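The limit relation \(\Vert X \Vert _{2,0}=\lim_{p \to0} \Vert X \Vert _{2,p}^{p}\) can be observed numerically, since \(\Vert X \Vert _{2,p}^{p}=\sum_{i} \Vert X_{\text{row } i} \Vert _{2}^{p}\) and each nonzero row contributes a term tending to 1. A small sketch (the function name is illustrative):

```python
import numpy as np

def l2p_norm_p(X, p):
    # ||X||_{2,p}^p = sum_i ||X_row i||_2^p  (a quasi-norm for 0 < p < 1)
    return sum(np.linalg.norm(X[i]) ** p for i in range(X.shape[0]))

X = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 0.0]])
# X has two nonzero rows, so ||X||_{2,0} = 2
for p in [1.0, 0.5, 0.1, 0.01, 0.001]:
    print(p, l2p_norm_p(X, p))  # tends to 2 as p -> 0
```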
1.1 Related work
Many researchers have made contributions concerning the existence, uniqueness, and other properties of solutions to \(l_{2,p}\)-minimization (see [8–11]). Eldar [12] gives a sufficient condition for MMV when \(p=1\), and Unser [13] analyses some properties of the solution to \(l_{2,p}\)-minimization when \(p=1\). Foucart and Gribonval [9] studied the MMV setting when \(r=2\) and \(p=1\); they give a sufficient and necessary condition for a k-sparse matrix X to be recovered by \(l_{2,p}\)-minimization. Furthermore, Lai and Liu [10] consider the MMV setting when \(r \geq2\) and \(p\in[0,1]\); they improve the condition in [9] and give a sufficient and necessary condition when \(r\geq2\).
On the other hand, numerous algorithms have been proposed and studied for \(l_{2,0}\)-minimization (e.g., [14, 15]). Orthogonal matching pursuit (OMP) algorithms have been extended to the MMV problem [16], and convex optimization formulations with mixed norms extend the corresponding SMV solutions [17]. Hyder [15] provides a robust algorithm for joint sparse recovery, which shows a clear improvement in both noiseless and noisy environments. Furthermore, there exists a lot of excellent work (see [18–22]) presenting algorithms designed for \(l_{2,p}\)-minimization. However, it remains an important theoretical problem whether there exists a general equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization.
In the case \(r=1\), Peng [23] has given a definite answer to this theoretical problem: there exists a constant \(p(A,b)>0\) such that every solution to \(l_{p}\)-minimization is also a solution to \(l_{0}\)-minimization whenever \(0< p< p(A,b)\).
However, Peng proves the conclusion only for \(r=1\), so it is important to extend this conclusion to the MMV problem. Furthermore, Peng proves only the existence of such a p; he does not give a computable expression for it. Therefore, the main purpose of this paper is not only to prove the equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization, but also to present an analytic expression for such a p in Sections 2 and 3.
1.2 Main contribution
In this paper, we focus on the equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization. Furthermore, an analytic expression of such a \(p^{*}\) is needed in applications, especially in designing algorithms for \(l_{2,p}\)-minimization.
In brief, this paper answers two problems that urgently need to be solved:
(I). We prove that there exists a constant \(p^{*}\) such that every k-sparse X that can be recovered by \(l_{2,0}\)-minimization can also be recovered by \(l_{2,p}\)-minimization whenever \(0< p< p^{*}\).
(II). We give an analytic expression for such a \(p^{*}\) based on the restricted isometry property (RIP).
Our paper is organized as follows. In Section 2, we present some preliminaries which play a core role in the proof of our main theorem and prove the equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization. In Section 3 we focus on proving another main result of this paper. There we present an analytic expression of such \(p^{*}\). Finally, we summarize our findings in the last section.
1.3 Notation
For convenience, for \(x \in\mathbb {R}^{n}\), we define its support by \(\operatorname{support} (x)=\{i:x_{i} \neq0\}\) and the cardinality of a set S by \(\vert S \vert \). Let \(\operatorname{Ker}(A)=\{x \in\mathbb {R}^{n}:Ax=0\}\) be the null space of matrix A. We use the subscript notation \(x_{S}\) to denote the vector that is equal to x on the index set S and zero everywhere else, and the subscript notation \(X_{S}\) to denote the matrix whose rows are those rows of X indexed by S and zero everywhere else. Let \(X_{\text{col } i}\) be the ith column of X, and let \(X_{\text{row } i}\) be the ith row of X, i.e., \(X=[X_{\text{col } 1},X_{\text{col } 2},\dots, X_{\text{col } r}]=[X_{\text{row } 1},X_{\text{row } 2},\dots, X_{\text{row } n}]^{T}\) for \(X\in\mathbb{R}^{n \times r}\). We use \(\langle A,B\rangle=\operatorname{tr}(A^{T}B)\) and \(\Vert A \Vert _{F}= (\sum_{i,j} \vert a_{ij} \vert ^{2} )^{1/2}\).
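The notation above can be illustrated with a short Python sketch (helper names are illustrative): the row restriction \(X_{S}\), the inner product \(\langle A,B\rangle=\operatorname{tr}(A^{T}B)\), and the Frobenius norm.

```python
import numpy as np

def restrict_rows(X, S):
    """X_S: keep the rows indexed by S, zero everywhere else."""
    XS = np.zeros_like(X)
    idx = list(S)
    XS[idx] = X[idx]
    return XS

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
inner = np.trace(A.T @ B)       # <A, B> = tr(A^T B)
fro = np.sqrt((A ** 2).sum())   # ||A||_F = (sum_{i,j} a_ij^2)^{1/2}
assert np.isclose(fro, np.linalg.norm(A, 'fro'))
print(inner)                    # 5.0
```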
2 Equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization
At the beginning of this section, we introduce a very important property of the measurement matrix A.
Definition 1
([24])
Next, we introduce another important concept, the M-null space constant (M-NSC); this concept is the key to proving the equivalence relationship between \(l_{2,0}\)-minimization and \(l_{2,p}\)-minimization.
Definition 2
The M-NSC provides a sufficient and necessary condition for a matrix to be the solution to \(l_{2,0}\)-minimization and \(l_{2,p}\)-minimization, and it is important for proving the equivalence relationship between these two models. Furthermore, we emphasize a few important properties of \(h(p,A,r,k)\).
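A Monte Carlo lower bound on the M-NSC can be sketched in Python. This is only an illustration under an assumption: we take \(\theta(p,X,S)= \Vert X_{S} \Vert _{2,p}^{p}/ \Vert X_{S^{C}} \Vert _{2,p}^{p}\), the standard null-space-constant form consistent with the proofs below; the function names are ours, not the paper's.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of Ker(A) via the SVD."""
    _, s, Vt = np.linalg.svd(A)
    rank = int((s > tol).sum())
    return Vt[rank:].T

def l2p_p(X, p):
    # ||X||_{2,p}^p = sum_i ||X_row i||_2^p
    return sum(np.linalg.norm(row) ** p for row in X)

def mnsc_lower_bound(A, r, k, p, trials=200, seed=0):
    """Monte Carlo LOWER bound on h(p, A, r, k), assuming
    theta(p, X, S) = ||X_S||_{2,p}^p / ||X_{S^C}||_{2,p}^p."""
    rng = np.random.default_rng(seed)
    N = null_space_basis(A)
    n = A.shape[1]
    best = 0.0
    for _ in range(trials):
        # columns of X lie in Ker(A), i.e., AX = 0
        X = N @ rng.standard_normal((N.shape[1], r))
        norms = np.linalg.norm(X, axis=1)
        S = np.argsort(norms)[-k:]        # the k largest rows maximize the ratio
        Sc = np.setdiff1d(np.arange(n), S)
        denom = l2p_p(X[Sc], p)
        if denom > 0:
            best = max(best, l2p_p(X[S], p) / denom)
    return best

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])   # Ker(A) is spanned by (1, 1, 1)
print(mnsc_lower_bound(A, r=2, k=1, p=0.5))
```

For this A every null-space matrix has three rows of equal norm, so the ratio is always 1/2 regardless of p, which gives a simple sanity check on the sketch.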
Proposition 1
For a given matrix A, the M-NSC \(h(p,A,r,k)\) defined in Definition 2 is nondecreasing in \(p \in[0,1]\).
Proof
The proof is divided into two steps.
Step 1: To prove \(h(p,A,r,k) \leq h(1,A,r,k)\) for any \(p\in[0,1]\).
For any \(X \in(N(A))^{r} \setminus \{(\textbf{0},\textbf{0},\dots,\textbf{0})\}\), without loss of generality, we assume that \(\Vert X_{\text{row } 1} \Vert _{2} \geq \Vert X_{\text{row } 2} \Vert _{2} \geq\cdots\geq \Vert X_{\text{row } n} \Vert _{2} \).
Because \(h(p,A,r,k)=\max_{ \vert S \vert \leq k} \sup_{X \in(N(A))^{r} \setminus \{(\textbf{0},\textbf{0},\dots,\textbf{0})\}} \theta (p,X,S)\), we get that \(h(p,A,r,k) \leq h(1,A,r,k)\).
Step 2: To prove \(h(pq,A,r,k) \leq h(p,A,r,k)\) for any \(p\in[0,1]\) and \(q\in(0,1)\).
Therefore, we can get that \(\theta(pq,X,k)\leq\theta(p,X,k)\); in other words, \(\theta(p_{1},X,k)\leq\theta(p_{2},X,k)\) as long as \(p_{1} \leq p_{2}\).
Because \(h(p,A,r,k)=\max_{ \vert S \vert \leq k} \sup_{X \in(N(A))^{r} \setminus \{(\textbf{0},\textbf{0},\dots,\textbf{0})\}} \theta(p,X,S)\), we get that \(h(p,A,r,k)\) is nondecreasing in \(p \in[0,1]\).
The proof is completed. □
Proposition 2
For a given matrix A, the M-NSC \(h(p,A,r,k)\) defined in Definition 2 is a continuous function of \(p \in[0,1]\).
Proof
As has been proved in Proposition 1, \(h(p,A,r,k)\) is nondecreasing in \(p \in[0,1]\), so any discontinuity of \(h(p,A,r,k)\) would have to be a jump discontinuity. Therefore, it is enough to prove that \(h(p,A,r,k)\) has no jump discontinuities.
For convenience, we still use \(\theta(p,X,S)\) which is defined in the proof of Proposition 1, and the following proof is divided into three steps.
Step 1. To prove that there exist \(X\in(N(A))^{r}\) and a set \(S \subset \{1,2,\dots, n\}\) such that \(\theta(p,X,S)=h(p,A,r,k)\).
Let \(V=\{X\in((N(A))^{r}): \Vert X_{\text{row } i} \Vert _{2}=1, i=1,2,\dots, n\}\); it is easy to get that \(h(p,A,r,k)=\max_{ \vert S \vert \leq k} \sup_{X \in V} \theta(p,X,S)\).
It needs to be pointed out that the choice of the set \(S\subset \{1,2,\dots, n\}\) with \(\vert S \vert \leq k\) is finite, so there exists a set \(S^{\prime}\) with \(\vert S^{\prime} \vert \leq k\) such that \(h(p,A,r,k)=\sup_{X \in V} \theta(p,X,S^{\prime})\).
On the other hand, \(\theta(p,X,S^{\prime})\) is obviously continuous in X on V. Because of the compactness of V, there exists \(X^{\prime}\in V\) such that \(h(p,A,r,k)=\theta(p,X^{\prime},S^{\prime})\).
Step 2. To prove that \(\lim_{p \to p_{0}^{-}} h(p,A,r,k)=h(p_{0},A,r,k)\) for any \(p_{0}\in(0,1]\).
According to the proof in Step 1, there exist \(X^{\prime}\in(N(A))^{r}\) and \(S^{\prime} \subset\{1,2,\dots, n\}\) such that \(h(p_{0},A,r,k)=\theta (p_{0},X^{\prime},S^{\prime})\). It is easy to get that \(\lim_{p \to p_{0}^{-}} \theta(p,X^{\prime},S^{\prime})=\theta(p_{0},X^{\prime},S^{\prime})=h(p_{0},A,r,k)\).
Therefore, we have that \(\lim_{p \to p_{0}^{-}} h(p,A,r,k)=h(p_{0},A,r,k)\).
Step 3. To prove that \(\lim_{p \to p_{0}^{+}} h(p,A,r,k)=h(p_{0},A,r,k)\) for any \(p_{0}\in[0,1)\).
We consider a sequence \(\{p_{n}\}\) with \(p_{0} \leq p_{n}<1\) and \(p_{n} \to p_{0}^{+}\).
According to Step 1, there exist \(X_{n} \in V\) and \(\vert S_{n} \vert \leq k\) such that \(h(p_{n},A,r,k)=\theta(p_{n},X_{n},S_{n})\). Since the choice of \(S\subset\{1,2,\dots, n\}\) with \(\vert S \vert \leq k\) is finite, there exist a subsequence \(\{p_{n_{i}}\}\) of \(\{ p_{n}\}\), a subsequence \(\{X_{n_{i}}\}\) of \(\{X_{n}\}\), and a set \(S^{\prime}\) such that \(\theta (p_{n_{i}},X_{n_{i}},S^{\prime})=h(p_{n_{i}},A,r,k)\).
Furthermore, since \(X_{n} \in V\), it is easy to get a subsequence of \(X_{n_{i}}\) which is convergent. Without loss of generality, we assume that \(X_{n_{i}} \to X^{\prime}\).
Therefore, we can get that \(h(p_{n_{i}},A,r,k)=\theta (p_{n_{i}},X_{n_{i}},S^{\prime}) \to \theta(p_{0},X^{\prime},S^{\prime})\).
According to the definition of \(h(p_{0},A,r,k)\), we get that \(\theta (p_{0},X^{\prime},S^{\prime}) \leq h(p_{0},A,r,k)\); together with the monotonicity of h, this yields \(\lim_{p \to p_{0}^{+}} h(p,A,r,k)=h(p_{0},A,r,k)\).
Combining Step 2 and Step 3, we see that \(h(p,A,r,k)\) has no jump discontinuities.
The proof is completed. □
Theorem 1
If every k-sparse matrix X can be recovered by \(l_{2,0}\)-minimization, then there exists a constant \(p^{*}(A,B,r)\) such that X also can be recovered by \(l_{2,p}\)-minimization whenever \(0< p< p^{*}(A,B,r)\).
Proof
First of all, we will prove that \(h(0,A,r,k)<1\) under the assumption. If \(h(0,A,r,k)\geq1\) for some fixed r and k, then there exists \(X\in(N(A))^{r}\) such that \(\Vert X_{S} \Vert _{2,0}\geq \Vert X_{S^{C}} \Vert _{2,0}\) for a certain set \(S\subset\{ 1,2,\ldots,n\}\) with \(\vert S \vert \leq k\). Let \(B=AX_{S}\). Since \(AX=0\), we have \(A(-X_{S^{C}})=AX_{S}=B\), so \(-X_{S^{C}}\) is a solution of \(AZ=B\) that is at least as sparse as \(X_{S}\), which contradicts the assumption.
By Propositions 1 and 2, since \(h(p,A,r,k)\) is continuous and nondecreasing at the point \(p=0\), there exist a constant \(p^{*}(A,B,r)\) and a small enough number \(\delta>0\) such that \(h(0,A,r,k)\leq h(p,A,r,k)\leq h(0,A,r,k)+\delta< 1\) for any \(p \in (0,p^{*}(A,B,r))\).
3 An analytic expression of such \(p^{*}\)
In Section 2, we proved that there exists a constant \(p^{*}(A,B,r)\) such that \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization have the same solution. However, it is also important to give an analytic expression of \(p^{*}(A,B,r)\). In this section, we focus on giving an analytic expression of an upper bound of \(h(p,A,r,k)\); the equivalence relationship between \(l_{2,p}\)-minimization and \(l_{2,0}\)-minimization holds as long as \(h(p,A,r,k)<1\) is satisfied. To reach this goal, we postpone our main theorems and begin with some lemmas.
Lemma 1
Proof
For any \(X \in\mathbb{R}^{n \times r}\), without loss of generality, we assume that \(\Vert X_{\text{row } i} \Vert _{2} =0\) for \(i \in\{ \Vert X \Vert _{2,0}+1,\ldots,n\}\).
Lemma 2
([25])
Lemma 3
For \(p\in(0,1]\), we have that \((\frac{p}{2} )^{\frac{1}{2}} (\frac{1}{2-p} )^{\frac{1}{2}-\frac {1}{p}}\geq\frac{\sqrt{2}}{2}\).
Proof
The proof is completed. □
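The inequality in Lemma 3 can be checked numerically over a grid of \(p\in(0,1]\); the sketch below is a verification aid, not part of the proof. Note that equality holds at \(p=1\).

```python
import numpy as np

def f(p):
    # (p/2)^{1/2} * (1/(2-p))^{1/2 - 1/p}, as in Lemma 3
    return np.sqrt(p / 2) * (1.0 / (2.0 - p)) ** (0.5 - 1.0 / p)

ps = np.linspace(0.01, 1.0, 1000)
assert all(f(p) >= np.sqrt(2) / 2 - 1e-12 for p in ps)
print(f(1.0))  # equals sqrt(2)/2 at p = 1
```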
Lemma 4
For any \(0< p\leq1\), we have that \((1-\frac{p}{2} )^{\frac {1}{p}-\frac{1}{2}}\leq (\frac{\sqrt{2}}{2}-e^{-\frac {1}{2}} )p+e^{-\frac{1}{2}}\).
Proof
Because \(\lim_{p\rightarrow0} \varphi(p)=e^{-\frac{1}{2}}\) and \(\varphi(1)=\frac{\sqrt{2}}{2}\), and since \(\varphi(p)\) is an increasing function, we get that \(\varphi(p)\leq\frac{\sqrt{2}}{2}\) for \(0< p\leq1\).
Furthermore, it is easy to get that \(\frac{1}{p(2-p)}+\frac {1}{2p}=\frac{24-18p+3p^{2}}{6p(2-p)^{2}}\) and \(g(p)\geq0\) since \(4>3p\).
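Both the inequality of Lemma 4 and the algebraic identity used in its proof can be verified numerically; the sketch below is a check, not part of the argument.

```python
import numpy as np

def lhs(p):
    # (1 - p/2)^{1/p - 1/2}
    return (1.0 - p / 2.0) ** (1.0 / p - 0.5)

def rhs(p):
    # (sqrt(2)/2 - e^{-1/2}) p + e^{-1/2}
    return (np.sqrt(2) / 2 - np.exp(-0.5)) * p + np.exp(-0.5)

ps = np.linspace(0.001, 1.0, 1000)
assert all(lhs(p) <= rhs(p) + 1e-12 for p in ps)

# the identity 1/(p(2-p)) + 1/(2p) = (24 - 18p + 3p^2) / (6p(2-p)^2)
for p in ps:
    left = 1.0 / (p * (2.0 - p)) + 1.0 / (2.0 * p)
    right = (24.0 - 18.0 * p + 3.0 * p ** 2) / (6.0 * p * (2.0 - p) ** 2)
    assert np.isclose(left, right)
```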
Lemma 5
Proof
Now, we present another main contribution of this paper.
Theorem 2
Proof
Theorem 3
Proof
The proof is completed. □
Now, we present one example to demonstrate the validity of our main contribution in this paper.
Example 2
It is obvious that \(\Vert X \Vert _{2,p}\) attains its minimum at \(s=t=0\), which is the original solution to \(l_{2,0}\)-minimization.
4 Numerical experiment
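A typical experiment for \(l_{2,p}\)-minimization in the MMV setting can be sketched with an iteratively reweighted least squares (IRLS) scheme of M-FOCUSS type. This is a minimal sketch under our own choices of sizes, seed, and annealing schedule; it is not the authors' experiment code, and the function name is illustrative.

```python
import numpy as np

def irls_l2p(A, B, p=0.5, iters=100, eps0=1.0):
    """IRLS sketch for min ||X||_{2,p}^p subject to AX = B.
    Each step solves a weighted least-norm problem in closed form."""
    X = np.linalg.lstsq(A, B, rcond=None)[0]   # least-norm initialization
    eps = eps0
    for _ in range(iters):
        # row weights w_i = (||X_row i||^2 + eps)^{p/2 - 1}
        w = (np.linalg.norm(X, axis=1) ** 2 + eps) ** (p / 2 - 1)
        D = np.diag(1.0 / w)                   # inverse-weight metric
        X = D @ A.T @ np.linalg.solve(A @ D @ A.T, B)
        eps = max(eps * 0.5, 1e-12)            # anneal the smoothing parameter
    return X

rng = np.random.default_rng(0)
m, n, r, k = 20, 40, 2, 3
A = rng.standard_normal((m, n))
X_true = np.zeros((n, r))
rows = rng.choice(n, size=k, replace=False)
X_true[rows] = rng.standard_normal((k, r))
B = A @ X_true
X_hat = irls_l2p(A, B, p=0.5)
print(np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))  # relative error
```

In this over-measured regime (k = 3 nonzero rows, m = 20 measurements) the scheme typically recovers the row-sparse X_true to high accuracy, which is consistent with the recovery guarantees discussed above.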
5 Conclusion
In this paper we have studied the equivalence relationship between \(l_{2,0}\)-minimization and \(l_{2,p}\)-minimization, and we have given an analytic expression of such \(p^{\ast}\).
Furthermore, it needs to be pointed out that the conclusions in Theorems 2 and 3 remain valid in the single measurement vector setting, i.e., \(l_{p}\)-minimization also recovers the original unique solution to \(l_{0}\)-minimization when \(0< p< p^{\ast}\).
However, the analytic expression of such \(p^{\ast}\) in Theorem 3 may not be the optimal result. In this paper, we have considered all underdetermined matrices \(A\in\mathbb{R}^{m \times n}\) and \(B\in\mathbb{R}^{m \times r}\) from a theoretical point of view, so the result can be improved for matrices A and B with a particular structure. The authors believe that an answer to this problem would be an important improvement for applications of \(l_{2,p}\)-minimization. In conclusion, the authors hope that this paper will serve as a modest starting point that attracts more valuable contributions.
Declarations
Acknowledgements
The work was supported by the National Natural Science Foundation of China under grant nos. 11771347, 91730306, and 41390454.
Authors’ contributions
All authors contributed equally to this work. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
- Olshausen, BA, Field, DJ: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583), 607 (1996)
- Candès, EJ, Recht, B: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717 (2009)
- Malioutov, D, Cetin, M, Willsky, AS: A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53(8), 3010-3022 (2005)
- Wright, J, Ganesh, A, Zhou, Z, Wanger, A, Ma, Y: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210 (2009)
- Cotter, SF, Rao, BD: Sparse channel estimation via matching pursuit with application to equalization. IEEE Trans. Commun. 50(3), 374-377 (2002)
- Fevrier, IJ, Gelfand, SB, Fitz, MP: Reduced complexity decision feedback equalization for multipath channels with large delay spreads. IEEE Trans. Commun. 47(6), 927-937 (1999)
- Natarajan, BK: Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2), 227-234 (1995)
- Liao, A, Yang, X, Xie, J, Lei, Y: Analysis of convergence for the alternating direction method applied to joint sparse recovery. Appl. Math. Comput. 269, 548-557 (2015)
- Foucart, S, Gribonval, R: Real versus complex null space properties for sparse vector recovery. C. R. Math. 348(15), 863-865 (2010)
- Lai, MJ, Liu, Y: The null space property for sparse recovery from multiple measurement vectors. Appl. Comput. Harmon. Anal. 30(3), 402-406 (2011)
- Van Den Berg, E, Friedlander, MP: Theoretical and empirical results for recovery from multiple measurements. IEEE Trans. Inf. Theory 56(5), 2516-2527 (2010)
- Eldar, YC, Michaeli, T: Beyond bandlimited sampling. IEEE Signal Process. Mag. 26(3), 48-68 (2009)
- Unser, M: Sampling 50 years after Shannon. Proc. IEEE 88(4), 569-587 (2000)
- Hyder, MM, Mahata, K: Direction-of-arrival estimation using a mixed \(l_{2,0}\) norm approximation. IEEE Trans. Signal Process. 58(9), 4646-4655 (2010)
- Hyder, MM, Mahata, K: A robust algorithm for joint-sparse recovery. IEEE Signal Process. Lett. 16(12), 1091-1094 (2009)
- Tropp, JA, Gilbert, AC: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inf. Theory 53(12), 4655-4666 (2007)
- Milzarek, A, Ulbrich, M: A semismooth Newton method with multidimensional filter globalization for \(l_{1}\)-optimization. SIAM J. Optim. 24(1), 298-333 (2014)
- Wang, L, Chen, S, Wang, Y: A unified algorithm for mixed \(l_{2,p}\)-minimizations and its application in feature selection. Comput. Optim. Appl. 58(2), 409-421 (2014)
- Kowalski, M: Sparse regression using mixed norms. Appl. Comput. Harmon. Anal. 27(3), 303-324 (2009)
- Zhao, M, Zhang, H, Cheng, W, Zhang, Z: Joint \(l_{p}\) and \(l_{2,p}\)-norm minimization for subspace clustering with outlier pursuit. In: International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3658-3665 (2016)
- Van Den Berg, E, Friedlander, MP: Joint-sparse recovery from multiple measurements. Mathematics (2009)
- Cotter, SF, Rao, BD, Engan, K, Kreutz-Delgado, K: Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans. Signal Process. 53(7), 2477-2488 (2005)
- Peng, J, Yue, S, Li, H: NP/CMP equivalence: a phenomenon hidden among sparsity models, \(l_{0}\)-minimization and \(l_{p}\)-minimization for information processing. IEEE Trans. Inf. Theory 61(7), 4028-4033 (2015)
- Candès, EJ, Tao, T: Decoding by linear programming. IEEE Trans. Inf. Theory 51(12), 4203-4215 (2005)
- Foucart, S: Sparse recovery algorithms: sufficient conditions in terms of restricted isometry constants. Springer Proc. Math. Stat. 13, 65-77 (2012)