Toeplitz matrix completion via a low-rank approximation algorithm

In this paper, we propose a low-rank matrix approximation algorithm for solving the Toeplitz matrix completion (TMC) problem. The approximation matrix was obtained by the mean projection operator on the set of feasible Toeplitz matrices for every iteration step. Thus, the sequence of the feasible Toeplitz matrices generated by iteration is of Toeplitz structure throughout the process, which reduces the computational time of the singular value decomposition (SVD) and approximates well the solution. On the theoretical side, we provide a convergence analysis to show that the matrix sequences of iterates converge. On the practical side, we report the numerical results to show that the new algorithm is more effective than the other algorithms for the TMC problem.


Introduction
The problem of completing a low-rank matrix, which is to recover a matrix from the partial entries, has became more and more popular. The problem arises in many areas of engineering and applied science such as machine learning [1,2], control [20], image inpainting [4], and so on. And the problem is mathematically written as follows: min X∈R m×n rank(X) s.t. P Ω (X) = P Ω (D), (1.1) where the objective function rank(X) denotes the rank of X, Ω is a subset of indices for the q known entries, D ∈ R m×n is the unknown matrix to be reconstructed and the one that has available q sampled entries, P Ω (X) is the sampling projector which acquires only the entries indexed by Ω.
When the rank of X is known in advance or can be estimated [17,18], the matrix completion problem (1.1) is equivalent to the following non-convex optimization problem on the manifold: where X r represents the smooth r-dimensional manifold set of rank r matrix. It is clear that problem (1.2) is a smooth Riemannian optimization since the objective function is also smooth.
Recently, there have been some algorithms, mentioned in what follows, to solve the problem (1.1) or (1.2). The orthogonal rank-one matrix pursuit (OR1MP) [33] was presented based on the orthogonal matching pursuit for wavelet decomposition [22], only the top singular value and the corresponding singular vector as well as a least square problem on Ω were computed, but the precision was poor. It is well known that a manifold of rank r matrix can be factorized into a bi-linear form: X = GH T with G ∈ R m×r , H ∈ R n×r , so some algorithms have been proposed to solve the problem (1.1) or (1.2), for example, Vandereycken [27] proposed the geometric conjugate gradient method and Newton method on Riemannian manifold; Mishra [21], Boumal and Absil [5] presented the Riemannian geometry method, i.e., the gradient descent scheme and trust-region scheme; Tanner and Wei [26] took advantage of the different descent directions for exact linear search respectively; the alternating direction methods were presented by Jain [15]; Keshavan [16] and Wen [38] presented gradient descent method and nonlinear successive over-relaxation algorithm respectively. Based on the shortest distance scheme in [32] and the alternating steepest descent algorithm in [26], the inner and outer iteration was presented in [37] and was proved to be feasible, but it depended on the inner iterative methods. The optimal low-rank approximate algorithm in [32] was showed to be effective, since the algorithm was simple and iterated repeatedly on the manifold of rank r matrix.
On the current practice, the sample matrix D often has a special structure such as the Hankel or Toeplitz structure and so on. The Hankel or Toeplitz matrix also plays important roles in many areas of engineering and applied science, especially in signal and image processing [3,14]. There are a lot algorithms which are effective for the low-rank Hankel or Toeplitz matrix completion problem. For example, the nuclear norm minimization for the low-rank Hankel matrix reconstruction problem under the random Gaussian sampling model was investigated in [7]. The common drawback of this otherwise very appealing convex optimization approach is the high computational complexity of solving the equivalent semi-definite programming (SDP). Cai et al. [6] developed a fast non-convex algorithm for a low-rank Hankel matrix completion by minimizing the distance between a low-rank matrix and a Hankel matrix with partial known anti-diagonals, and an accelerated variant has been developed by using Nesterov's memory technique. [9] came up with an iterative hard thresholding (IHT) and fast IHT (FIHT) algorithms for efficient reconstruction of spectrally sparse signals via low-rank Hankel matrix completion. By utilizing the low-rank structure of the Hankel matrix corresponding to a spectrally sparse signal x, [8] introduced a computationally efficient algorithm for the spectral compressed sensing problem. Chen and Chi [10] studied nuclear norm minimization for the low-rank Hankel matrix completion problem, the method proposed in [10] utilized convex relaxation and was theoretically guaranteed to work. Fazel et al. [11] proposed also the reconstruction of the low-rank Hankel matrices via nuclear norm minimization for system identification realization. Sebert et al. considered the Toeplitz block matrices as sensing matrices whose elements are drawn from the same distributions in [23]. Shaw et al. [24] presented algorithms for least-squares approximation of the Toeplitz and Hankel matrices from noise corrupted or ill-composed matrices, which may not have correct structural or rank properties. Wang et al. [29,30] proposed a mean value algorithm to force the Toeplitz structure and a structure-preserving algorithm for the Toeplitz matrix recovery. Wen et al. [34][35][36] gave some algorithms for the TMC problem. Further, Ying et al. [40] exploited the low-rank tensor structure of the signal when developing recovery problems for the multidimensional spectrally sparse signal recovery problems. For details, one can refer to the above mentioned algorithms and references given therein.
An n × n Toeplitz matrix T ∈ R n×n is of the following form: It is clear that the entries of T are constant along each diagonal. Based on the Lanczos method [13] and the FFT technique [28], an algorithm for the SVD of a Hankel matrix [19,39] has been presented; its complexity is O(n 2 log n) but that of the general SVD algorithm is O(n 3 ). It is significant to study the TMC problem because most of the popular algorithms need to compute the SVD. The fast SVD (denoted by lansvd) algorithm with O(n 2 log n) complexity has been applied to some recent algorithms for solving the TMC problem as inspired by [24]. That is also the interest of this study.
Then it motivates us to switch the iteration matrix into the Toeplitz matrix for solving the TMC problem. To exploit the structure well, in this paper, we modify the general matrix completion algorithm to the TMC problem by using mean projection approximation. The proposed algorithm ensures that the sequence of iteration matrices remains the Toeplitz structure so that the Lanczos SVD algorithm can always be used. It not only reduces the computational cost, but also improves the accuracy.
The rest of the paper is organized as follows. We present a new algorithm after giving an outline of the OLRMA [32], MOR1MP, and MEOR1MP [12] algorithms for solving the TMC problem in Sect. 2. Section 3 establishes the convergence results for the new algorithm given in Sect. 2. We compare our algorithm with the OLRMA, MOR1MP, and MEOR1MP algorithms through numerical experiments in Sect. 4. Finally, we conclude the paper with a short discussion in Sect. 5.
Before continuing, we provide here a brief summary of the necessary notations used throughout the paper. R is the set of real numbers, R m×n denotes the set of m × n real matrices. Likewise, T n×n denotes the set of real n×n Toeplitz matrices. X F is the F-norm of a matrix X. X, Y = trace(X T Y ) represents the standard inner product between two matrices ( X 2 F = X, X ), X T stands for the transpose of X. Ω ⊂ {1, 2, . . . , m} × {1, 2, . . . , n} is a subset of indices of the observed entries of the matrix X ∈ R m×n , andΩ represents the indices of the missing entries. The singular value decomposition (SVD) of a matrix A ∈ R m×n with rank r is where U ∈ R m×r and V ∈ R n×r are column orthogonal matrices, σ 1 ≥ σ 2 ≥ · · · ≥ σ r > 0. I n = (e 1 , e 2 , . . . , e n ) ∈ R n×n denotes the n × n identity matrix and S n = (e 2 , e 3 , . . . , e n , 0) ∈ R n×n is called the shift matrix where 0 is a zero-vector. It is clear that with "O" standing for a zero-matrix with the corresponding size. Thus, a Toeplitz matrix T ∈ T n×n , shown in (1.3), can be rewritten as a linear combination of these shift matrices, that is,

Description of algorithms
For the goal of completing comparison subsequently, we briefly review and introduce some algorithms for solving the TMC problem firstly. Here, the TMC problem is considered as follows: where D ∈ T n×n is the underlying Toeplitz matrix to be reconstructed, Ω ⊂ {-n + 1, . . . , n -1} is the indices of the observed diagonals of D,Ω is the complementary set of Ω. For X ∈ T n×n , the diag(X, l) denotes the vector formed by lth diagonal of X, l ∈ {-n + 1, . . . , n -1}, P Ω is the orthogonal projector on Ω, satisfying Likewise, diag(PΩ (X), l) can be defined as follows: Based on the OR1MP algorithm in [33], we came up with three schemes for solving the TMC problem in the following. For convenience, [U k , Σ k , V k ] = lansvd(Y k , r) denotes the SVD of the matrix Y k to find the top-r singular pairs by using the Lanczos method. Algorithm 2.1 ([12] (The MOR1MP algorithm for the TMC problem)) Given a sampled set Ω and the sampled entries P Ω (D) = Y Ω , ε = 10 -3 , maxiter. Given also the initial Toeplitz matrix X 0 = 0, k := 1. Step Step 2. Compute the weight vector θ k by using the closed form least squares solution Step Step 5. If k ≥ maxiter or R k F / Y Ω F ≤ ε, stop; otherwise set k := k + 1; go to Step 1. ([12] (The MEOR1MP algorithm for the TMC problem)) Given a sampled set Ω and the sampled entries P Ω (D) = Y Ω , ε = 10 -3 , maxiter. Given also the initial Toeplitz matrix X * 0 = 0, k := 1.

Algorithm 2.2
Step 2. Compute the optimal weights α k for X * k-1 and M k by solving Step 3. Compute Step 5. If k ≥ maxiter or R k F ≤ ε, stop; otherwise set k := k + 1; go to Step 1.

Remark
The main difference between the proposed algorithms and OR1MP is that the rank-one matrix M k was switched into the Toeplitz matrix by the mean value operator before solving the least square problem, which guarantees that X k and Y k are all Toeplitz matrices. So the computing time of the SVD is decreased due to using of the FFT technique. Then, in the iteration process, all the matrices are of the Toeplitz structure, which ensures lower computational cost even though the process of the mean value increases the computing time.

It is clear that M(A)
is a Toeplitz matrix derived from the matrix A. Namely, any A ∈ R n×n can be changed into the Toeplitz structure via the mean projection operator M(·), which is the best Toeplitz approximation under F-norm (see Theorem 3.1 in Sect. 3).
Next, we are to switch the approximation matrix X k in Algorithm 2.3 into the Toeplitz one by mean projection matrix for solving the TMC problem, and the new algorithm was presented as follows.

Remark
We can see that both the iteration sequences {Y k } obtained by Algorithms 2.3-2.4 are close to the underlying matrix. If the unknown matrix to be reconstructed D is of the Toeplitz structure, however, it is clear that the iteration sequence {Y k } with the Toeplitz structure in Algorithm 2.4 better approximates to the Toeplitz matrix. Further, the iteration in Algorithm 2.4 ensures lower computational cost even though the process of mean projection increases the computing time since the Lanczos SVD method can be used.

Convergence analysis
In this section, based on the structure of a Toeplitz matrix and some reasonable conditions, the convergence analysis of the new algorithm is given. Firstly, the distance between the feasible Toeplitz matrix and its projection onto the r-dimensional manifold is provided. The one-to-one correspondence between a matrix and its projection enables us to devise a notation of distance between a matrix and an r-dimensional manifold as follows.
is called distance between a matrix Y and an r-dimensional manifold X r . That is essentially the distance between matrix Y and its projection onto the r-dimensional manifold X r .
We trivially introduce the distance between a feasible matrix and its projection onto the r-dimensional manifold.
For P Ω (Y ) = P Ω (D), we call d(Y , r) = min dim(X)=r Y -X 2 F a distance between a feasible matrix Y and an r-dimensional manifold X r . Evidently,

Theorem 3.1 ([25]) For any matrix Y ∈ R n×n , the mean projection matrix M(Y ) presented in (2.2) is the solution of the following optimization problem:
Proof It is known that According to Lemma 3.2 and PΩ (Y k+1 ) = M(PΩ (X k )) for the Toeplitz matrix sequence Y k , we have = 0, and then (3.6) holds true.

Lemma 3.6
Suppose that the low-rank r is known in problem (2.1), if rank(X k ) = r, k ≥ k 0 (k 0 is a positive integer) and there exists a parameter 0 < θ < 1 such that Y k-1 -Y k F ≥ θ Y k-1 -X k-1 F . Then there exists a positive constant 0 < c < 1 such that Proof According to Theorem 3.5 and the assumptions of Lemma 3.6, we have and then (3.9) holds.

Theorem 3.7
Let {Y k } be the matrix sequence generated by Algorithm 2.4. And suppose that the low-rank r is known in problem (2.1), if rank(X k ) = r, k ≥ k 0 (k 0 is a positive integer) and there exists 0 (3.11) can be reformulated as by Lemma 3.2: Thus, we obtain the following inequality from Lemma 3.4: Suppose that (3.9) holds for km k and r -1. Then, from Algorithm 2.4 and Lemma 3.6, we have From Lemma 3.4, the following inequality holds true: Thus, (3.10) holds for k and r.

Numerical experiments
This In the experiments, we suppose that p = q/(2n -1) is an observation ratio, where q denotes the number of the observed entries and set p = 0.1, 0.3, 0.5, respectively. The true matrix is denoted by D,Ŷ stands for the output matrices. Here, we report the running time in seconds (denoted by CPU (s)), the iteration number (denoted by IT), and the error of the reconstruction matrix is Ŷ -D F / D F . We generate the Toeplitz matrices or general matrices of rank r by sampling uniformly at random, which is the same as the method in [12], and hence the true matrix D is indeed of low-rank.
Brief comparison results of four algorithms are provided in Table 1. The changes of the CPU time of four algorithms with respect to the different matrix size are shown in Fig. 1, and Fig. 2 shows the convergence behavior of the relative error in MOLRMA runs. Table 1 describes the comparison between the OLRMA, MOLRMA, MOR1MP, and MEOR1MP algorithms in iteration number, CPU time, and error when the rank of the completion matrix is not known and P Ω (D) -P Ω (X k ) F / P Ω (D) F ≤ ε = 10 -4 .
From Table 1 and Figs. 1-2, we can see that the MOLRMA algorithm performs better than the OLRMA algorithm in computing time, iteration number, and error. Compared with the MEOR1MP algorithm, the new algorithm is better than the MEOR1MP algorithm in error, computing time, and iteration number when p=0.3 and 0.5.

Conclusion
In this paper, we have presented a new algorithm for solving the TMC problem based on the mean projection matrix operator. The approximation matrices are of the Toeplitz structure so that the fast SVD is used in the whole iterative process. Meanwhile, the MOL-RMA algorithm has a much better performance in precision than the OLRMA, MOR1MP, and MEOR1MP algorithms, and it is also better in computing time and iteration number when the sampling ratio is large. It is exciting that the iteration number of the new algorithm is less than the rank of the objective matrix. In addition, the iterative error of the MOLRMA algorithm is inversely related to the sampling ratio, which has little relation to the rank of the completion matrix.