• Research
• Open Access

# The efficiency comparisons between OLSE and BLUE in a singular linear model

Journal of Inequalities and Applications 2013, 2013:17

https://doi.org/10.1186/1029-242X-2013-17

• Received: 12 August 2012
• Accepted: 21 December 2012
• Published:

## Abstract

This paper is mainly concerned with the efficiency comparison between OLSE and BLUE in a singular linear model. We define the efficiencies between OLSE and BLUE by means of the matrix Euclidean norm and prove a matrix Euclidean norm version of the Kantorovich inequality to establish upper and lower bounds for these efficiencies. This relaxes the assumptions that the covariance matrix is positive definite and that the design matrix has full column rank.

MSC:62J05, 62H05, 62H20.

## Keywords

• Euclidean norm
• least squares
• singular linear model
• Watson efficiency

## 1 Introduction

Inequalities are widely studied and utilized in many fields, such as matrix theory and statistics. In statistics, they are often used to compare the efficiency of two estimators. For example, Wang and Shao have discussed the efficiency comparison between the ordinary least squares estimator (OLSE) and the best linear unbiased estimator (BLUE) in linear models. In this paper, our goal is to compare the efficiencies of the OLSE and the BLUE in a singular linear model by using matrix norm versions of the Kantorovich inequality involving a nonnegative definite matrix.

Consider the following linear regression model:
$y=X\beta +\epsilon ,$
(1.1)

where $y\in {R}^{n}$ is the vector of n observations, $X\in {R}^{n×p}$ is the known design matrix, $\beta \in {R}^{p}$ is the unknown vector of regression coefficients and $\epsilon \in {R}^{n}$ is the error vector with mean vector zero and covariance matrix Σ.

When X has full column rank and Σ is assumed to be positive definite, it is well known that the best linear unbiased estimator (BLUE) of β can be expressed as
$\stackrel{˜}{\beta }={\left({X}^{\prime }{\mathrm{\Sigma }}^{-1}X\right)}^{-1}{X}^{\prime }{\mathrm{\Sigma }}^{-1}y$
(1.2)
and the ordinary least squares estimator (OLSE) of β is given by
$\stackrel{ˆ}{\beta }={\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }y.$
(1.3)
According to the Löwner ordering, we can easily deduce from (1.2) and (1.3) that $cov\left(\stackrel{ˆ}{\beta }\right)-cov\left(\stackrel{˜}{\beta }\right)\ge 0$, i.e., the difference is nonnegative definite. Since there is no unique way to measure how ‘bad’ the OLSE can be with respect to the BLUE, various criteria have been considered in the literature; see, e.g., . Among these criteria, the most frequently used measure is the Watson efficiency  defined as follows:
${\varphi }_{1}=\frac{|cov\left(\stackrel{˜}{\beta }\right)|}{|cov\left(\stackrel{ˆ}{\beta }\right)|}=\frac{{|{X}^{\prime }X|}^{2}}{|{X}^{\prime }\mathrm{\Sigma }X|\cdot |{X}^{\prime }{\mathrm{\Sigma }}^{-1}X|},$
(1.4)
where $|\cdot |$ indicates the determinant of the matrix concerned. The lower bound is provided by the Bloomfield-Watson-Knott inequality; see, e.g., [7, 8]. However, Yang and Wang  have shown that such a criterion is not always satisfactory and have provided an alternative form, defined as the ratio of the Euclidean norms (or Frobenius norms) of the corresponding covariance matrices:
${\varphi }_{2}=\frac{\parallel cov\left(\stackrel{˜}{\beta }\right)\parallel }{\parallel cov\left(\stackrel{ˆ}{\beta }\right)\parallel }=\frac{\parallel {\left({X}^{\prime }{\mathrm{\Sigma }}^{-1}X\right)}^{-1}\parallel }{\parallel {\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{-1}\parallel }.$
(1.5)
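As a numerical illustration of the two criteria (with a hypothetical simulated design matrix and positive definite covariance, not data from the paper), both efficiencies can be computed directly; the sketch below also checks the Löwner ordering and the determinant form of (1.4):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.standard_normal((n, p))            # hypothetical design matrix, full column rank
A = rng.standard_normal((n, n))
Sigma = A @ A.T + 0.1 * np.eye(n)          # hypothetical positive definite covariance

Sigma_inv = np.linalg.inv(Sigma)
cov_blue = np.linalg.inv(X.T @ Sigma_inv @ X)          # covariance of BLUE, from (1.2)
XtX_inv = np.linalg.inv(X.T @ X)
cov_olse = XtX_inv @ X.T @ Sigma @ X @ XtX_inv         # covariance of OLSE, from (1.3)

# Loewner ordering: cov(OLSE) - cov(BLUE) is nonnegative definite
assert np.linalg.eigvalsh(cov_olse - cov_blue).min() >= -1e-8

phi1 = np.linalg.det(cov_blue) / np.linalg.det(cov_olse)    # Watson efficiency (1.4)
phi2 = np.linalg.norm(cov_blue) / np.linalg.norm(cov_olse)  # norm-based efficiency (1.5)

# (1.4) also equals the determinant expression on its right-hand side
rhs = np.linalg.det(X.T @ X) ** 2 / (
    np.linalg.det(X.T @ Sigma @ X) * np.linalg.det(X.T @ Sigma_inv @ X))
assert np.isclose(phi1, rhs)
print(phi1, phi2)
```

Both ratios lie in (0, 1], with the value 1 attained exactly when the OLSE coincides with the BLUE.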

Many authors assume that the covariance matrix is nonsingular in their analysis of the classic linear model. However, this nonsingularity assumption can clearly limit the number of characteristics that may be included in the model. A few authors relax it and consider a singular linear model. For example, Liski et al.  and Liu  make efficiency comparisons between the OLSE and BLUE in a singular linear model. In the present paper, the singular linear model is studied further.

The Watson efficiency ${\varphi }_{1}$ has been generalized to a weakly singular model; see, e.g., . In the general case of the underlying singular linear model, however, it is of little interest because the denominator reduces to zero. In order to relax the assumptions on the ranks of X and Σ, we mainly discuss its alternative form based on the Euclidean norm [9, 12].

We hereinafter introduce some useful notation. Let the symbols ${A}^{\prime }$, ${A}^{-}$, ${A}^{+}$, $\mathcal{R}\left(A\right)$, $\mathcal{R}{\left(A\right)}^{\perp }$ and $rk\left(A\right)$ stand for the transpose, a generalized inverse, the Moore-Penrose inverse, the column space, the orthogonal complement of the column space and the rank of the matrix A, respectively. Moreover, write ${P}_{A}=A{A}^{+}=A{\left({A}^{\prime }A\right)}^{+}{A}^{\prime }$ and ${M}_{A}=I-{P}_{A}$; in particular, $H={P}_{X}$ and $M=I-H$. ${\lambda }_{i}\left(A\right)$ denotes the i-th largest eigenvalue of the matrix A.
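The projector notation above can be verified numerically; the following sketch uses a hypothetical rank-deficient matrix and NumPy's pseudoinverse to check the defining properties of ${P}_{A}$ and ${M}_{A}$:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical 6x4 matrix of rank 2
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

A_pinv = np.linalg.pinv(A)               # Moore-Penrose inverse A^+
P_A = A @ A_pinv                         # orthogonal projector onto R(A)
M_A = np.eye(6) - P_A                    # projector onto R(A)^perp

assert np.allclose(P_A, A @ np.linalg.pinv(A.T @ A) @ A.T)  # P_A = A(A'A)^+ A'
assert np.allclose(P_A, P_A.T)                              # symmetric
assert np.allclose(P_A @ P_A, P_A)                          # idempotent
assert np.allclose(M_A @ A, 0)                              # M_A annihilates R(A)
assert round(np.trace(P_A)) == np.linalg.matrix_rank(A)     # trace(P_A) = rk(A)
print("rank:", np.linalg.matrix_rank(A))
```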

## 2 A new Kantorovich-type inequality

We start with some lemmas which are very useful in the following.

Lemma 2.1 Let A be an $n×n$ complex matrix and ${\lambda }_{1},\dots ,{\lambda }_{n}$ be eigenvalues of A. Then we have
$\sum _{i=1}^{n}|{\lambda }_{i}{|}^{2}\le {\parallel A\parallel }^{2}$

and equality holds if and only if A is a normal matrix.

Proof The proof is straightforward; we therefore omit it here. □
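Lemma 2.1 is Schur's inequality. A quick numerical sketch (with an arbitrary random matrix) illustrates the inequality, together with the equality case for a normal (here symmetric) matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))          # arbitrary (generally non-normal) matrix

lhs = np.sum(np.abs(np.linalg.eigvals(A)) ** 2)   # sum of |lambda_i|^2
rhs = np.linalg.norm(A, 'fro') ** 2               # matrix Euclidean norm squared
assert lhs <= rhs + 1e-9                          # Schur's inequality

S = A + A.T                                       # symmetric, hence normal
assert np.isclose(np.sum(np.abs(np.linalg.eigvals(S)) ** 2),
                  np.linalg.norm(S, 'fro') ** 2)  # equality for a normal matrix
print(lhs, rhs)
```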

Lemma 2.2 Let A be an $n×n$ positive semidefinite Hermitian matrix and U be an orthogonal projection matrix with $rk\left(U\right)=k$. Then we have
${\lambda }_{n-k+i}\left(A\right)\le {\lambda }_{i}\left(AU\right)\le {\lambda }_{i}\left(A\right),\phantom{\rule{1em}{0ex}}i=1,\dots ,k.$

The proof can be found in . See also .

Lemma 2.3 (the Pólya and Szegö inequality) Suppose that $0<{m}_{1}\le {a}_{i}\le {M}_{1}$ and $0<{m}_{2}\le {b}_{i}\le {M}_{2}$ for $i=1,\dots ,n$. Then
$\left(\sum _{i=1}^{n}{a}_{i}^{2}\right)\left(\sum _{i=1}^{n}{b}_{i}^{2}\right)\le \frac{{\left({M}_{1}{M}_{2}+{m}_{1}{m}_{2}\right)}^{2}}{4{m}_{1}{m}_{2}{M}_{1}{M}_{2}}{\left(\sum _{i=1}^{n}{a}_{i}{b}_{i}\right)}^{2}.$
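A quick numerical sketch (with arbitrary positive sequences) checks both sides of this relation: the Cauchy-Schwarz lower bound and the Pólya-Szegö upper bound:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
a = rng.uniform(1.0, 3.0, n)             # hypothetical sequence bounded away from 0
b = rng.uniform(0.5, 2.0, n)
m1, M1 = a.min(), a.max()                # 0 < m1 <= a_i <= M1
m2, M2 = b.min(), b.max()                # 0 < m2 <= b_i <= M2

lhs = np.sum(a ** 2) * np.sum(b ** 2)
rhs = (M1 * M2 + m1 * m2) ** 2 / (4 * m1 * m2 * M1 * M2) * np.sum(a * b) ** 2

assert np.sum(a * b) ** 2 <= lhs + 1e-9  # Cauchy-Schwarz
assert lhs <= rhs + 1e-9                 # Polya-Szego
print(lhs, rhs)
```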

Lemma 2.4 Let A and B be two $n×n$ positive semidefinite Hermitian matrices, and U be an $n×k$ matrix, $\mathcal{R}\left(A\right)\subset \mathcal{R}\left(B\right)$, $rk\left(B\right)=q$, $rk\left(BU\right)=t$, then
${\lambda }_{q-t+i}\left({B}^{-}A\right)\le {\lambda }_{i}\left({\left({U}^{\ast }BU\right)}^{-}{U}^{\ast }AU\right)\le {\lambda }_{i}\left({B}^{-}A\right),\phantom{\rule{1em}{0ex}}i=1,\dots ,t.$

The proof can be found in . See also .

Theorem 2.5 Let A be an $n×n$ positive semidefinite Hermitian matrix and ${\lambda }_{1}\ge \cdots \ge {\lambda }_{s}>0$ ($s\le n$) be the ordered eigenvalues of A, and let U be an $n×p$ complex matrix such that ${U}^{\ast }U={I}_{p}$. If $p\le s$, we then have

Proof The proof is similar to that of Theorem 1 in ; we therefore omit it here. □

## 3 The comparison of efficiencies

The Watson efficiency [2, 16] and its decompositions  are usually used to measure the efficiency of the ordinary least squares estimator. However, Yang and Wang  show that this criterion does not always work well and propose the alternative form
$\rho =\frac{\parallel cov\left(X\stackrel{˜}{\beta }\right)\parallel }{\parallel cov\left(X\stackrel{ˆ}{\beta }\right)\parallel }=\frac{\parallel X{\left({X}^{\prime }{\mathrm{\Sigma }}^{-1}X\right)}^{-1}{X}^{\prime }\parallel }{\parallel X{\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }\parallel }.$
(3.1)

The above formula and its lower bound both require the covariance matrix Σ to be positive definite and the design matrix X to have full column rank. This assumption clearly limits the number of characteristics that may be included in the model. We here generalize this formula to the situation where the matrices X and Σ may have arbitrary rank.

In the following, we divide singular linear models into three categories according to the assumptions on X and Σ:

1. $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$, $rk\left(X\right)=p$, Σ is possibly singular;

2. $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$, $rk\left(X\right)<p$, Σ is possibly singular;

3. Σ is possibly singular.

From now on, we always assume $rk\left(\mathrm{\Sigma }\right)=s$ ($s<n$). Any given singular linear model can then be uniquely assigned to category i ($i=1,2,3$). Many authors have contributed to the relevant theory; see, e.g., [17, 18]. The general representations for the BLUE of $X\beta$ and its covariance matrix can be given respectively by
$X\stackrel{˜}{\beta }=X{\left({X}^{\prime }{W}^{-}X\right)}^{-}{X}^{\prime }{W}^{-}y,$
(3.2)
$cov\left(X\stackrel{˜}{\beta }\right)=X{\left({X}^{\prime }{W}^{-}X\right)}^{-}{X}^{\prime }-XU{X}^{\prime },$
(3.3)
where $W=\mathrm{\Sigma }+XU{X}^{\prime }$ and here $U\ge 0$ is an arbitrary matrix such that $\mathcal{R}\left(W\right)=\mathcal{R}\left(X:\mathrm{\Sigma }\right)$. In particular, ${\mathrm{\Sigma }}^{-}$ can play the same role as ${\mathrm{\Sigma }}^{-1}$ does when Σ is nonsingular as long as $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$. That is to say, ${W}^{-}$ can be replaced by ${\mathrm{\Sigma }}^{-}$ in this case. We then have
$cov\left(X\stackrel{˜}{\beta }\right)=X{\left({X}^{\prime }{\mathrm{\Sigma }}^{-}X\right)}^{-}{X}^{\prime }.$
(3.4)
The covariance matrix of the well-known OLSE of $X\beta$ is given by
$cov\left(X\stackrel{ˆ}{\beta }\right)=X{\left({X}^{\prime }X\right)}^{-}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{-}{X}^{\prime }=H\mathrm{\Sigma }H.$
(3.5)
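Equation (3.5) can be checked numerically even when Σ is singular; the sketch below uses a hypothetical rank-deficient covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 7, 3
X = rng.standard_normal((n, p))          # hypothetical design matrix
B = rng.standard_normal((n, 4))
Sigma = B @ B.T                          # hypothetical singular covariance, rank 4 < n

XtX_pinv = np.linalg.pinv(X.T @ X)
H = X @ XtX_pinv @ X.T                   # H = P_X

# cov(X beta_hat) = X(X'X)^- X' Sigma X (X'X)^- X' = H Sigma H, equation (3.5)
cov_olse = X @ XtX_pinv @ X.T @ Sigma @ X @ XtX_pinv @ X.T
assert np.allclose(cov_olse, H @ Sigma @ H)
print(np.linalg.matrix_rank(Sigma))
```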

In the following, we compare the efficiencies of the OLSE and the BLUE in a singular model according to the above categories.

Firstly, we discuss category (1). The matrix product ${X}^{\prime }{\mathrm{\Sigma }}^{-}X$ is invariant with respect to the choice of generalized inverse ${\mathrm{\Sigma }}^{-}$ because of the column space inclusion $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$. Applying the rank rule for a matrix product, we can get that
$rk\left({X}^{\prime }\mathrm{\Sigma }X\right)=rk\left(\mathrm{\Sigma }X\right)=rk\left(X\right)-dim\mathcal{R}\left(X\right)\cap \mathcal{R}{\left(\mathrm{\Sigma }\right)}^{\perp }=p.$
(3.6)
Note that $\mathcal{R}\left(\mathrm{\Sigma }\right)=\mathcal{R}\left({\mathrm{\Sigma }}^{+}\right)$, and then we can conclude that ${X}^{\prime }\mathrm{\Sigma }X$ and ${X}^{\prime }{\mathrm{\Sigma }}^{+}X$ are both nonsingular. In the literature, such a model is often regarded as a weakly singular model or the Zyskind-Martin model . In this model, the relative efficiency ρ becomes
${\rho }_{1}=\frac{\parallel cov\left(X\stackrel{˜}{\beta }\right)\parallel }{\parallel cov\left(X\stackrel{ˆ}{\beta }\right)\parallel }=\frac{\parallel X{\left({X}^{\prime }{\mathrm{\Sigma }}^{+}X\right)}^{-1}{X}^{\prime }\parallel }{\parallel X{\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{-1}{X}^{\prime }\parallel }.$
(3.7)

It is easy to prove that ${\rho }_{1}\le 1$. The following theorem gives its lower bound.

Theorem 3.1 In the linear regression model (1.1), let ${\lambda }_{1}\ge \cdots \ge {\lambda }_{s}$ ($s<n$) be the ordered eigenvalues of Σ and X be an $n×p$ design matrix with $rk\left(X\right)=p$, $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$. Then we have
${\rho }_{1}\ge \frac{2p\sqrt{{\lambda }_{1}{\lambda }_{p}{\lambda }_{s-p+1}{\lambda }_{s}}}{\left({\lambda }_{1}{\lambda }_{s-p+1}+{\lambda }_{s}{\lambda }_{p}\right){\sum }_{i=1}^{p}\frac{{\lambda }_{i}}{{\lambda }_{s-p+i}}}.$
Proof There exists an orthogonal matrix P such that $\mathrm{\Sigma }=P\mathrm{\Lambda }{P}^{\prime }$, and hence ${\mathrm{\Sigma }}^{+}=P{\mathrm{\Lambda }}^{+}{P}^{\prime }$, where $\mathrm{\Lambda }=diag\left({\lambda }_{1},\dots ,{\lambda }_{s},0,\dots ,0\right)$. Using Theorem 2.5, the result in Theorem 3.1 can be established. □
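Theorem 3.1 can be illustrated numerically on a hypothetical weakly singular model constructed so that $\mathcal{R}(X)\subset \mathcal{R}(\Sigma)$ and $rk(X)=p$; the sketch computes ${\rho}_{1}$ from (3.7) together with the stated lower bound:

```python
import numpy as np

rng = np.random.default_rng(5)
n, s, p = 8, 5, 2
lam = np.array([2.0, 1.8, 1.5, 1.2, 1.0])         # hypothetical ordered eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((n, s)))  # orthonormal basis of R(Sigma)
Sigma = Q @ np.diag(lam) @ Q.T                    # rk(Sigma) = s < n, singular
X = Q @ rng.standard_normal((s, p))               # R(X) in R(Sigma), rk(X) = p

# relative efficiency rho_1, equation (3.7)
num = np.linalg.norm(X @ np.linalg.inv(X.T @ np.linalg.pinv(Sigma) @ X) @ X.T)
XtX_inv = np.linalg.inv(X.T @ X)
den = np.linalg.norm(X @ XtX_inv @ X.T @ Sigma @ X @ XtX_inv @ X.T)
rho1 = num / den

# lower bound of Theorem 3.1 (lam[i-1] is lambda_i)
bound = (2 * p * np.sqrt(lam[0] * lam[p - 1] * lam[s - p] * lam[s - 1])
         / ((lam[0] * lam[s - p] + lam[s - 1] * lam[p - 1])
            * np.sum(lam[:p] / lam[s - p:s])))

print(rho1, bound)
assert bound <= rho1 <= 1 + 1e-9
```

When all positive eigenvalues of Σ coincide, both ${\rho}_{1}$ and the bound equal 1, so the bound is attained.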

Secondly, we consider category (2). Let $rk\left(X\right)=r$ ($r<p$). Using equation (3.6), we can get that $rk\left({X}^{\prime }\mathrm{\Sigma }X\right)=rk\left({X}^{\prime }{\mathrm{\Sigma }}^{+}X\right)=r$. Analogously, the relative efficiency ρ becomes
${\rho }_{2}=\frac{\parallel cov\left(X\stackrel{˜}{\beta }\right)\parallel }{\parallel cov\left(X\stackrel{ˆ}{\beta }\right)\parallel }=\frac{\parallel X{\left({X}^{\prime }{\mathrm{\Sigma }}^{+}X\right)}^{+}{X}^{\prime }\parallel }{\parallel X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }\parallel }.$
(3.8)

It is easy to prove that ${\rho }_{2}\le 1$. The following theorem gives its lower bound.

Theorem 3.2 In the linear regression model (1.1), let ${\lambda }_{1}\ge \cdots \ge {\lambda }_{s}$ ($s<n$) be the ordered eigenvalues of Σ and X be an $n×p$ design matrix with $rk\left(X\right)=r$ ($r<p$), $\mathcal{R}\left(X\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$. We then have
${\rho }_{2}\ge \frac{2r\sqrt{{\lambda }_{1}{\lambda }_{r}{\lambda }_{s-r+1}{\lambda }_{s}}}{\left({\lambda }_{1}{\lambda }_{s-r+1}+{\lambda }_{s}{\lambda }_{r}\right){\sum }_{i=1}^{r}\frac{{\lambda }_{i}}{{\lambda }_{s-r+i}}}.$
Proof It is easy to prove that
$X{\left({X}^{\prime }{\mathrm{\Sigma }}^{+}X\right)}^{+}{X}^{\prime }={\left(H{\mathrm{\Sigma }}^{+}H\right)}^{+}.$

The rest of the proof is similar to that of Theorem 3.1, so we omit it here. □

Finally, we consider category (3). Owing to (3.3), we may write
$cov\left(X\stackrel{˜}{\beta }\right)=cov\left(X\stackrel{ˆ}{\beta }\right)-H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H.$
Therefore, we define
${\rho }_{3}=\frac{\parallel cov\left(X\stackrel{ˆ}{\beta }\right)-cov\left(X\stackrel{˜}{\beta }\right)\parallel }{\parallel cov\left(X\stackrel{ˆ}{\beta }\right)\parallel }=\frac{\parallel H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\parallel }{\parallel H\mathrm{\Sigma }H\parallel }.$
(3.9)
Let $dim\mathcal{R}\left(X\right)\cap \mathcal{R}\left(\mathrm{\Sigma }\right)=g$ ($g\ge 0$) and $rk\left(X\right)=r$ ($r\le p$). Note that
$\begin{array}{rcl}rk\left({X}^{\prime }\mathrm{\Sigma }X\right)& =& rk\left({X}^{\prime }X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }X\right)\\ \le & rk\left(X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }\mathrm{\Sigma }X{\left({X}^{\prime }X\right)}^{+}{X}^{\prime }\right)\le rk\left({X}^{\prime }\mathrm{\Sigma }X\right).\end{array}$
Due to equation (3.6), we have
$rk\left(H\mathrm{\Sigma }H\right)=rk\left(X\right)-dim\mathcal{R}\left(X\right)\cap \mathcal{R}{\left(\mathrm{\Sigma }\right)}^{\perp }.$
(3.10)
Since
$\mathcal{R}\left(X\right)=\mathcal{R}\left(X\right)\cap \left(\mathcal{R}\left(\mathrm{\Sigma }\right)\oplus \mathcal{R}{\left(\mathrm{\Sigma }\right)}^{\perp }\right)=\left(\mathcal{R}\left(X\right)\cap \mathcal{R}\left(\mathrm{\Sigma }\right)\right)\oplus \left(\mathcal{R}\left(X\right)\cap \mathcal{R}{\left(\mathrm{\Sigma }\right)}^{\perp }\right),$
we then have
$rk\left(H\mathrm{\Sigma }H\right)=r-\left(r-g\right)=g.$
(3.11)
Similarly, we can obtain that
$rk\left(M\mathrm{\Sigma }M\right)=rk\left(\mathrm{\Sigma }M\right)=rk\left(M\right)-dim\mathcal{R}\left(M\right)\cap \mathcal{R}{\left(\mathrm{\Sigma }\right)}^{\perp }=dim\mathcal{R}\left(M\right)\cap \mathcal{R}\left(\mathrm{\Sigma }\right).$
(3.12)
In view of $\mathcal{R}\left(M\right)=\mathcal{R}{\left(X\right)}^{\perp }$ and
$\mathcal{R}\left(\mathrm{\Sigma }\right)=\mathcal{R}\left(\mathrm{\Sigma }\right)\cap \left(\mathcal{R}\left(X\right)\oplus \mathcal{R}{\left(X\right)}^{\perp }\right)=\left(\mathcal{R}\left(\mathrm{\Sigma }\right)\cap \mathcal{R}\left(X\right)\right)\oplus \left(\mathcal{R}\left(\mathrm{\Sigma }\right)\cap \mathcal{R}{\left(X\right)}^{\perp }\right),$
we can get that
$rk\left(M\mathrm{\Sigma }M\right)=rk\left(\mathrm{\Sigma }M\right)=s-g.$
(3.13)
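The rank identities (3.11) and (3.13) can be checked on a small hypothetical example in which one column of X lies in $\mathcal{R}(\Sigma)$ and the other in $\mathcal{R}{(\Sigma)}^{\perp}$, so that $g=1$:

```python
import numpy as np

n, s = 8, 5
lam = np.array([2.0, 1.5, 1.2, 1.0, 0.8])      # hypothetical positive eigenvalues
Sigma = np.zeros((n, n))
Sigma[:s, :s] = np.diag(lam)                   # rk(Sigma) = s = 5

e = np.eye(n)
X = np.column_stack([e[:, 0] + e[:, 1],        # direction inside R(Sigma)
                     e[:, 5]])                 # direction inside R(Sigma)^perp
g, r = 1, 2                                    # dim R(X) cap R(Sigma) = 1, rk(X) = 2

H = X @ np.linalg.pinv(X.T @ X) @ X.T          # H = P_X
M = np.eye(n) - H

assert np.linalg.matrix_rank(X) == r
assert np.linalg.matrix_rank(H @ Sigma @ H) == g        # equation (3.11)
assert np.linalg.matrix_rank(M @ Sigma @ M) == s - g    # equation (3.13)
print("ranks check out")
```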
Theorem 3.3 In the linear regression model (1.1), let ${\lambda }_{1}\ge \cdots \ge {\lambda }_{s}$ ($s<n$) be the ordered eigenvalues of Σ and X be an $n×p$ design matrix with $rk\left(X\right)=r$ ($r\le p$), $dim\mathcal{R}\left(X\right)\cap \mathcal{R}\left(\mathrm{\Sigma }\right)=g$ and $rk\left(H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\right)=h$. We then have
Proof For convenience, let $a=\parallel H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\parallel$ and $b=\parallel H\mathrm{\Sigma }H\parallel$. Note that $H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H$ is invariant with respect to the choice of generalized inverse ${\left(M\mathrm{\Sigma }M\right)}^{-}$. From Lemma 2.1, we can easily get that
${a}^{2}=\sum _{i=1}^{h}{\lambda }_{i}^{2}\left(H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\right)=\sum _{i=1}^{h}{\lambda }_{i}^{2}\left({\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\mathrm{\Sigma }M\right).$
Obviously, $h=rk\left(H\mathrm{\Sigma }M{\left(M\mathrm{\Sigma }M\right)}^{-}M\mathrm{\Sigma }H\right)\le rk\left(\mathrm{\Sigma }H\right)=g$ and $h\le rk\left(\mathrm{\Sigma }M\right)=s-g$. Since Σ and $\mathrm{\Sigma }H\mathrm{\Sigma }$ are positive semidefinite matrices and $\mathcal{R}\left(\mathrm{\Sigma }H\mathrm{\Sigma }\right)\subset \mathcal{R}\left(\mathrm{\Sigma }\right)$, we can derive from Lemma 2.4 that
${a}^{2}\le \sum _{i=1}^{h}{\lambda }_{i}^{2}\left({\mathrm{\Sigma }}^{-}\mathrm{\Sigma }H\mathrm{\Sigma }\right)=\sum _{i=1}^{h}{\lambda }_{i}^{2}\left(\mathrm{\Sigma }{\mathrm{\Sigma }}^{-}\mathrm{\Sigma }H\right)=\sum _{i=1}^{h}{\lambda }_{i}^{2}\left(\mathrm{\Sigma }H\right).$
Here H is an orthogonal projection matrix, and then we obtain from Lemma 2.2 that
${a}^{2}\le \sum _{i=1}^{h}{\lambda }_{i}^{2}\left(\mathrm{\Sigma }\right)\le \sum _{i=1}^{h}{\lambda }_{i}^{2}.$
Furthermore, since $H\mathrm{\Sigma }H$ is a Hermitian matrix, by Lemma 2.1, we can get that
${b}^{2}=\sum _{i=1}^{g}{\lambda }_{i}^{2}\left(H\mathrm{\Sigma }H\right)=\sum _{i=1}^{g}{\lambda }_{i}^{2}\left(\mathrm{\Sigma }H\right).$
The Sylvester rank inequality (see, e.g., ) shows that $g\ge s+r-n$. Analogously, we can get that
${b}^{2}\ge \sum _{i=1}^{s+r-n}{\lambda }_{n-r+i}^{2}.$
Applying the well-known arithmetic-harmonic mean inequality, we have
$\frac{1}{{b}^{2}}\le \frac{1}{{\left(s+r-n\right)}^{2}}\sum _{i=1}^{s+r-n}{\lambda }_{n-r+i}^{-2}.$
Firstly, we suppose that $h\le s+r-n$. We can then compute that
${\rho }_{3}^{2}=\frac{{a}^{2}}{{b}^{2}}\le \frac{1}{{\left(s+r-n\right)}^{2}}\sum _{i=1}^{s+r-n}{\lambda }_{i}^{2}\sum _{i=1}^{s+r-n}{\lambda }_{n-r+i}^{-2}.$

By the Pólya and Szegö inequality and a nontrivial but elementary combinatorial argument, we can establish the first inequality. The second inequality follows similarly. □

## 4 Conclusions

In this article, we used several matrix norm versions of the Kantorovich inequality involving a nonnegative definite matrix to compare the efficiencies of the OLSE and the BLUE in a singular linear model. The singular linear model was divided into three categories according to the assumptions on the ranks of X and Σ. We introduced some new relative efficiency criteria, and their lower or upper bounds were given via matrix norm inequalities in Theorems 3.1, 3.2 and 3.3.

## Declarations

### Acknowledgements

The authors thank the associate editors and referees very much for their insightful comments, which led to an improved presentation. This work is partly supported by the National Natural Science Foundation of China (No. 11126211; No. 61201398) and the Natural Science Foundation of Zhejiang Province (No. LQ12A01021).

## Authors’ Affiliations

(1)
Department of Applied Mathematics, Zhejiang University of Technology, Hangzhou, China
(2)
College of Mechanical Electrical Engineering, Zhejiang University of Technology, Hangzhou, China

## References 