On some bounds on the perturbation of invariant subspaces of normal matrices with application to a graph connection problem

Abstract

We provide upper bounds on the perturbation of invariant subspaces of normal matrices measured using a metric on the space of vector subspaces of \(\mathbb{C}^{n}\) in terms of the spectrum of both unperturbed and perturbed matrices as well as the spectrum of the unperturbed matrix only. The results presented give tighter bounds than the Davis–Kahan sinΘ theorem. We apply the result to a graph perturbation problem.

1 Introduction

Classical results on perturbation of invariant subspaces of a matrix usually take one of the two forms: (1) perturbation measured in terms of a natural metric in the space of vector subspaces (usually expressed as the sine of the angle between subspaces) with upper bound described in terms of the perturbation in the matrices as well as the spectra of both unperturbed and perturbed matrices (for example, the Davis–Kahan sinΘ theorem [1] – see Section VIII.3 of [2] where a generalization of this theorem is given for normal matrices); or (2) perturbation measured in terms of bounds on norms of matrices that relate an invariant subspace with its perturbation in a more complex manner (which, in general, is not a natural metric in the space of vector subspaces) although the upper bound is based on the spectrum of the unperturbed matrix only (see, for example, [1, 3] or Chapter V of [4]).

In this paper we first derive an upper bound reminiscent of the Davis–Kahan sinΘ theorem, but generalized for normal matrices and with a modestly tighter bound (Proposition 1). Then we use some geometric methods to derive a bound on perturbation measured in terms of a natural metric in the space of subspaces, but with upper bounds in terms of the spectrum of the unperturbed matrix only (Proposition 2) when the spectrum is well clustered (a relation formally described as “separation-preserving perturbation”). In the latter case our proposed result also allows easy identification of the perturbed invariant subspace (Lemma 7).

Definition 1

(Notations)

Throughout the paper we assume \(M, \widetilde{M} \in \mathbb{C}^{n\times n}\) to be normal matrices unless specified otherwise, and by “eigenvectors” we refer to their right eigenvectors. The eigenvalues (not necessarily distinct) and corresponding unit eigenvectors (for degenerate eigenspaces, any orthonormal basis thereof) of M are \(\lambda _{j}\) and \(\mathbf{u}_{j}\) for \(j=1,2,\ldots ,n\). Likewise, the eigenvalues and corresponding unit eigenvectors of \(\widetilde{M}\) are \(\widetilde{\lambda }_{j}\) and \(\widetilde{\mathbf{u}}_{j}\) for \(j=1,2,\ldots ,n\). We will usually consider the eigenvectors to be column vectors in \(\mathbb{C}^{n\times 1}\). Let \(U = [\mathbf{u}_{1},\mathbf{u}_{2},\ldots ,\mathbf{u}_{n}]\) and \(\widetilde{U} = [\widetilde{\mathbf{u}}_{1},\widetilde{\mathbf{u}}_{2}, \ldots ,\widetilde{\mathbf{u}}_{n}]\) be the unitary matrices that diagonalize M and \(\widetilde{M}\) respectively. A dagger as superscript on a matrix or a vector, \({(\cdot )}^{{\dagger }}\), denotes the conjugate transpose (Hermitian transpose) of the matrix or vector. For notational convenience, define \(N = \{1,2,\ldots ,n\}\).

As a convention, we choose primed lower-case Latin letters to index variables (eigenvalues or eigenvectors) with tilde on them. Given a set \(S \subseteq N\), we define the set \(\mathbf{u}_{S} = \{\mathbf{u}_{j} | j\in S\}\). Likewise \(\widetilde{\mathbf{u}}_{S} = \{\widetilde{\mathbf{u}}_{j'} | {j'} \in S\}\). Define the multi-sets \(\lambda _{S} = \{\lambda _{j} | j\in S\}\) and \(\widetilde{\lambda }_{S} = \{\widetilde{\lambda }_{j'} | {j'} \in S\}\) (by asserting that these are multi-sets, we allow multiplicity in the values, thus ensuring these sets have the same number of elements as S). We also define the complement of S as \(S^{c} = N-S\).

The outline of the paper is as follows:

  1. 1

    In Sect. 2.1 we describe a natural metric \(d_{\mathrm{{sp}}}\) on \(\operatorname{Gr}(q,\mathbb{C}^{n})\) (the space of q-dimensional complex vector subspaces of \(\mathbb{C}^{n}\)) to measure perturbation of invariant subspaces of \(n\times n\) normal matrices. This metric is equivalent to the Frobenius norm of the sinΘ matrix between subspaces of \(\mathbb{C}^{n}\).

  2. 2

    Some geometry lemmas are proven in Sect. 2.2, and then they are used in Sect. 3.3 for deriving bounds on the perturbation of invariant subspaces in terms of the spectrum of the unperturbed matrix only (when the spectrum is well clustered).

  3. 3

    In Sect. 3.2 we describe an upper bound on the distance between invariant subspaces in terms of the spectrum of both unperturbed and perturbed matrices. Some of these results give improvements on the Davis–Kahan sinΘ theorem for normal matrices (although the Davis–Kahan sinΘ theorem is usually stated for Hermitian matrices, there exist generalizations of the theorem for normal matrices – see Section VIII.3 of [2]). As an example (see Fig. 1), for any \(J,\widetilde{J} \subseteq N\), with \(|J| = |\widetilde{J}| = q\), Proposition 1 states

    $$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq \sqrt{ \frac{1}{q} \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda }_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in{\widetilde{J}}^{c}} \vert \widetilde{\lambda }_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda }_{{j'}} - \lambda _{j} \vert ^{2} } } \end{aligned}$$

    with

    $$ \kappa _{j} = \textstyle\begin{cases} 0, & \text{if } \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} \geq \min_{j' \in{\widetilde{J}}^{c}} \vert \widetilde{\lambda }_{{j'}} - \lambda _{j} \vert , \\ 1, & \text{if } \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} < \min_{j' \in{\widetilde{J}}^{c}} \vert \widetilde{\lambda }_{{j'}} - \lambda _{j} \vert . \end{cases} $$

    This is a tighter upper bound than the Davis–Kahan sinΘ theorem, which, as a consequence, leads to the rediscovery of a couple of slight variations on the Davis–Kahan sinΘ theorem in Corollary 5, where, as an example, one result states

    $$ d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq \frac{ \min ( 1, \sqrt{\frac{n-q}{q}} ) }{ \max ( \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda }_{{\widetilde{J}}^{c}} ), \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda }_{\widetilde{J}} ) ) } \Vert \widetilde{M} - M \Vert _{2}, $$

    where \(\operatorname{sep}(P,Q) = \min_{p\in P, q\in Q}|p-q|\) simply measures the min-min distance between the sets (this is unlike the Davis–Kahan sinΘ theorem generalized for normal matrices, where it is necessary to find a ‘strip’ or ‘annulus’ of width δ separating \(\lambda _{J}\) and \(\widetilde{\lambda }_{{\widetilde{J}}^{c}}\) – see Theorem VIII.3.1 of [2]).

    Figure 1: Partition of the eigenvalues of M (in blue) and \(\widetilde{M}\) (in red)

  4. 4

    The next set of the main results of this paper appears in Sect. 3.3, which formalizes the notion of well-clustered spectrum in Lemma 7, followed by Proposition 2 that provides the upper bound on the perturbation of an invariant subspace in terms of the spectrum of the unperturbed matrix only. These results rely on the geometry lemmas from Sect. 2.2. As an example, one of the results of Proposition 2 states that if \(\|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\), then

    $$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) \bigr) \\ & \quad \leq \frac{1}{\sqrt{q}} \min \biggl( \sqrt{ \sum _{j\in J} \biggl( \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} }{ \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} } \biggr)^{ 2}} , \\ & \qquad \sqrt{ \sum_{j\in {J^{c}}} \biggl( \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} }{ \min_{k\in J} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} } \biggr)^{ 2} } \biggr) \\ & \quad \leq \min \biggl( 1, \sqrt{\frac{n-q}{q}} \biggr) \frac{ \Vert \widetilde{M} - M \Vert _{2} }{ \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} }, \end{aligned}$$

    where \(\widehat{J} = \{j' | \min_{j\in N} |\widetilde{\lambda }_{j'} - \lambda _{j}| = \min_{j\in J} |\widetilde{\lambda }_{j'} - \lambda _{j}| \}\) is the set of indices corresponding to the eigenvalues of \(\widetilde{M}\) that are closer to \(\lambda _{J}\) than to \(\lambda _{J^{c}}\).

  5. 5

    Sect. 4 demonstrates an application to the perturbation of a null-space of a matrix in the context of a graph perturbation problem.
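As a numerical illustration of the bounds quoted in the outline above, the following Python sketch builds a small Hermitian (hence normal) test matrix with two well-separated eigenvalue clusters, perturbs it, and compares \(d_{\mathrm{{sp}}}\) with the right-hand sides of the Proposition 1 and Proposition 2 inequalities as stated above. The matrix size, the chosen eigenvalues, the perturbation scale, and all variable names are assumptions made for illustration only; the restriction to the Hermitian case is a convenience so that `numpy.linalg.eigh` returns orthonormal eigenvectors directly.

```python
import numpy as np

def d_sp(X, Y):
    """Subspace distance (1) from orthonormal bases stored as columns."""
    q = X.shape[1]
    diff = X @ X.conj().T - Y @ Y.conj().T
    return np.linalg.norm(diff, "fro") / np.sqrt(2 * q)

rng = np.random.default_rng(0)
n = 5
# Assumed test spectrum: clusters {0, 0.1, 0.2} and {5, 5.3}.
lam_target = np.array([0.0, 0.1, 0.2, 5.0, 5.3])
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, _ = np.linalg.qr(Z)                      # random unitary matrix
M = W @ np.diag(lam_target) @ W.conj().T
M = (M + M.conj().T) / 2                    # remove round-off asymmetry

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
E = (A + A.conj().T) / 2                    # Hermitian perturbation
E *= 0.5 / np.linalg.norm(E, 2)             # spectral norm set to 0.5
M_t = M + E

lam, U = np.linalg.eigh(M)                  # ascending eigenvalues
lam_t, U_t = np.linalg.eigh(M_t)

J, Jc = np.array([0, 1, 2]), np.array([3, 4])
q = len(J)
norm_E = np.linalg.norm(E, 2)
sep_J = np.abs(lam[J][:, None] - lam[Jc][None, :]).min()

# J_hat: perturbed eigenvalues whose nearest unperturbed eigenvalue is in lambda_J.
dist = np.abs(lam_t[:, None] - lam[None, :])          # dist[j', j]
J_hat = np.where(dist.min(axis=1) == dist[:, J].min(axis=1))[0]
J_hat_c = np.setdiff1d(np.arange(n), J_hat)

d = d_sp(U[:, J], U_t[:, J_hat])
e = np.linalg.norm(E @ U, axis=0)                     # ||(M_t - M) u_j||_2

# Proposition 1 right-hand side with J_tilde = J_hat.
a = dist[J_hat_c][:, J].min(axis=0)                   # min over J_hat^c
b = dist[J_hat][:, J].min(axis=0)                     # min over J_hat
kappa = (e[J] < a).astype(float)
bound_prop1 = np.sqrt(np.mean((e[J] ** 2 - kappa * b ** 2)
                              / (a ** 2 - kappa * b ** 2)))

# Proposition 2 right-hand sides (unperturbed spectrum only).
def one_sided(S, Sc):
    gaps = np.abs(lam[S][:, None] - lam[Sc][None, :]).min(axis=1) - norm_E
    return np.sqrt(np.sum((e[S] / gaps) ** 2))

bound_prop2_fine = min(one_sided(J, Jc), one_sided(Jc, J)) / np.sqrt(q)
bound_prop2_coarse = min(1.0, np.sqrt((n - q) / q)) * norm_E / (sep_J - norm_E)

print(d, bound_prop1, bound_prop2_fine, bound_prop2_coarse)
```

In this setup the hypothesis \(\|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\) holds by construction, so the printed distance should not exceed any of the three bounds.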

2 Preliminaries

2.1 A metric on \(\operatorname{Gr}(q,\mathbb{C}^{n})\)

Definition 2

(Subspace distance)

Suppose that \(X,Y \subseteq \mathbb{C}^{n}\) are q-dimensional vector sub-spaces of \(\mathbb{C}^{n}\).

Let \(\{\mathbf{x}_{j}\}_{j=1,2,\ldots ,q}\) and \(\{\mathbf{y}_{j}\}_{j=1,2,\ldots ,q}\) be orthonormal bases on X and Y. The subspace distance between X and Y is defined as

$$ d_{\mathrm{sp}} (X, Y) = \frac{1}{\sqrt{2q}} \bigl\Vert \mathbf{X} \mathbf{X}^{{ \dagger }}- \mathbf{Y}\mathbf{Y}^{{\dagger }} \bigr\Vert _{F}, $$
(1)

where

$$ \mathbf{X} = [\mathbf{x}_{1}, \mathbf{x}_{2}, \ldots , \mathbf{x}_{q}] \quad \text{and}\quad \mathbf{Y} = [ \mathbf{y}_{1}, \mathbf{y}_{2}, \ldots , \mathbf{y}_{q}] $$
(2)

are the \(n\times q\) matrices in which the columns represent the unit vectors \(\{\mathbf{x}_{j}\}_{j=1,2,\ldots ,q}\) and \(\{\mathbf{y}_{j}\}_{j=1,2,\ldots ,q}\). \(\mathbf{X}^{{ \dagger }}\) and \(\mathbf{Y}^{{\dagger }}\) are the Hermitian transposes (i.e., adjoints) of \(\mathbf{X}\) and \(\mathbf{Y}\) respectively.

Note that the matrices \(\mathbf{X}\mathbf{X}^{{\dagger }}\) and \(\mathbf{Y}\mathbf{Y}^{{\dagger }}\) are the projection operators on X and Y respectively. The space of difference of such projection operators is well studied in the literature (see [6, 7] for example), and the norms of such differences have been used as a metric on \(\operatorname{Gr}(q,\mathbb{C}^{n})\) (see [8] for example). In fact this metric is equivalent to the Frobenius norm of the sinΘ matrix between subspaces of \(\mathbb{C}^{n}\) that is used for measuring perturbation of invariant subspaces in the context of the Davis–Kahan sinΘ theorem. We choose the Frobenius norm for measuring the distance between the projection operators and use a scaling factor of \(\frac{1}{\sqrt{2q}}\) for convenience and some additional properties of the metric. The following lemmas outline some elementary and mostly standard properties of this metric.

Let \(X^{\perp}\) and \(Y^{\perp}\) be the orthogonal complements of X and Y respectively in \(\mathbb{C}^{n}\). Let \(\{\mathbf{x}_{j}\}_{j=q+1,q+2,\ldots ,n}\) and \(\{\mathbf{y}_{k}\}_{k=q+1,q+2,\ldots ,n}\) be orthonormal bases for \(X^{\perp}\) and \(Y^{\perp}\) respectively. Define

$$ \mathbf{X}^{\perp}= [\mathbf{x}_{q+1}, \mathbf{x}_{q+2}, \ldots , \mathbf{x}_{n}] \quad \text{and} \quad \mathbf{Y}^{\perp}= [\mathbf{y}_{q+1}, \mathbf{y}_{q+2}, \ldots , \mathbf{y}_{n}]. $$
(3)

Lemma 1

(Equivalent forms of \(d_{\mathrm{sp}}\))

  1. 1

    \(d_{\mathrm{sp}} (X, Y) = \sqrt{1 - \frac{1}{q} \| \mathbf{X}^{{\dagger }}\mathbf{Y}\|_{F}^{2}} = \sqrt{1 - \frac{1}{q} \sum_{j=1}^{q} \sum_{k=1}^{q} \vert \mathbf{x}_{j}^{{ \dagger }}\mathbf{y}_{k} \vert ^{2}}\)

  2. 2

    \(d_{\mathrm{sp}} (X, Y) = \sqrt{ \frac{1}{q} \|{ \mathbf{X}^{\perp}}^{{\dagger }} \mathbf{Y}\|_{F}^{2}} = \sqrt{ \frac{1}{q} \sum_{j=q+1}^{n} \sum_{k=1}^{q} \vert \mathbf{x}_{j}^{{ \dagger }}\mathbf{y}_{k} \vert ^{2}}\)

Proof

  1. 1

    In the following we use the definition \(\|\mathbf{A}\|_{F}^{2} = \operatorname{tr}(\mathbf{A}^{{\dagger }} \mathbf{A})\) and the property that \(\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{B}\mathbf{A})\).

    $$\begin{aligned} & \bigl( d_{\mathrm{sp}} (X, Y) \bigr)^{2} \\ &\quad = \frac{1}{2q} \bigl\Vert \mathbf{X}\mathbf{X}^{{\dagger }}- \mathbf{Y} \mathbf{Y}^{{\dagger }} \bigr\Vert _{F}^{2} \\ &\quad = \frac{1}{2q} \operatorname{tr} \bigl( \bigl( \mathbf{X} \mathbf{X}^{{ \dagger }}- \mathbf{Y}\mathbf{Y}^{{\dagger }} \bigr)^{{\dagger }} \bigl( \mathbf{X}\mathbf{X}^{{\dagger }}- \mathbf{Y} \mathbf{Y}^{{ \dagger }} \bigr) \bigr) \\ &\quad = \frac{1}{2q} \Bigl( \operatorname{tr} \bigl(\mathbf{X} \mathbf{X}^{{\dagger }} \mathbf{X}\mathbf{X}^{{\dagger }} \bigr) + \operatorname{tr} \bigl(\mathbf{Y} \mathbf{Y}^{{\dagger }}\mathbf{Y} \mathbf{Y}^{{\dagger }} \bigr) - \operatorname{tr} \bigl(\mathbf{X} \mathbf{X}^{{\dagger }}\mathbf{Y}\mathbf{Y}^{{ \dagger }} \bigr) - \operatorname{tr} \bigl( \mathbf{Y}\mathbf{Y}^{{\dagger }} \mathbf{X} \mathbf{X}^{{\dagger }} \bigr) \Bigr) \\ &\quad = \frac{1}{2q} \bigl( \operatorname{tr} \bigl(\mathbf{X} \mathbf{X}^{{ \dagger }} \bigr) + \operatorname{tr} \bigl(\mathbf{Y} \mathbf{Y}^{{\dagger }} \bigr) - 2 \operatorname{tr} \bigl( \mathbf{Y}^{{\dagger }}\mathbf{X}\mathbf{X}^{{\dagger }} \mathbf{Y} \bigr) \bigr) \\ & \qquad \bigl(\text{since } \mathbf{X}^{{\dagger }}\mathbf{X} = \mathbf{Y}^{{\dagger }} \mathbf{Y} = I \bigr) \\ &\quad = 1 - \frac{1}{q} \bigl\Vert \mathbf{X}^{{\dagger }} \mathbf{Y} \bigr\Vert _{F}^{2} \\ & \qquad \Biggl(\text{since } \operatorname{tr} \bigl( \mathbf{X} \mathbf{X}^{{\dagger }} \bigr) = \operatorname{tr} \bigl( \mathbf{X}^{{\dagger }} \mathbf{X} \bigr) = \sum_{j=1}^{q} \mathbf{x}_{j}^{{\dagger }}\mathbf{x}_{j} = q,\text{ and likewise for }\mathbf{Y} \Biggr) \\ &\quad = 1 - \frac{1}{q} \sum_{j=1}^{q} \sum_{k=1}^{q} \bigl\vert \mathbf{x}_{j}^{{\dagger }}\mathbf{y}_{k} \bigr\vert ^{2}. \end{aligned}$$
  2. 2

    Note that \([\mathbf{X}, \mathbf{X}^{\perp}]\) is an \(n \times n\) unitary matrix with columns being the vectors of the orthonormal basis \(\{\mathbf{x}_{i}\}_{i=1,2,\ldots ,n}\). Thus, \([\mathbf{X}, \mathbf{X}^{\perp}] [\mathbf{X}, \mathbf{X}^{\perp}]^{{ \dagger }}= \mathbf{X} \mathbf{X}^{{\dagger }}+ \mathbf{X}^{\perp}{ \mathbf{X}^{\perp}}^{{\dagger }}= I\). Thus,

    $$\begin{aligned} \bigl( d_{\mathrm{{sp}}} ( X, Y ) \bigr)^{2} & = 1 - \frac{1}{q} \bigl\Vert \mathbf{X}^{{\dagger }}\mathbf{Y} \bigr\Vert _{F}^{2} \\ & = 1 - \frac{1}{q} \operatorname{tr} \bigl( \mathbf{Y}^{{\dagger }} \mathbf{X} \mathbf{X}^{{\dagger }}\mathbf{Y} \bigr) \\ & = 1 - \frac{1}{q} \operatorname{tr} \bigl( \mathbf{Y}^{{\dagger }} \bigl(I - \mathbf{X}^{\perp}{\mathbf{X}^{\perp}}^{{\dagger }} \bigr) \mathbf{Y} \bigr) \\ & = 1 - \frac{1}{q} \operatorname{tr} \bigl( \mathbf{Y}^{{\dagger }} \mathbf{Y} \bigr) + \frac{1}{q} \operatorname{tr} \bigl( \mathbf{Y}^{{ \dagger }}\mathbf{X}^{\perp}{\mathbf{X}^{\perp}}^{{\dagger }} \mathbf{Y} \bigr) \\ & = 1 - \frac{1}{q} q + \frac{1}{q} \operatorname{tr} \bigl( \mathbf{Y}^{{ \dagger }}\mathbf{X}^{\perp}{\mathbf{X}^{\perp}}^{{\dagger }} \mathbf{Y} \bigr) \\ & = \frac{1}{q} \bigl\Vert {\mathbf{X}^{\perp}}^{{\dagger }} \mathbf{Y} \bigr\Vert _{F}^{2} = \frac{1}{q} \sum_{j=q+1}^{n} \sum _{k=1}^{q} \bigl\vert \mathbf{x}_{j}^{{ \dagger }} \mathbf{y}_{k} \bigr\vert ^{2}. \end{aligned}$$

 □
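The equivalence of the three expressions for \(d_{\mathrm{sp}}\) can be checked numerically. The sketch below is only an illustration under assumed dimensions (\(n=6\), \(q=2\)) and randomly generated subspaces; the helper name `random_orthonormal_split` is hypothetical.

```python
import numpy as np

def random_orthonormal_split(n, q, rng):
    """Return orthonormal bases (as columns) of a random q-dimensional
    subspace of C^n and of its orthogonal complement."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    W, _ = np.linalg.qr(Z)          # columns of W form an orthonormal basis of C^n
    return W[:, :q], W[:, q:]

rng = np.random.default_rng(1)
n, q = 6, 2
X, X_perp = random_orthonormal_split(n, q, rng)
Y, _ = random_orthonormal_split(n, q, rng)

# Definition (1): scaled Frobenius norm of the difference of projectors.
d_def = np.linalg.norm(X @ X.conj().T - Y @ Y.conj().T, "fro") / np.sqrt(2 * q)
# Lemma 1.1: in terms of X^dagger Y.
d_form1 = np.sqrt(1 - np.linalg.norm(X.conj().T @ Y, "fro") ** 2 / q)
# Lemma 1.2: in terms of (X^perp)^dagger Y.
d_form2 = np.sqrt(np.linalg.norm(X_perp.conj().T @ Y, "fro") ** 2 / q)

print(d_def, d_form1, d_form2)
```

The three printed values should agree up to floating-point round-off.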

Lemma 2

(Properties of \(d_{\mathrm{sp}}\))

  1. 1

    The value of \(d_{\mathrm{{sp}}} ( X, Y )\) is independent of the choice of basis on X or Y (or the basis on \(X^{\perp}\) or \(Y^{\perp}\), if using the equivalent form in Lemma 1.2).

  2. 2

    \(d_{\mathrm{{sp}}}\) is a metric on \(\operatorname{Gr}(q, \mathbb{C}^{n})\) (the space of q-dimensional complex subspaces of \(\mathbb{C}^{n}\)).

  3. 3

    \(\sqrt{q} d_{\mathrm{{sp}}} ( X, Y ) = \sqrt{n-q} d_{\mathrm{{sp}}} ( X^{\perp}, Y^{\perp}) \).

  4. 4

    \(d_{\mathrm{{sp}}} ( X, Y ) \leq 1\), with equality holding iff X and Y are orthogonal subspaces (which is possible only if \(q \leq n/2\)).

Proof

  1. 1

    Suppose that \(\{\mathbf{x}'_{j}\}_{j=1,2,\ldots ,q}\) and \(\{\mathbf{y}'_{j}\}_{j=1,2,\ldots ,q}\) are different orthonormal bases on X and Y respectively. Define \(\mathbf{X}' = [\mathbf{x}'_{1}, \mathbf{x}'_{2}, \ldots , \mathbf{x}'_{q}]\), \(\mathbf{Y}' = [\mathbf{y}'_{1}, \mathbf{y}'_{2}, \ldots , \mathbf{y}'_{q}]\). Thus there exist \(q\times q\) unitary matrices \(R_{X}, R_{Y} \in U(q)\) such that \(\mathbf{X} = \mathbf{X}' R_{X}\) and \(\mathbf{Y} = \mathbf{Y}' R_{Y}\). Then, since \(R_{X} R_{X}^{{\dagger }} = R_{Y} R_{Y}^{{\dagger }} = I\),

    $$\begin{aligned} & \bigl( d_{\mathrm{sp}} (X, Y) \bigr)^{2} \\ &\quad = \frac{1}{2q} \bigl\Vert \mathbf{X}\mathbf{X}^{{\dagger }}- \mathbf{Y} \mathbf{Y}^{{\dagger }} \bigr\Vert _{F}^{2} \\ &\quad = \frac{1}{2q} \bigl\Vert \bigl(\mathbf{X}' R_{X} \bigr) \bigl( \mathbf{X}' R_{X} \bigr)^{{\dagger }}- \bigl(\mathbf{Y}' R_{Y} \bigr) \bigl(\mathbf{Y}' R_{Y} \bigr)^{{\dagger }} \bigr\Vert _{F}^{2} \\ & \quad = \frac{1}{2q} \bigl\Vert \mathbf{X}' { \mathbf{X}'}^{{\dagger }}- \mathbf{Y}' { \mathbf{Y}'}^{{\dagger }} \bigr\Vert _{F}^{2}. \end{aligned}$$

    For the equivalent form in Lemma 1.2 we can use the orthonormal bases \(\{\mathbf{x}'_{j}\}_{j=q+1,q+2,\ldots ,n}\) and \(\{\mathbf{y}'_{k}\}_{k=q+1,q+2,\ldots ,n}\) for \(X^{\perp}\) and \(Y^{\perp}\) respectively and analogously arrive at the equivalent form using the primed bases.

  2. 2

    Nonnegativity and symmetry properties are obvious from the definition of \(d_{\mathrm{sp}}\).

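    If X and Y are the same subspace, we can choose the same basis for both (since the value of \(d_{\mathrm{sp}} (X, Y)\) is independent of the choice of basis on X and Y), which gives \(d_{\mathrm{sp}} (X, Y) = 0\). Conversely, if \(d_{\mathrm{sp}} (X, Y) = 0\), then \(\mathbf{X}\mathbf{X}^{{\dagger }} = \mathbf{Y}\mathbf{Y}^{{\dagger }}\); since these are the orthogonal projections onto X and Y, their ranges coincide, so \(X = Y\).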

    Triangle inequality holds due to the fact that Frobenius norm of the difference of matrices is a metric on \(\mathbb{C}^{n\times n}\).

  3. 3

    Note that \(X^{\perp}\) and \(Y^{\perp}\) are \((n-q)\)-dimensional subspaces of \(\mathbb{C}^{n}\). Furthermore, X is the orthogonal complement of \(X^{\perp}\). As a consequence, due to Lemma 1.2,

    $$\begin{aligned} d_{\mathrm{sp}} \bigl(X^{\perp}, Y^{\perp}\bigr) & = \sqrt{ \frac{1}{n-q} } \bigl\Vert {\mathbf{X}}^{{\dagger }} \mathbf{Y}^{\perp}\bigr\Vert _{F} \\ & = \sqrt{ \frac{1}{n-q} } \bigl\Vert {\mathbf{Y}^{\perp}}^{{\dagger }} \mathbf{X} \bigr\Vert _{F} \quad \bigl(\text{since } \Vert \mathbf{A} \Vert _{F} = \bigl\Vert \mathbf{A}^{{\dagger }} \bigr\Vert _{F}. \bigr) \\ & = \sqrt{\frac{1}{n-q}} \sqrt{q} d_{\mathrm{sp}} (Y, X) = \sqrt{ \frac{q}{n-q}} d_{\mathrm{sp}} (X, Y). \end{aligned}$$
  4. 4

    The last property is obvious from the result of Lemma 1.1.

 □
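The properties in Lemma 2 can likewise be spot-checked on random subspaces. The following sketch is illustrative only: the dimensions (\(n=7\), \(q=3\)), the random seed, and the helper names `d_sp` and `random_unitary` are assumptions.

```python
import numpy as np

def d_sp(X, Y):
    """Subspace distance (1) from orthonormal bases stored as columns."""
    q = X.shape[1]
    diff = X @ X.conj().T - Y @ Y.conj().T
    return np.linalg.norm(diff, "fro") / np.sqrt(2 * q)

def random_unitary(m, rng):
    Z = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    W, _ = np.linalg.qr(Z)
    return W

rng = np.random.default_rng(2)
n, q = 7, 3
Wx, Wy, Wz = (random_unitary(n, rng) for _ in range(3))
X, X_perp = Wx[:, :q], Wx[:, q:]
Y, Y_perp = Wy[:, :q], Wy[:, q:]
Z = Wz[:, :q]

# Property 1: changing the orthonormal basis of X leaves the distance unchanged.
R = random_unitary(q, rng)
d_xy = d_sp(X, Y)
d_xy_rotated = d_sp(X @ R, Y)

# Property 2 (triangle inequality) on three random subspaces.
triangle_slack = d_sp(X, Z) + d_sp(Z, Y) - d_xy

# Property 3: sqrt(q) d(X, Y) = sqrt(n - q) d(X^perp, Y^perp).
lhs = np.sqrt(q) * d_xy
rhs = np.sqrt(n - q) * d_sp(X_perp, Y_perp)

# Property 4: mutually orthogonal subspaces (possible since q <= n/2) are at distance 1.
I = np.eye(n)
d_orth = d_sp(I[:, :q], I[:, q:2 * q])

print(d_xy, d_xy_rotated, triangle_slack, lhs, rhs, d_orth)
```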

2.2 Some results involving set distances

In this section we provide some geometry results that will be used in Sect. 3.3 for computing the upper bounds on the perturbation of invariant subspaces in terms of the spectrum of the unperturbed matrix only. For the purpose of this paper and for simplicity, we consider only closed subsets of metric spaces in the following lemmas, although all these results can potentially be generalized for subsets that are open or/and closed in the metric space.

Definition 3

Given closed subsets A, B of a metric space \((\Psi ,d)\), we define the following:

  1. 1

    Separation between the sets

    $$ \operatorname{sep}(A,B) = \min_{a\in A,\atop b\in B} d(a,b); $$
  2. 2

    Hausdorff distance between the sets

    $$ d_{H}(A,B) = \max \Bigl( \max_{a\in A} \min _{b\in B} d(a,b) , \max_{b\in B} \min _{a\in A} d(a,b) \Bigr); $$
  3. 3

    Diameter of a set

    $$ \operatorname{diam}(A) = \max_{a\in A,\atop a'\in A} d \bigl(a,a' \bigr). $$
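For finite subsets of the complex plane (the setting in which these quantities are later applied to spectra), the three definitions reduce to simple operations on a matrix of pairwise distances. The sketch below assumes the Euclidean metric \(d(a,b) = |a-b|\) on \(\mathbb{C}\); the function names and the sample sets are illustrative choices.

```python
import numpy as np

def pairwise(A, B):
    """Matrix of distances |a - b| for a in A (rows) and b in B (columns)."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    return np.abs(A[:, None] - B[None, :])

def sep(A, B):
    """Separation: smallest distance between a point of A and a point of B."""
    return pairwise(A, B).min()

def hausdorff(A, B):
    """Hausdorff distance: larger of the two one-sided max-min distances."""
    D = pairwise(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def diam(A):
    """Diameter: largest distance between two points of A."""
    return pairwise(A, A).max()

A = [0.0, 1.0]
B = [3.0, 3.0 + 4.0j]
print(sep(A, B), hausdorff(A, B), diam(B))
```

Here the closest pair is \((1, 3)\), the diameter of B is \(|4i| = 4\), and the Hausdorff distance is attained at the point \(3+4i\), whose nearest neighbour in A is 1.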

Lemma 3

If \((\Psi ,d)\) is a metric space, then for any closed subsets \(P, Q, R \subseteq \Psi \),

$$ \operatorname{sep}(P,Q) \leq \operatorname{sep}(P,R) + \operatorname{sep}(R,Q) + \operatorname{diam}(R). $$
(4)

Proof

Let \((p^{*},r_{1}) \in \arg \min_{p\in P, r\in R} d(p,r)\) (that is, \(p^{*}\in P\), \(r_{1} \in R\) are a pair of points such that \(d(p^{*},r_{1}) = \min_{p\in P, r\in R} d(p,r) = \operatorname{sep}(P,R)\)). Likewise, let \((q^{*},r_{2}) \in \arg \min_{q\in Q, r\in R} d(q,r)\) (that is, \(d(q^{*}, r_{2}) = \operatorname{sep}(R,Q)\)). Then

$$\begin{aligned} \operatorname{sep}(P,Q) &\leq d \bigl(p^{*},q^{*} \bigr) \quad \Bigl(\text{since } \operatorname{sep}(P,Q) = \min _{p\in P,\atop q\in Q} d(p,q) \Bigr) \\ &\leq d \bigl(p^{*},r_{1} \bigr) + d \bigl(r_{1},q^{*} \bigr) \quad \text{(triangle inequality)} \\ &= \operatorname{sep}(P,R) + d \bigl(r_{1},q^{*} \bigr) \\ &\leq \operatorname{sep}(P,R) + d(r_{1},r_{2}) + d \bigl(q^{*}, r_{2} \bigr) \quad \text{(triangle inequality)} \\ &= \operatorname{sep}(P,R) + \operatorname{sep}(R,Q) + d(r_{1},r_{2}) \\ &\leq \operatorname{sep}(P,R) + \operatorname{sep}(R,Q) + \operatorname{diam}(R). \end{aligned}$$
(5)

 □
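A small example shows why the diameter term in (4) cannot be dropped: with \(P=\{0\}\), \(Q=\{10\}\) and \(R=\{0,10\}\), both separations on the right-hand side vanish and the inequality holds with equality through \(\operatorname{diam}(R)\) alone. The sketch below checks this case and a batch of random finite sets in \(\mathbb{C}\); the sets, sizes and seed are illustrative assumptions.

```python
import numpy as np

def pairwise(A, B):
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    return np.abs(A[:, None] - B[None, :])

def sep(A, B):
    return pairwise(A, B).min()

def diam(A):
    return pairwise(A, A).max()

# Equality case: R touches both P and Q, so only diam(R) contributes.
P, Q, R = [0.0], [10.0], [0.0, 10.0]
lhs = sep(P, Q)
rhs = sep(P, R) + sep(R, Q) + diam(R)

# Random finite subsets of the complex plane.
rng = np.random.default_rng(3)
min_slack = np.inf
for _ in range(1000):
    P_r = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    Q_r = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    R_r = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    slack = sep(P_r, R_r) + sep(R_r, Q_r) + diam(R_r) - sep(P_r, Q_r)
    min_slack = min(min_slack, slack)

print(lhs, rhs, min_slack)
```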

Lemma 4

If \((\Psi ,d)\) is a connected path metric space, then for any closed subsets \(P, Q, \widetilde{Q} \subseteq \Psi \),

$$ \operatorname{sep}(P,Q) \leq \operatorname{sep}(P, \widetilde{Q}) + d_{H}( \widetilde{Q}, Q). $$
(6)

Proof

Let \((p_{0},q^{*}) \in \arg \min_{p\in P, q\in Q} d(p,q)\) (that is, \(p_{0}\in P\), \(q^{*} \in Q\) are a pair of points such that \(d(p_{0},q^{*}) = \min_{p\in P, q\in Q} d(p,q)\)) – see Fig. 2. Likewise, let \((p_{1},\widetilde{q}^{*}) \in \arg \min_{p\in P, q'\in \widetilde{Q}} d(p,q')\). Furthermore, let \(\widetilde{q}^{\dagger}\in \arg \min_{q' \in \widetilde{Q}} d(q^{*}, q')\) and \(q^{\dagger}\in \arg \min_{q \in Q} d(q, \widetilde{q}^{*})\).

Figure 2: Illustration for the proof of Lemma 4

Consider the shortest path \(\gamma :[0,1]\rightarrow \Psi \) connecting \(q^{*}\) and \(\widetilde{q}^{\dagger}\) and parameterized by the normalized distance from \(q^{*}\), so that \(\gamma (0) = q^{*}\), \(\gamma (1) = \widetilde{q}^{\dagger}\) and

$$ d \bigl(q^{*},\gamma (u) \bigr) = u d \bigl(q^{*},\widetilde{q}^{\dagger}\bigr). $$
(7)

Likewise, let \(\mu :[0,1]\rightarrow \Psi \) be the shortest path connecting \(q^{\dagger}\) and \(\widetilde{q}^{*}\), parameterized by the normalized distance from \(q^{\dagger}\), so that \(\mu (0) = q^{\dagger}\), \(\mu (1) = \widetilde{q}^{*}\) and \(d(q^{\dagger},\mu (u)) = u d(q^{\dagger},\widetilde{q}^{*})\). Consequently, since \(\mu (u)\) is a point on the shortest path connecting \(q^{\dagger}\) and \(\widetilde{q}^{*}\), we have

$$ d \bigl(\mu (u),\widetilde{q}^{*} \bigr) = d \bigl(q^{\dagger},\widetilde{q}^{*} \bigr) - d \bigl(q^{\dagger},\mu (u) \bigr) = (1-u) d \bigl(q^{\dagger}, \widetilde{q}^{*} \bigr). $$
(8)

Define \(f:[0,1]\rightarrow \mathbb{R}\) as \(f(t) = d(p_{0}, \gamma (t))\), and \(g:[0,1]\rightarrow \mathbb{R}\) as \(g(t) = d(p_{1}, \mu (t))\). It is easy to note that both f and g are continuous.

As a consequence, we have the following:

$$\begin{aligned} &f(0) = d \bigl(p_{0}, q^{*} \bigr) = \min _{p\in P, \atop q\in Q} d(p,q) \leq d \bigl(p_{1}, q^{\dagger}\bigr) = g(0), \\ &g(1) = d \bigl(p_{1}, \widetilde{q}^{*} \bigr) = \min _{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) \leq d \bigl(p_{0}, \widetilde{q}^{\dagger}\bigr) = f(1). \end{aligned}$$

Thus, by the intermediate value theorem applied to the continuous function \(g - f\) (which is nonnegative at 0 and nonpositive at 1), there exists \(u\in [0,1]\) such that \(f(u)=g(u)\). That is,

$$ d \bigl(p_{0}, \gamma (u) \bigr) = d \bigl(p_{1}, \mu (u) \bigr) \quad \text{for some }u\in [0,1]. $$
(9)

Using this, we have

$$\begin{aligned} \min_{p\in P, \atop q\in Q} d(p,q) & = d \bigl(p_{0},q^{*} \bigr) \\ & \leq d \bigl(p_{0},\gamma (u) \bigr) + d \bigl(q^{*}, \gamma (u) \bigr) \quad \text{(triangle inequality)} \\ & = d \bigl(p_{1}, \mu (u) \bigr) + d \bigl(q^{*}, \gamma (u) \bigr) \quad \text{(using (9))} \\ & \leq d \bigl(p_{1}, \widetilde{q}^{*} \bigr) + d \bigl( \mu (u), \widetilde{q}^{*} \bigr) + d \bigl(q^{*}, \gamma (u) \bigr) \quad \text{(triangle inequality)} \\ & = \min_{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) + d \bigl( \mu (u), \widetilde{q}^{*} \bigr) + d \bigl(q^{*},\gamma (u) \bigr) \\ & = \min_{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) + (1-u) d \bigl(q^{\dagger},\widetilde{q}^{*} \bigr) + u d \bigl(q^{*},\widetilde{q}^{\dagger}\bigr) \quad \text{(using (7) and (8))} \\ & \leq \min_{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) + \max \bigl( d \bigl(q^{\dagger},\widetilde{q}^{*} \bigr) , d \bigl(q^{*},\widetilde{q}^{\dagger}\bigr) \bigr) \\ & = \min_{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) + \max \Bigl( \min_{q \in Q} d \bigl(q, \widetilde{q}^{*} \bigr) , \min_{q' \in \widetilde{Q}} d \bigl(q^{*}, q' \bigr) \Bigr) \\ & \quad \bigl(\text{definitions of }q^{\dagger}\text{ and } \widetilde{q}^{\dagger}\bigr) \\ & \leq \min_{p\in P, \atop q'\in \widetilde{Q}} d \bigl(p,q' \bigr) + \max \Bigl( \max_{q' \in \widetilde{Q}} \min_{q\in Q} d \bigl(q,q' \bigr) , \max_{q\in Q} \min _{q' \in \widetilde{Q}} d \bigl(q,q' \bigr) \Bigr) \\ & = \operatorname{sep}(P,\widetilde{Q}) + d_{H}(\widetilde{Q}, Q). \end{aligned}$$

 □
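Inequality (6) can be spot-checked in the complex plane with the Euclidean metric, which is a connected path metric space. The sketch below draws random finite sets P and Q, forms \(\widetilde{Q}\) by perturbing the points of Q, and records the smallest observed slack; the set sizes, perturbation scale and seed are illustrative assumptions.

```python
import numpy as np

def pairwise(A, B):
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    return np.abs(A[:, None] - B[None, :])

def sep(A, B):
    return pairwise(A, B).min()

def hausdorff(A, B):
    D = pairwise(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

rng = np.random.default_rng(4)
min_slack = np.inf
for _ in range(1000):
    P = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    Q = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    # Q_t plays the role of Q-tilde: a perturbed copy of Q.
    Q_t = Q + 0.3 * (rng.standard_normal(5) + 1j * rng.standard_normal(5))
    # Slack of (6): sep(P, Q_t) + d_H(Q_t, Q) - sep(P, Q) should be >= 0.
    slack = sep(P, Q_t) + hausdorff(Q_t, Q) - sep(P, Q)
    min_slack = min(min_slack, slack)

print(min_slack)
```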

Lemma 5

Suppose that P, Q, \(\widetilde{R}\) are closed subsets of a metric space \((\Psi ,d)\) such that

$$ \max_{r'\in \widetilde{R}} \min_{s\in P\cup Q} d \bigl(s,r' \bigr) + d_{H} (P\cup Q, \widetilde{R}) < \operatorname{sep} (P, Q) $$
(10)

Define (see Fig. 3) \(\widetilde{P}, \widetilde{Q} \subseteq \widetilde{R}\) such that

$$ \begin{gathered} \widetilde{P} = \Bigl\{ r' \in \widetilde{R} \big| \min_{s\in P\cup Q} d \bigl(s,r' \bigr) = \min _{p\in P} d \bigl(p,r' \bigr) \Bigr\} \quad \textit{and} \\ \widetilde{Q} = \Bigl\{ r' \in \widetilde{R} \big| \min _{s\in P\cup Q} d \bigl(s,r' \bigr) = \min _{q\in Q} d \bigl(q,r' \bigr) \Bigr\} . \end{gathered} $$
(11)

Figure 3: Illustration for Lemma 5

Then

  1. 1

    \(\{\widetilde{P}, \widetilde{Q}\}\) constitutes a partition of \(\widetilde{R}\).

  2. 2

    \(\arg \min_{s \in P \cup Q} d(s,p') \subseteq P\), \(\forall p' \in \widetilde{P}\), and \(\arg \min_{s \in P \cup Q} d(s,q') \subseteq Q\), \(\forall q' \in \widetilde{Q}\). (Consequently, \(\min_{s \in P \cup Q} d(s,p') = \min_{s \in P} d(s,p')\), \(\forall p' \in \widetilde{P}\), and \(\min_{s \in P \cup Q} d(s,q') = \min_{s \in Q} d(s,q')\), \(\forall q' \in \widetilde{Q}\).)

  3. 3

    \(\arg \min_{r' \in \widetilde{R}} d(p,r') \subseteq \widetilde{P}\), \(\forall p \in P\), and \(\arg \min_{r' \in \widetilde{R}} d(q,r') \subseteq \widetilde{Q}\), \(\forall q \in Q\). (Consequently, \(\min_{r' \in \widetilde{R}} d(p,r') = \min_{r' \in \widetilde{P}} d(p,r')\), \(\forall p \in P\), and \(\min_{r' \in \widetilde{R}} d(q,r') = \min_{r' \in \widetilde{Q}} d(q,r')\), \(\forall q \in Q\).)

  4. 4

    \(d_{H}(P,\widetilde{P}) \leq d_{H}(P\cup Q, \widetilde{R})\), \(d_{H}(Q,\widetilde{Q}) \leq d_{H}(P\cup Q,\widetilde{R})\), and \(\max ( d_{H}(P,\widetilde{P}) , d_{H}(Q,\widetilde{Q}) ) = d_{H}(P\cup Q, \widetilde{R})\).

  5. 5

    If \((\Psi ,d)\) is a connected path metric space, then \(\operatorname{sep}(\widetilde{P}, \widetilde{Q}) \geq \operatorname{sep} (P, Q) - 2 d_{H}(P\cup Q, \widetilde{R}) \).

If condition (10) holds, we say that “\(\widetilde{R}\) is a separation-preserving perturbation of P and Q” and call \(\{\widetilde{P}, \widetilde{Q}\}\) the “separation-preserving partition of \(\widetilde{R}\)”.
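For finite sets in the complex plane, condition (10) and the sets in (11) are straightforward to compute. The sketch below uses hand-picked sets P, Q and \(\widetilde{R}\) (all assumed for illustration, as are the function and variable names) and checks items 1, 4 and 5 of the lemma on that instance.

```python
import numpy as np

def pairwise(A, B):
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    return np.abs(A[:, None] - B[None, :])

def sep(A, B):
    return pairwise(A, B).min()

def hausdorff(A, B):
    D = pairwise(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

P = np.array([0.0, 1.0], dtype=complex)
Q = np.array([10.0, 11.0 + 1.0j])
R_t = np.array([0.2, 0.9 + 0.1j, 10.3, 10.8 + 0.9j])   # plays the role of R-tilde
PQ = np.concatenate([P, Q])

D = pairwise(R_t, PQ)                 # D[r', s] for s in P ∪ Q
nearest = D.min(axis=1)               # min over s of d(s, r') for each r'
d_H = hausdorff(PQ, R_t)

# Condition (10).
condition_10 = nearest.max() + d_H < sep(P, Q)

# Definition (11): r' goes to P-tilde (Q-tilde) if its nearest point lies in P (Q).
in_P = nearest == D[:, :len(P)].min(axis=1)
in_Q = nearest == D[:, len(P):].min(axis=1)
P_t, Q_t = R_t[in_P], R_t[in_Q]

print("condition (10):", condition_10)
print("P-tilde:", P_t, "Q-tilde:", Q_t)
print("sep(P_t, Q_t):", sep(P_t, Q_t), ">=", sep(P, Q) - 2 * d_H)
```

Each point of \(\widetilde{R}\) lands in exactly one of the two sets here, as item 1 asserts when (10) holds.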

Proof

1. We first prove that \(\{\widetilde{P}, \widetilde{Q}\}\) constitutes a partition of \(\widetilde{R}\).

Proof for \(\widetilde{P} \cup \widetilde{Q} = \widetilde{R}\): For fixed \(r'\in \widetilde{R}\), an element of \(\arg \min_{s\in P \cup Q} d(s,r')\) is either in P or in Q. In the former case the point \(r'\) will belong to \(\widetilde{P}\), while in the latter case it will belong to \(\widetilde{Q}\) (with the possibility that it belongs to both) due to Definition (11). Thus every point \(r'\in \widetilde{R}\) belongs to \(\widetilde{P}\) or \(\widetilde{Q}\).

Proof for \(\widetilde{P} \cap \widetilde{Q} = \emptyset \): We prove this by contradiction. If possible, let \(\rho ' \in \widetilde{P} \cap \widetilde{Q}\). Since \(\rho ' \in \widetilde{P}\), due to Definition (11), there exists \(p_{1} \in P\) such that \(\min_{s\in P\cup Q} d(s,\rho ') = d(p_{1},\rho ')\). Likewise, there exists \(q_{1} \in Q\) such that \(\min_{s\in P\cup Q} d(s,\rho ') = d(q_{1},\rho ')\). Thus,

$$\begin{aligned} & \begin{aligned} 2 \min_{s\in P\cup Q} d \bigl(s,\rho ' \bigr) & = d \bigl(p_{1},\rho ' \bigr) + d \bigl(q_{1},\rho ' \bigr) \\ & \geq d(p_{1},q_{1}) \quad \text{(triangle inequality)} \\ & \geq \min_{p\in P,\atop q\in Q} d(p,q) \quad (\text{since } p_{1}\in P, q_{1}\in Q) \end{aligned} \\ & \quad \Rightarrow \quad 2 \max_{r'\in \widetilde{R}} \min _{s \in P\cup Q} d \bigl(s,r' \bigr) \geq \min _{p\in P,\atop q\in Q} d(p,q) \\ &\quad \Rightarrow \quad \max_{r'\in \widetilde{R}} \min _{s\in P \cup Q} d \bigl(s,r' \bigr) + d_{H} (P\cup Q, \widetilde{R}) \geq \operatorname{sep}(P,Q). \end{aligned}$$

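The last implication holds because \(d_{H} (P\cup Q, \widetilde{R}) \geq \max_{r'\in \widetilde{R}} \min_{s\in P\cup Q} d(s,r')\) by the definition of the Hausdorff distance. This contradicts assumption (10) of the lemma. Hence there cannot exist a \(\rho ' \in \widetilde{P} \cap \widetilde{Q}\). Thus \(\widetilde{P} \cap \widetilde{Q} = \emptyset \).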

2. We next prove \(\arg \min_{s \in P \cup Q} d(s,p') \subseteq P\), \(\forall p' \in \widetilde{P}\). We do this by contradiction.

If possible, suppose that there exists \(p' \in \widetilde{P}\) such that \(\arg \min_{s \in P \cup Q} d(s,p') \nsubseteq P\). Then there exists \(q\in Q\) such that \(\min_{s \in P \cup Q} d(s,p') = d(q,p')\). But \(d(q,p') \geq \min_{s\in Q} d(s,p') \geq \min_{s \in P \cup Q} d(s,p')\). This implies \(\min_{s \in P \cup Q} d(s,p') = \min_{s\in Q} d(s,p')\). Due to the definition of \(\widetilde{Q}\) in (11) this implies \(p'\in \widetilde{Q}\). However, we have already shown that \(\widetilde{P} \cap \widetilde{Q} = \emptyset \). This leads to a contradiction. Thus \(\arg \min_{s \in P \cup Q} d(s,p') \subseteq P\), \(\forall p' \in \widetilde{P}\).

Likewise, we can prove \(\arg \min_{s \in P \cup Q} d(s,q') \subseteq Q\), \(\forall q' \in \widetilde{Q}\).

3. We next prove \(\arg \min_{r' \in \widetilde{R}} d(p,r') \subseteq \widetilde{P}\), \(\forall p \in P\). We do this by contradiction.

If possible, suppose that there exists \(p_{3}\in P\) such that \(\arg \min_{r' \in \widetilde{R}} d(p_{3},r') \nsubseteq \widetilde{P}\). Then there exists \(\rho ' \in \widetilde{Q}\) such that \(\min_{r' \in \widetilde{R}} d(p_{3},r') = d(p_{3},\rho ')\).

Again, due to the definition of \(\widetilde{Q}\) in (11), for any \(\rho ' \in \widetilde{Q}\), there exists \(q_{3}\in Q\) such that \(d(q_{3},\rho ') = \min_{s\in P\cup Q} d(s, \rho ')\).

Thus,

$$\begin{aligned} & \begin{aligned} \min_{r' \in \widetilde{R}} d \bigl(p_{3},r' \bigr) + \min_{s\in P \cup Q} d \bigl(s,\rho ' \bigr) & = d \bigl(p_{3},\rho ' \bigr) + d \bigl(q_{3},\rho ' \bigr) \\ & \geq d(p_{3},q_{3}) \quad \text{(triangle inequality)} \\ & \geq \min_{p\in P,\atop q\in Q} d(p,q) \quad (\text{since } p_{3}\in P , q_{3}\in Q ) \end{aligned} \\ & \quad \Rightarrow \quad \max_{s\in P\cup Q} \min _{r' \in \widetilde{R}} d \bigl(s,r' \bigr) + \max _{r' \in \widetilde{R}} \min_{s\in P \cup Q} d \bigl(s,r' \bigr) \geq \min_{p\in P,\atop q\in Q} d(p,q) \\ &\quad \Rightarrow \quad d_{H} (P\cup Q, \widetilde{R}) + \max _{r'\in \widetilde{R}} \min_{s\in P\cup Q} d \bigl(s,r' \bigr) \geq \operatorname{sep}(P,Q). \end{aligned}$$

This contradicts assumption (10) of the lemma. Hence there cannot exist \(p_{3}\in P\) such that \(\arg \min_{r' \in \widetilde{R}} d(p_{3},r') \nsubseteq \widetilde{P}\). Thus \(\arg \min_{r' \in \widetilde{R}} d(p,r') \subseteq \widetilde{P}\), \(\forall p \in P\).

Likewise, we can prove \(\arg \min_{r' \in \widetilde{R}} d(q,r') \subseteq \widetilde{Q}\), \(\forall q \in Q\).

4. Since \(\arg \min_{s \in P \cup Q} d(s,p') \subseteq P\), \(\forall p' \in \widetilde{P}\), we have \(\min_{s \in P \cup Q} d(s,p') = \min_{p \in P} d(p,p')\), \(\forall p' \in \widetilde{P}\). Thus, \(\max_{p'\in \widetilde{P}} \min_{p \in P} d(p,p') = \max_{p'\in \widetilde{P}} \min_{s \in P \cup Q} d(s,p') \).

Likewise, since \(\arg \min_{r' \in \widetilde{R}} d(p,r') \subseteq \widetilde{P}\), \(\forall p \in P\), we have \(\max_{p \in P} \min_{p'\in \widetilde{P}} d(p,p') = \max_{p \in P} \min_{r'\in \widetilde{R}} d(p,r')\).

Thus,

$$\begin{aligned} d_{H}(P,\widetilde{P}) & = \max \Bigl( \max_{p \in P} \min_{p' \in \widetilde{P}} d \bigl(p,p' \bigr) , \max _{p'\in \widetilde{P}} \min_{p \in P} d \bigl(p,p' \bigr) \Bigr) \\ & = \max \Bigl( \max_{p \in P} \min_{r'\in \widetilde{R}} d \bigl(p,r' \bigr) , \max_{p'\in \widetilde{P}} \min _{s \in P \cup Q} d \bigl(s,p' \bigr) \Bigr) \\ & \leq \max \Bigl( \max_{s \in P \cup Q} \min_{r'\in \widetilde{R}} d \bigl(s,r' \bigr) , \max_{r'\in \widetilde{R}} \min _{s \in P \cup Q} d \bigl(s,r' \bigr) \Bigr) \\ & \quad (\text{since } P \subseteq P\cup Q, \widetilde{P} \subseteq \widetilde{R}) \\ & = d_{H}(P\cup Q,\widetilde{R}). \end{aligned}$$
(12)

Similarly, we can show

$$\begin{aligned} d_{H}(Q,\widetilde{Q}) & = \max \Bigl( \max_{q \in Q} \min_{r' \in \widetilde{R}} d \bigl(q,r' \bigr) , \max _{q'\in \widetilde{Q}} \min_{s \in P \cup Q} d \bigl(s,q' \bigr) \Bigr) \\ & \leq d_{H}(P\cup Q,\widetilde{R}). \end{aligned}$$
(13)

Again, from (12) and (13),

$$\begin{aligned} & \max \bigl( d_{H}(P,\widetilde{P}) , d_{H}(Q, \widetilde{Q}) \bigr) \\ & \quad = \max \Bigl( \max_{p \in P} \min_{r'\in \widetilde{R}} d \bigl(p,r' \bigr) , \max_{q \in Q} \min _{r'\in \widetilde{R}} d \bigl(q,r' \bigr) , \\ & \qquad \max_{p'\in \widetilde{P}} \min_{s \in P \cup Q} d \bigl(s,p' \bigr) , \max_{q'\in \widetilde{Q}} \min _{s \in P \cup Q} d \bigl(s,q' \bigr) \Bigr) \\ &\quad = \max \Bigl( \max_{p \in P\cup Q} \min_{r'\in \widetilde{R}} d \bigl(p,r' \bigr) , \max_{p'\in \widetilde{P}\cup \widetilde{Q}} \min _{s \in P \cup Q} d \bigl(s,p' \bigr) \Bigr) \\ &\quad = d_{H}(P\cup Q,\widetilde{R}) \quad (\text{since } \widetilde{P}\cup \widetilde{Q} = \widetilde{R}) \end{aligned}$$

5.

$$\begin{aligned} \operatorname{sep}(\widetilde{P}, \widetilde{Q}) \geq {}& \operatorname{sep} ( \widetilde{P}, Q) - d_{H}(Q, \widetilde{Q}) \quad \text{(using Lemma 4)} \\ \geq{} & \operatorname{sep} (P, Q) - d_{H}(P, \widetilde{P}) - d_{H}(Q, \widetilde{Q}) \quad \text{(using Lemma 4)} \\ \geq {}& \operatorname{sep} (P, Q) - 2 d_{H}(P\cup Q, \widetilde{R}) \\ & {} \bigl(\text{since } d_{H}(P,\widetilde{P}) \leq d_{H}(P\cup Q, \widetilde{R})\text{ and } d_{H}(Q, \widetilde{Q}) \leq d_{H}(P\cup Q,\widetilde{R}) \bigr) \end{aligned}$$

 □

Corollary 1

If P, Q, \(\widetilde{R}\) are closed subsets of a metric space \((\Psi ,d)\) such that \(d_{H} (P\cup Q, \widetilde{R}) < \frac{1}{2} \operatorname{sep} (P, Q)\), then \(\widetilde{R}\) is a separation-preserving perturbation of P and Q.

As a consequence, the separation-preserving partition \(\{\widetilde{P}, \widetilde{Q}\}\) of \(\widetilde{R}\) as defined in (11) satisfies properties ‘1’ to ‘4’ in Lemma 5, as well as property ‘5’ (if \((\Psi ,d)\) is a connected path metric space) with an additional inequality:

$$ \operatorname{sep}(\widetilde{P}, \widetilde{Q}) \geq \operatorname{sep} (P, Q) - 2 d_{H}(P\cup Q, \widetilde{R}) > 0.$$

Proof

The result follows directly from Lemma 5 by observing that

$$ \max_{r'\in \widetilde{R}} \min_{s\in P\cup Q} d \bigl(s,r' \bigr) + d_{H} (P \cup Q, \widetilde{R}) \leq 2 d_{H} (P\cup Q, \widetilde{R}) < \operatorname{sep} (P, Q). $$

 □
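As a small numerical illustration of Corollary 1, the following Python sketch works with finite subsets of the real line under the usual distance. The point sets are purely illustrative, and the helper `partition` encodes our reading of (11), namely that each point of \(\widetilde{R}\) is assigned to \(\widetilde{P}\) or \(\widetilde{Q}\) according to whether its nearest point of \(P\cup Q\) lies in P or in Q; this reading is an assumption made for the example.

```python
def hausdorff(A, B):
    """Hausdorff distance between two finite subsets of the real line."""
    a_to_b = max(min(abs(a - b) for b in B) for a in A)
    b_to_a = max(min(abs(a - b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

def sep(A, B):
    """Smallest distance between a point of A and a point of B."""
    return min(abs(a - b) for a in A for b in B)

def partition(P, Q, R):
    """Split R by whether the nearest point of P u Q lies in P or in Q
    (assumed reading of definition (11))."""
    def near(r, S):
        return min(abs(r - s) for s in S)
    P_t = [r for r in R if near(r, P) < near(r, Q)]
    Q_t = [r for r in R if near(r, Q) < near(r, P)]
    return P_t, Q_t

P, Q = [0.0, 1.0], [5.0, 6.0]      # illustrative, well-separated sets
R_t = [0.2, 1.1, 4.7, 6.3]         # illustrative perturbed set

d_union = hausdorff(P + Q, R_t)
assert d_union < 0.5 * sep(P, Q)   # hypothesis of Corollary 1

P_t, Q_t = partition(P, Q, R_t)
print("partition of R~:", P_t, Q_t)
print("d_H(P,P~), d_H(Q,Q~), d_H(P u Q, R~):",
      hausdorff(P, P_t), hausdorff(Q, Q_t), d_union)
print("sep(P~,Q~) and its lower bound:",
      sep(P_t, Q_t), sep(P, Q) - 2 * d_union)
```

In this example the two Hausdorff distances of the parts are bounded by the Hausdorff distance of the unions, and the separation of the perturbed parts stays above \(\operatorname{sep}(P,Q) - 2 d_{H}(P\cup Q,\widetilde{R})\), as the corollary predicts.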

3 Results on perturbation upper bounds

Throughout this section we use the notations and conventions described in Definition 1.

3.1 Elementary results on spectrum perturbation

In this section we provide some elementary results relating the norm of the matrix perturbation and the perturbation of eigenvalues and eigenvectors.

Lemma 6

Define \(D \in \mathbb{C}^{n\times n}\) such that \(D_{jj'} = (\widetilde{\lambda}_{j'} - \lambda _{j}) \mathbf{u}_{j}^{{ \dagger }}\widetilde{\mathbf{u}}_{j'} \). Then

$$ D = U^{{\dagger }}(\widetilde{M} - M) \widetilde{U}. $$
(14)

Equivalently,

$$ (\widetilde{\lambda}_{j'} - \lambda _{j}) \mathbf{u}_{j}^{{\dagger }} \widetilde{ \mathbf{u}}_{j'} = \mathbf{u}_{j}^{{\dagger }}( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'}, \quad \forall j,j' \in N. $$
(15)

The latter relation in fact holds even when \(\widetilde{M}\) is not normal but \(\widetilde{\mathbf{u}}_{j'}\) is simply a right eigenvector of \(\widetilde{M}\) with the corresponding eigenvalue \(\widetilde{\lambda}_{j'}\).

Proof

First we note that, since M is normal and \(\mathbf{u}_{j}\) is a right eigenvector of M with corresponding eigenvalue \(\lambda _{j}\), \(\mathbf{u}_{j}^{{\dagger }}\) is a left eigenvector of M with the same eigenvalue. Thus,

$$\begin{aligned} \mathbf{u}_{j}^{{\dagger }}(\widetilde{M} - M) \widetilde{ \mathbf{u}}_{j'} & = \mathbf{u}_{j}^{{\dagger }} \widetilde{M} \widetilde{\mathbf{u}}_{j'} - \mathbf{u}_{j}^{{\dagger }}M \widetilde{\mathbf{u}}_{j'} \\ &= \mathbf{u}_{j}^{{\dagger }} \widetilde{\lambda}_{j'} \widetilde{\mathbf{u}}_{j'} - \lambda _{j} \mathbf{u}_{j}^{{ \dagger }}\widetilde{ \mathbf{u}}_{j'} \\ &= (\widetilde{\lambda}_{j'} - \lambda _{j}) \mathbf{u}_{j}^{{\dagger }}\widetilde{ \mathbf{u}}_{j'}. \end{aligned}$$

This proves (15).

We note that if both M and \(\widetilde{M}\) are normal, the L.H.S. of (15) is \(D_{jj'}\) and the R.H.S. is the \((j,j')\)th element of \(U^{{\dagger }}(\widetilde{M} - M) \widetilde{U}\), which proves (14). □
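Identity (14) can be checked numerically on a small example. The sketch below is only an illustration: it builds two 3×3 normal matrices from hand-picked unitary factors and complex eigenvalues (all numerical values and helper names such as `givens` and `normal_matrix` are our own choices), and compares the two sides of (14) entry by entry.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def givens(n, a, b, theta):
    """n x n rotation in the (a, b) plane; real orthogonal, hence unitary."""
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G[a][a] = G[b][b] = math.cos(theta)
    G[a][b], G[b][a] = -math.sin(theta), math.sin(theta)
    return G

def normal_matrix(U, lam):
    """U diag(lam) U^dagger, which is normal whenever U is unitary."""
    n = len(lam)
    L = [[lam[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(U, L), dagger(U))

n = 3
U = matmul(givens(n, 0, 1, 0.3), givens(n, 1, 2, -0.5))   # columns are u_j
lam = [1.0 + 0.5j, -2.0, 0.3j]
Ut = matmul(givens(n, 0, 2, 0.1), U)                       # columns are u~_j'
lam_t = [1.1 + 0.4j, -2.2, 0.1 + 0.3j]
M, Mt = normal_matrix(U, lam), normal_matrix(Ut, lam_t)

E = [[Mt[i][j] - M[i][j] for j in range(n)] for i in range(n)]
rhs = matmul(matmul(dagger(U), E), Ut)                     # U^dagger (M~ - M) U~
overlap = matmul(dagger(U), Ut)                            # entries u_j^dagger u~_j'
D = [[(lam_t[jp] - lam[j]) * overlap[j][jp] for jp in range(n)]
     for j in range(n)]

max_err = max(abs(D[j][jp] - rhs[j][jp]) for j in range(n) for jp in range(n))
print("largest entrywise discrepancy in (14):", max_err)
```

The discrepancy should be at the level of floating-point round-off.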

Corollary 2

$$\begin{aligned} \begin{gathered} \Vert \widetilde{M} - M \Vert ^{2}_{2} \geq \bigl\Vert ( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \bigr\Vert ^{2}_{2} = \sum_{j=1}^{n} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ^{2} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{j'} \bigr\vert ^{2},\quad \forall j' \in N \\ \Vert \widetilde{M} - M \Vert ^{2}_{2} \geq \bigl\Vert ( \widetilde{M} - M) \mathbf{u}_{j} \bigr\Vert ^{2}_{2} = \sum_{j'=1}^{n} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ^{2} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{j'} \bigr\vert ^{2},\quad \forall j \in N. \end{gathered} \end{aligned}$$
(16)

The first relation holds even when \(\widetilde{M}\) is not normal, while the second relation holds even when M is not normal.

Proof

The inequalities follow from the definition of induced 2-norm for matrices.

When M is normal, \(\{\mathbf{u}_{j}\}_{j\in N}\) forms an orthonormal basis in \(\mathbb{C}^{n}\). Noting that (15) is a scalar equation, multiplying on both sides with \(\mathbf{u}_{j}\) and summing over j, we get

$$\begin{aligned} \sum_{j=1}^{n} \bigl( (\widetilde{ \lambda}_{j'} - \lambda _{j}) \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{j'} \bigr) \mathbf{u}_{j} =& \sum_{j=1}^{n} \mathbf{u}_{j} \bigl( \mathbf{u}_{j}^{{ \dagger }}(\widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \bigr) \\ =& \Biggl( \sum_{j=1}^{n} \mathbf{u}_{j} \mathbf{u}_{j}^{{\dagger }} \Biggr) ( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \\ =& I ( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'}. \end{aligned}$$

Taking the squared 2-norm on both sides of the above, and using the orthonormality of \(\{\mathbf{u}_{j}\}_{j\in N}\) on the left-hand side, gives the first equality.

Switching the roles of tilde and nontilde terms in Lemma 6 and the above gives the second relation. □

Corollary 3

  1. 1
    $$\begin{aligned}& \Vert \widetilde{M} - M \Vert _{2} \geq \bigl\Vert ( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \bigr\Vert _{2} \geq \min_{j\in N} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert ,\quad \forall j'\in N,\quad \textit{and} \\& \Vert \widetilde{M} - M \Vert _{2} \geq \bigl\Vert ( \widetilde{M} - M) {\mathbf{u}}_{j} \bigr\Vert _{2} \geq \min_{j'\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ,\quad \forall j \in N. \end{aligned}$$

    The first relation holds even when \(\widetilde{M}\) is not normal, while the second relation holds even when M is not normal.

  2. 2

    The following results are a consequence of the Bauer–Fike theorem for normal matrices [9]:

    $$\begin{aligned} & \Vert \widetilde{M} - M \Vert _{2}\geq \max _{j'\in N} \bigl\Vert (\widetilde{M} - M) \widetilde{ \mathbf{u}}_{j'} \bigr\Vert _{2} \geq \max _{j'\in N} \min_{j\in N} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert , \\ & \Vert \widetilde{M} - M \Vert _{2}\geq \max _{j\in N} \bigl\Vert (\widetilde{M} - M) {\mathbf{u}}_{j} \bigr\Vert _{2} \geq \max_{j\in N} \min _{j'\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert . \end{aligned}$$

    Once again, the first relation holds even when \(\widetilde{M}\) is not normal, while the second relation holds even when M is not normal.

Proof

From the result of Corollary 2, when M is normal (and \(\widetilde{M}\) is not necessarily normal), for all \(j' \in N\),

$$\begin{aligned} \Vert \widetilde{M} - M \Vert ^{2}_{2} & \geq \bigl\Vert ( \widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \bigr\Vert ^{2}_{2} \\ & = \sum_{j=1}^{n} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert ^{2} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{ \mathbf{u}}_{j'} \bigr\vert ^{2} \\ & \geq \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ^{2} \sum _{j=1}^{n} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{j'} \bigr\vert ^{2} \\ &= \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ^{2} \Vert \widetilde{ \mathbf{u}}_{j'} \Vert ^{2}\quad \bigl(\text{since } \{\mathbf{u}_{j}\}_{j\in N}\text{ forms an orthonormal basis} \bigr) \\ &= \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert ^{2}. \end{aligned}$$

Since this is true for any \(j'\in N\), it follows that \(\Vert \widetilde{M} - M \Vert _{2} \geq \max_{j'\in N} \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert \).

A similar set of results can be derived with the tilde and nontilde terms exchanged. □
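The first equality in (16) and the resulting lower bound in part ‘1’ of Corollary 3 can be verified on a small example. The following sketch is an illustration under our own choice of matrices (hand-picked unitary factors and eigenvalues; the helper names are hypothetical). It does not compute the induced 2-norm of \(\widetilde{M} - M\), only the vector norms \(\Vert (\widetilde{M} - M) \widetilde{\mathbf{u}}_{j'} \Vert _{2}\).

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def givens(n, a, b, theta):
    """n x n rotation in the (a, b) plane; real orthogonal, hence unitary."""
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G[a][a] = G[b][b] = math.cos(theta)
    G[a][b], G[b][a] = -math.sin(theta), math.sin(theta)
    return G

def normal_matrix(U, lam):
    """U diag(lam) U^dagger, which is normal whenever U is unitary."""
    n = len(lam)
    L = [[lam[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(U, L), dagger(U))

n = 3
U = matmul(givens(n, 0, 1, 0.3), givens(n, 1, 2, -0.5))
lam = [1.0 + 0.5j, -2.0, 0.3j]
Ut = matmul(givens(n, 0, 2, 0.1), U)
lam_t = [1.1 + 0.4j, -2.2, 0.1 + 0.3j]
M, Mt = normal_matrix(U, lam), normal_matrix(Ut, lam_t)
E = [[Mt[i][j] - M[i][j] for j in range(n)] for i in range(n)]
overlap = matmul(dagger(U), Ut)            # entries u_j^dagger u~_j'

results = []
for jp in range(n):
    # (M~ - M) applied to the perturbed eigenvector u~_j'
    v = [sum(E[i][k] * Ut[k][jp] for k in range(n)) for i in range(n)]
    norm_sq = sum(abs(x) ** 2 for x in v)
    # right-hand side of the first relation in (16)
    weighted = sum(abs(lam_t[jp] - lam[j]) ** 2 * abs(overlap[j][jp]) ** 2
                   for j in range(n))
    # distance from the perturbed eigenvalue to the unperturbed spectrum
    nearest = min(abs(lam_t[jp] - lam[j]) for j in range(n))
    results.append((norm_sq, weighted, nearest))
    print(jp, norm_sq, weighted, nearest)
```

For each \(j'\), the first two printed numbers should agree up to round-off, and the square root of the first should be no smaller than the third.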

3.2 Distance between invariant subspaces of normal matrices with partitioned spectra

Suppose \(J, \widetilde{J} \subseteq N\) are such that \(|J| = |\widetilde{J}| = q\). We are interested in understanding how much the invariant subspace \(\operatorname{span}(\mathbf{u}_{J})\) of M differs from the invariant subspace \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}})\) of \(\widetilde{M}\). The results in this section are variations and modest improvements on the Davis–Kahan sinΘ theorem [1] (see Section VIII.3 of [2] for example). In Proposition 1 and the two corollaries that follow, we present results of the form

$$ d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq \mathscr{F}(\widetilde{M} - M, \mathbf{u}_{N}, \lambda _{N}, \widetilde{\lambda}_{N}; J, \widetilde{J}), $$

where \(\mathscr{F}\) is a function specific to the exact statement of the proposition or corollary.

For a given invariant subspace \(\operatorname{span}(\mathbf{u}_{J})\) of M, we can consider all the possible q-dimensional invariant subspaces of \(\widetilde{M}\) and choose the one that is closest to \(\operatorname{span}(\mathbf{u}_{J})\) as its perturbation. As a consequence, for any of these results, we can write

$$ \min_{\widetilde{J}\in S_{q,n}} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}( \widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq \min_{ \widetilde{J}\in S_{q,n}} \mathscr{F}(\widetilde{M} - M, \mathbf{u}_{N}, \lambda _{N}, \widetilde{\lambda}_{N}; J, \widetilde{J}), $$

where \(S_{q,n}\) is the set of all q-element subsets of \(N = \{1,2,\ldots ,n\}\). This gives a combinatorial means of finding the q-dimensional invariant subspace of \(\widetilde{M}\) that is closest to \(\operatorname{span}(\mathbf{u}_{J})\).
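The combinatorial search over \(S_{q,n}\) can be sketched in a few lines. The example below minimizes \(d_{\mathrm{sp}}\) itself (the left-hand side above) rather than a bound \(\mathscr{F}\), and it assumes the expression \(d_{\mathrm{sp}}^{2} = \frac{1}{q} \sum_{j\in J, j'\in \widetilde{J}^{c}} \vert \mathbf{u}_{j}^{\dagger }\widetilde{\mathbf{u}}_{j'} \vert ^{2}\) that is used in the proof of Proposition 1. The eigenbases and the helper names are illustrative choices, and indices are 0-based in the code.

```python
import math
from itertools import combinations

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def givens(n, a, b, theta):
    """n x n rotation in the (a, b) plane; real orthogonal, hence unitary."""
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G[a][a] = G[b][b] = math.cos(theta)
    G[a][b], G[b][a] = -math.sin(theta), math.sin(theta)
    return G

def d_sp_squared(overlap, J, J_t):
    """(1/q) * sum of |u_j^dagger u~_j'|^2 over j in J and j' outside J_t."""
    n, q = len(overlap), len(J)
    complement = [jp for jp in range(n) if jp not in J_t]
    return sum(abs(overlap[j][jp]) ** 2 for j in J for jp in complement) / q

n = 3
U = matmul(givens(n, 0, 1, 0.3), givens(n, 1, 2, -0.5))   # eigenbasis of M
Ut = matmul(givens(n, 0, 2, 0.1), U)                       # eigenbasis of M~
overlap = matmul(dagger(U), Ut)

J = (0, 1)
# Exhaustive search over all q-element index sets J~
best = min(combinations(range(n), len(J)),
           key=lambda J_t: d_sp_squared(overlap, J, J_t))
print("closest index set:", best,
      "d_sp:", math.sqrt(d_sp_squared(overlap, J, best)))
```

The search visits \(\binom{n}{q}\) index sets, so it is practical only for small n or q.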

Definition 4

For \(a,b,c \in \mathbb{R}\) with \(a \leq \min (b,c)\), we define

$$\bigl[a,\min (b,c_{-}) \bigr] = \textstyle\begin{cases} {[a,b]} & \text{if }c>b, \\ {[a,c)} & \text{if }c\leq b. \end{cases} $$

Proposition 1

For any \(J, \widetilde{J} \subseteq N\) with \(|J| = |\widetilde{J}|=q\),

$$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{ \widetilde{J}}) \bigr) \\ & \quad \leq \sqrt{ \frac{1}{q} \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } } \end{aligned}$$
(17)

for any \(\kappa _{j} \in [ 0,\min (1, ( \frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\), \(j\in J\).

The tightest bound in (17) is obtained by choosing

$$ \kappa _{j} = \textstyle\begin{cases} 0 & \textit{if } \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} \geq \min_{j' \in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert , \\ 1 & \textit{if } \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} < \min_{j' \in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert . \end{cases} $$
(18)

Proof

From Corollary 2, for all \({j} \in N\),

$$\begin{aligned} & \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert ^{2}_{2} \\ & \qquad = \sum_{j'\in \widetilde{J}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{ \mathbf{u}}_{{j'}} \bigr\vert ^{2} + \sum _{j'\in { \widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} \\ &\qquad \geq \min_{j'\in \widetilde{J}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} \sum_{j'\in \widetilde{J}} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} + \min _{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }}\widetilde{ \mathbf{u}}_{{j'}} \bigr\vert ^{2} \end{aligned}$$
(19)
$$\begin{aligned} &\qquad = \min_{j'\in \widetilde{J}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} \biggl(1 - \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }}\widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} \biggr) + \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \sum_{j'\in { \widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} \\ &\qquad \geq \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \biggl(1 - \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }} \widetilde{ \mathbf{u}}_{{j'}} \bigr\vert ^{2} \biggr) + \min _{j'\in { \widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{ \dagger }}\widetilde{ \mathbf{u}}_{{j'}} \bigr\vert ^{2} \\ & \quad \qquad \text{for any $\kappa _{j} \in [0,1]$.} \\ & \quad \Rightarrow \quad \Bigl( \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \Bigr) \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }}\widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} \\ & \hphantom{\quad \Rightarrow \quad }\quad \leq \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} \end{aligned}$$
(20)
$$\begin{aligned} & \quad \Rightarrow\quad \sum_{j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }}\widetilde{\mathbf{u}}_{{j'}} \bigr\vert ^{2} \leq \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \\ & \hphantom{\quad \Rightarrow \quad }\quad \text{for any $\kappa _{j} \in [ 0,\min (1, ( \frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]$.} \end{aligned}$$
(21)

In the last step, we ensured that \(\min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \) is positive by restricting the domain of \(\kappa _{j}\) appropriately.

Thus, from (21) we have

$$\begin{aligned} & \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \bigr)^{2} \\ & \quad = \frac{1}{q} \sum_{j\in J \atop j'\in {\widetilde{J}}^{c}} \bigl\vert \mathbf{u}_{j}^{{\dagger }}\widetilde{ \mathbf{u}}_{{j'}} \bigr\vert ^{2} \quad \text{(due to Lemma 1.2)} \\ & \quad \leq \frac{1}{q} \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \end{aligned}$$
(22)

for any \(\kappa _{j} \in [ 0,\min (1, ( \frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\), \(j\in J\).

Additionally, we note that

$$\begin{aligned} & \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert _{2} < \min_{j'\in { \widetilde{J}}^{c}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert \quad \Rightarrow \quad \min _{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert > \min_{j'\in \widetilde{J}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert \\ &\quad \Bigl(\text{since, due to Corollary 3, } \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert _{2} \geq \min _{j'\in {N}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert \Bigr). \end{aligned}$$

Thus, when \(\| (\widetilde{M} - M) \mathbf{u}_{{j}} \|_{2} < \min_{j'\in { \widetilde{J}}^{c}} |\widetilde{\lambda}_{{j'}} - \lambda _{j}|\), the valid domain of \(\kappa _{j}\) is \([0,1]\). The statement about the tightest bound then follows from the fact that the function \(f(\kappa ) = \frac{a-\kappa c}{b-\kappa c}\), \(\kappa \in [0,d]\) (with \(c \geq 0\) and \(d < \frac{b}{c}\)), has derivative \(f'(\kappa ) = \frac{c(a-b)}{(b-\kappa c)^{2}}\), and is therefore minimized at \(\kappa =0\) when \(a\geq b\), and at \(\kappa =d\) when \(a < b\). □
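To make the bound (17) and the choice (18) concrete, the following sketch evaluates both sides on a small example. It is an illustration only: the matrices are built from hand-picked unitary factors and eigenvalues, the helper names are hypothetical, and \(d_{\mathrm{sp}}\) is computed from the expression used in (22). The bound is evaluated once with \(\kappa _{j}\) chosen as in (18) and once with \(\kappa _{j} = 0\) for comparison.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def givens(n, a, b, theta):
    """n x n rotation in the (a, b) plane; real orthogonal, hence unitary."""
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G[a][a] = G[b][b] = math.cos(theta)
    G[a][b], G[b][a] = -math.sin(theta), math.sin(theta)
    return G

def normal_matrix(U, lam):
    """U diag(lam) U^dagger, which is normal whenever U is unitary."""
    n = len(lam)
    L = [[lam[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(U, L), dagger(U))

n = 3
U = matmul(givens(n, 0, 1, 0.3), givens(n, 1, 2, -0.5))
lam = [1.0 + 0.5j, -2.0, 0.3j]
Ut = matmul(givens(n, 0, 2, 0.1), U)
lam_t = [1.1 + 0.4j, -2.2, 0.1 + 0.3j]
M, Mt = normal_matrix(U, lam), normal_matrix(Ut, lam_t)
E = [[Mt[i][j] - M[i][j] for j in range(n)] for i in range(n)]
overlap = matmul(dagger(U), Ut)

J, J_t = [0, 1], [0, 1]                       # 0-based index sets, q = 2
J_t_c = [jp for jp in range(n) if jp not in J_t]
q = len(J)

def bound_17(use_rule_18):
    """Right-hand side of (17); kappa_j from (18) if requested, else 0."""
    total = 0.0
    for j in J:
        a = sum(abs(sum(E[i][k] * U[k][j] for k in range(n))) ** 2
                for i in range(n))            # ||(M~ - M) u_j||^2
        b = min(abs(lam_t[jp] - lam[j]) ** 2 for jp in J_t_c)
        c = min(abs(lam_t[jp] - lam[j]) ** 2 for jp in J_t)
        kappa = 1.0 if (use_rule_18 and a < b) else 0.0
        total += (a - kappa * c) / (b - kappa * c)
    return math.sqrt(total / q)

d_sp = math.sqrt(sum(abs(overlap[j][jp]) ** 2
                     for j in J for jp in J_t_c) / q)
tight, loose = bound_17(True), bound_17(False)
print("d_sp:", d_sp, "bound with (18):", tight, "bound with kappa=0:", loose)
```

In this example the eigenvalue clusters are well separated relative to the perturbation, so (18) selects \(\kappa _{j} = 1\) for both indices and the resulting bound is no larger than the one obtained with \(\kappa _{j} = 0\).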

The above proposition provides an upper bound on the distance (in terms of \(d_{\mathrm{{sp}}}\)) between the invariant subspaces \(\operatorname{span}(\mathbf{u}_{J})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}})\) in terms of the distance between the matrices M and \(\widetilde{M}\) and their eigenvalues. For a given/fixed matrix perturbation \((\widetilde{M} - M)\) and appropriately chosen \(\widetilde{J}\), inequality (17) can be interpreted as a relation between the perturbation in the eigenvalues \(\{\lambda _{j}| j\in J\}\) and the perturbation in the invariant subspace \(\operatorname{span}(\mathbf{u}_{J})\). This relationship, in general, can be expected to be an inverse one – with higher perturbation in the eigenvalues we will have a lower (upper bound on the) perturbation in the invariant subspace, and vice versa.

It is easy to see that equality in (17) holds when

  1. (i)

    \(\frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} > 1\), \(\forall j\in J\), allowing us to choose \(\kappa _{j}=1\), \(\forall j\in J\), and

  2. (ii)
    $$\begin{aligned}& \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j_{1}} \vert = \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j_{2}} \vert , \quad \forall j_{1},j_{2} \in J, \\& \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j_{1}} \vert = \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j_{2}} \vert ,\quad \forall j_{1},j_{2} \in {J^{c}}. \end{aligned}$$

    (These conditions hold, for example, when \(\widetilde{\lambda}_{\widetilde{J}}\) and \(\widetilde{\lambda}_{{\widetilde{J}}^{c}}\) are small translations of \({\lambda}_{{J}}\) and \({\lambda}_{{{J^{c}}}}\) respectively in \(\mathbb{C}\).)

In Proposition 1, without loss of generality, we can interchange the roles of J and \({J^{c}}\) (likewise \(\widetilde{J}\) and \({\widetilde{J}}^{c}\)). Observing that \(\operatorname{span}(\mathbf{u}_{J^{c}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{{\widetilde{J}}^{c}})\) are \((n-q)\)-dimensional subspaces of \(\mathbb{C}^{n}\) which are orthogonal complements of \(\operatorname{span}(\mathbf{u}_{J})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}})\) respectively, we then obtain

$$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \\ &\quad = \sqrt{\frac{n-q}{q}} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J^{c}}) , \operatorname{span}( \widetilde{\mathbf{u}}_{{\widetilde{J}}^{c}}) \bigr) \quad \text{(due to Lemma 2.3)} \\ &\quad \leq \sqrt{ \frac{1}{q} \sum_{j\in {J^{c}}} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }} \end{aligned}$$
(23)

for any \(\kappa _{j} \in [ 0,\min (1, ( \frac{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\), \(j\in {J^{c}}\).

Corollary 4

For any \(\kappa _{J} \in [ 0,\min (1, ( \frac{ \min_{j\in J , j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\) and

\(\kappa _{J^{c}} \in [ 0,\min (1, ( \frac{ \min_{j\in {J^{c}} , j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\),

  1. 1.
    $$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \\ & \quad \leq \frac{1}{\sqrt{q}} \min \biggl( \sqrt{ \frac{ \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{J} \sum_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in J \atop j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J} \max_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } } , \\ & \qquad \sqrt{ \frac{ \sum_{j\in {J^{c}}} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{J^{c}} \sum_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in {J^{c}} \atop j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J^{c}} \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } } \biggr). \end{aligned}$$
    (24)
  2. 2.
    $$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \\ &\quad \leq \sqrt{ \frac{ \frac{1}{q} \left ( \Vert \widetilde{M} - M \Vert ^{2}_{F} - (\textstyle\begin{array}{c} \kappa _{J} \sum_{j\in J} \min_{{j'} \in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \\ {} + \kappa _{J^{c}} \sum_{j\in {J^{c}}} \min_{{j'} \in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \end{array}\displaystyle ) \right ) }{ (\textstyle\begin{array}{c} \min_{{j'} \in {\widetilde{J}}^{c} \atop j\in J} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \\ {} + \min_{{j'} \in \widetilde{J} \atop j\in {J^{c}}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \end{array}\displaystyle ) - (\textstyle\begin{array}{c} \kappa _{J} \max_{j\in J} \min_{{j'} \in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \\ {} + \kappa _{J^{c}} \max_{j\in {J^{c}}} \min_{{j'} \in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \end{array}\displaystyle ) } } \end{aligned}$$
    (25)

Proof

With \(\kappa _{j} \in [ 0,\min (1, ( \frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\), \(j\in J\),

$$\begin{aligned} & q \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \bigr)^{2} \\ & \quad \leq \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \quad \text{(due to Proposition 1)} \\ & \quad \leq \frac{ \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \sum_{j\in J} \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in J} ( \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} ) } \\ & \qquad \biggl(\text{since } \sum_{k\in S} \frac{c_{k}}{d_{k}} \leq \frac{\sum_{k\in S} c_{k}}{\min_{k\in S} d_{k}} \biggr) \\ & \quad \leq \frac{ \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \sum_{j\in J} \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in J \atop j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \max_{j\in J} \kappa _{j} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \\ & \qquad \Bigl(\text{since } \min_{k\in S} (c_{k} - d_{k}) \geq \min_{k\in S} c_{k} - \max_{k\in S} d_{k} \Bigr). \end{aligned}$$
(26)

We next choose \(\kappa _{j} = \kappa _{k}\), \(\forall j,k \in J\) and denote this value by

$$\begin{aligned} & \kappa _{J} \in \bigcap_{j\in J} \biggl[ 0,\min \biggl(1, \biggl( \frac{ \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} \biggr)_{ -} \biggr) \biggr] \\ & \quad \supseteq \biggl[ 0,\min \biggl(1, \biggl( { \frac{ \min_{j\in J \atop j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}} \biggr)_{ -} \biggr) \biggr]. \end{aligned}$$

Thus,

$$\begin{aligned} & q \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \bigr)^{2} \\ & \quad \leq \frac{ \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{J} \sum_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in J \atop j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J} \max_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \end{aligned}$$
(27)

for any \(\kappa _{J} \in [ 0,\min (1, ( \frac{ \min_{j\in J , j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\).

By interchanging the roles of J and \({J^{c}}\) (accordingly, \(\widetilde{J}\) and \({\widetilde{J}}^{c}\)) and noting that \(\operatorname{span}(\mathbf{u}_{J^{c}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{{\widetilde{J}}^{c}})\) are \((n-q)\)-dimensional subspaces of \(\mathbb{C}^{n}\), we get

$$\begin{aligned} & (n-q) \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J^{c}}) , \operatorname{span}( \widetilde{\mathbf{u}}_{{ \widetilde{J}}^{c}}) \bigr) \bigr)^{2} \\ &\quad \leq \frac{ \sum_{j\in {J^{c}}} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{J^{c}} \sum_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in {J^{c}} \atop j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J^{c}} \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \end{aligned}$$
(28)

for any \(\kappa _{J^{c}} \in [ 0,\min (1, ( \frac{ \min_{j\in {J^{c}} , j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\).

On the other hand, since \(\operatorname{span}(\mathbf{u}_{J})\) and \(\operatorname{span}(\mathbf{u}_{J^{c}})\) are orthogonal complements (likewise, \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{{\widetilde{J}}^{c}})\) are orthogonal complements), using Lemma 2, we can write (28) as

$$\begin{aligned} & q \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \bigr)^{2} \\ &\quad \leq \frac{ \sum_{j\in {J^{c}}} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} - \kappa _{J^{c}} \sum_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} }{ \min_{j\in {J^{c}} \atop j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J^{c}} \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \end{aligned}$$
(29)

for any \(\kappa _{J^{c}} \in [ 0,\min (1, ( \frac{ \min_{j\in {J^{c}} , j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}}{ \max_{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}} )_{ -} ) ]\).

Combining (27) and (29) proves part ‘1’.

Again, adding (27) and (29), we have

$$\begin{aligned} & q \Bigl( \min_{j\in J \atop j'\in {\widetilde{J}}^{c}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} + \min_{j\in {J^{c}} \atop j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} \\ & \quad - \kappa _{J} \max_{j\in J} \min _{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J^{c}} \max _{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{ \lambda}_{{j'}} - \lambda _{j} \vert ^{2} \Bigr) \\ & \qquad{} \times \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{ \widetilde{J}}) \bigr) \bigr)^{2} \\ & \quad \leq \sum_{j\in J} \bigl\Vert ( \widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert ^{2}_{2} + \sum_{j \in {J^{c}}} \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert ^{2}_{2} \\ & \qquad {}- \kappa _{J} \sum_{j\in J} \min_{j'\in \widetilde{J}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} - \kappa _{J^{c}} \sum _{j\in {J^{c}}} \min_{j'\in {\widetilde{J}}^{c}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2}. \end{aligned}$$

Part ‘2’ of the result then follows by observing that, since U is unitary and the Frobenius norm is unitarily invariant,

$$ \sum_{{j} \in {J}} \bigl\Vert (\widetilde{M} - M) \mathbf{u}_{j} \bigr\Vert ^{2}_{2} + \sum _{{j} \in {{J^{c}}}} \bigl\Vert ( \widetilde{M} - M) \mathbf{u}_{j} \bigr\Vert ^{2}_{2} = \bigl\Vert ( \widetilde{M} - M) U \bigr\Vert ^{2}_{F} = \Vert \widetilde{M} - M \Vert ^{2}_{F}. $$

 □

Corollary 5

(Generalized Davis–Kahan [1] sinΘ theorem for normal matrices – see Section VIII.3 of [2])

  1. 1.
    $$d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{ \widetilde{J}}) \bigr) \leq \frac{ \min ( 1, \sqrt{ \frac{n-q}{q} } ) }{ \max ( \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} ), \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda}_{\widetilde{J}} ) ) } \Vert \widetilde{M} - M \Vert _{2} ; $$
  2. 2.
    $$\begin{aligned} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq& \frac{ \frac{1}{\sqrt{q}} \Vert \widetilde{M} - M \Vert _{F} }{ \sqrt{ \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} )^{ 2} + \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda}_{\widetilde{J}} )^{ 2} } } \\ \leq& \sqrt{ \frac{ n/q }{ \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} )^{ 2} + \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda}_{\widetilde{J}} )^{ 2} } } \Vert \widetilde{M} - M \Vert _{2} . \end{aligned}$$

Proof

In (27), setting \(\kappa _{J} = 0\), we get

$$ \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \bigr)^{2} \leq \frac{\frac{1}{q} \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2}}{\operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} )^{2}} \leq \frac{ \Vert (\widetilde{M} - M) \Vert ^{2}_{2}}{\operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} )^{2}} .$$

Interchanging the roles of the tilde and nontilde terms in this result, we analogously obtain

$$ \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \widetilde{ \mathbf{u}}_{\widetilde{J}}), \operatorname{span}( \mathbf{u}_{J}) \bigr) \bigr)^{2} \leq \frac{ \Vert (\widetilde{M} - M) \Vert ^{2}_{2}}{\operatorname{sep} ( \widetilde{\lambda}_{\widetilde{J}}, \lambda _{J^{c}} )^{2}}.$$

The above two together give

$$\begin{aligned} \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \widetilde{\mathbf{u}}_{\widetilde{J}}), \operatorname{span}( \mathbf{u}_{J}) \bigr) \bigr)^{2} \leq \frac{ \Vert (\widetilde{M} - M) \Vert ^{2}_{2}}{ \max ( \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} ), \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda}_{\widetilde{J}} ) )^{2}}. \end{aligned}$$
(30)

In the above inequality, interchanging the roles of J and \({J^{c}}\) (accordingly, \(\widetilde{J}\) and \({\widetilde{J}}^{c}\)) and observing that by Lemma 2 \(d_{\mathrm{{sp}}} ( \operatorname{span}(\widetilde{\mathbf{u}}_{ \widetilde{J}}), \operatorname{span}(\mathbf{u}_{J}) ) = \sqrt{ \frac{n-q}{q}} d_{\mathrm{{sp}}} ( \operatorname{span}( \widetilde{\mathbf{u}}_{{\widetilde{J}}^{c}}), \operatorname{span}( \mathbf{u}_{J^{c}}) )\), we obtain

$$\begin{aligned} \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \widetilde{\mathbf{u}}_{\widetilde{J}}), \operatorname{span}( \mathbf{u}_{J}) \bigr) \bigr)^{2} \leq \frac{n-q}{q} \frac{ \Vert (\widetilde{M} - M) \Vert ^{2}_{2}}{ \max ( \operatorname{sep} ( \lambda _{J}, \widetilde{\lambda}_{{\widetilde{J}}^{c}} ), \operatorname{sep} ( \lambda _{J^{c}}, \widetilde{\lambda}_{\widetilde{J}} ) )^{2}}. \end{aligned}$$
(31)

(30) and (31) together conclude the proof of part ‘1’.

The second result follows directly from part ‘2’ of Corollary 4 by setting \(\kappa _{J} = \kappa _{J^{c}} = 0\) and using the fact that for \(Q\in \mathbb{C}^{n \times n}\), \(\|Q\|_{F} \leq \sqrt{n} \|Q\|_{2}\). □

3.3 Bound on perturbation of invariant subspace of a normal matrix with well-clustered spectrum

In this section we specialize the earlier results for the situation when \(\lambda _{J}\) and \(\lambda _{J^{c}}\) are well-clustered (i.e., the separation between them is large) compared to the perturbation \((\widetilde{M} - M)\). In the following lemma we outline the conditions under which the perturbed eigenvalues \(\widetilde{\lambda}_{N}\) will also remain well clustered.

Lemma 7

For any \(J \subseteq N\), define \({J^{c}} = N - J\). If \(\|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\), then:

  1. 1.

    \(\widetilde{\lambda}_{N}\) is a separation-preserving perturbation of \(\lambda _{J}\) and \(\lambda _{J^{c}}\). More explicitly, defining

    $$ \begin{gathered} \widehat{J} = \Bigl\{ j' \big| \min _{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert = \min_{j\in J} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert \Bigr\} \quad \textit{and}\\ {\widehat{J}^{c}} = \Bigl\{ j' \big| \min _{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert = \min_{j\in {J^{c}}} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert \Bigr\} \end{gathered} $$
    (32)

    makes \(\{ \widetilde{\lambda}_{\widehat{J}}, \widetilde{\lambda}_{{ \widehat{J}^{c}}} \}\) a separation-preserving partition of \(\widetilde{\lambda}_{N}\), with

    $$ \operatorname{sep}(\widetilde{\lambda}_{\widehat{J}}, \widetilde{ \lambda}_{{ \widehat{J}^{c}}}) > \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - 2 \Vert \widetilde{M} - M \Vert _{2}. $$
  2. 2.

    \(|\widetilde{\lambda}_{\widehat{J}}| = |\lambda _{J}|\) (equivalently, \(|\widetilde{\lambda}_{{\widehat{J}^{c}}}| = |\lambda _{J^{c}}|\)), where \(|\cdot |\) denotes the number of elements in a multi-set (recall that \(\lambda _{J}\) and \(\widetilde{\lambda}_{\widehat{J}}\) are multi-sets, so they may contain multiple copies of repeated eigenvalues, if any, of M and \(\widetilde{M}\) respectively).

Proof

  1. 1.

    We first observe that

    $$ \Vert \widetilde{M} - M \Vert _{2} \geq \max \Bigl( \max_{j\in N} \min_{j' \in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert , \max _{j'\in N} \min_{j\in N} \vert \widetilde{ \lambda}_{j'} - \lambda _{j} \vert \Bigr) = d_{H}(\lambda _{N}, \widetilde{\lambda}_{N}). $$
    (33)

    As a consequence, \(d_{H}(\lambda _{N}, \widetilde{\lambda}_{N}) \leq \|\widetilde{M} - M \|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\). Then the proof of the first part follows directly from Corollary 1 by setting \(P = \lambda _{J}\), \(Q = \lambda _{J^{c}}\) and \(\widetilde{R} = \widetilde{\lambda}_{N}\).

  2. 2.

    We prove the second part by contradiction.

    If possible, let \(|\widetilde{\lambda}_{\widehat{J}}| \neq |\lambda _{J}|\). Without loss of generality, we assume \(|\widetilde{\lambda}_{\widehat{J}}| < |\lambda _{J}|\) (if instead \(|\widetilde{\lambda}_{\widehat{J}}| > |\lambda _{J}|\), then \(|\widetilde{\lambda}_{{\widehat{J}^{c}}}| < |\lambda _{J^{c}}|\), and the same argument applies with the roles of J and \({J^{c}}\) interchanged).

    Define a path \(\overline{M}:[0,1]\rightarrow \mathbb{R}^{n\times n}\) connecting M and \(\widetilde{M}\) as

    $$ \overline{M}(t) = t \widetilde{M} + (1-t)M. $$

    Although \(\overline{M}(t)\) is not necessarily normal for all t, its characteristic equation is a degree-n polynomial equation in its eigenvalue, with the coefficient of the highest-degree term equal to 1 and the other coefficients being polynomials in t. Since the roots of such a polynomial are continuous functions of the coefficients, the eigenvalues of \(\overline{M}(t)\) are continuous functions of t. Thus, we define \(\overline{\lambda}_{j}:[0,1]\rightarrow \mathbb{C}\) to be the paths of the eigenvalues such that \(\overline{\lambda}_{j}(0) = \lambda _{j}\) for all \(j\in \{1,2,\ldots ,n\}\). The values \(\overline{\lambda}_{j}(1)\) are the eigenvalues of \(\overline{M}(1) = \widetilde{M}\), so that \(\overline{\lambda}_{j}(1) = \widetilde{\lambda}_{\sigma (j)}\) for some permutation \(\sigma :\{1,2,\ldots ,n\}\rightarrow \{1,2,\ldots ,n\}\) (see Fig. 4).

    Figure 4: Illustration for the proof of Lemma 7

    Since \(|\widetilde{\lambda}_{\widehat{J}}| < |\lambda _{J}|\), there exists at least one \(k\in J\) (with \(\lambda _{k} = \overline{\lambda}_{k}(0) \in \lambda _{J}\)) such that \(\overline{\lambda}_{k}(1) \notin \widetilde{\lambda}_{\widehat{J}}\) (equivalently, \(\overline{\lambda}_{k}(1) \in \widetilde{\lambda}_{{\widehat{J}^{c}}}\)).

    Define \(g(t) = \min_{j\in J} |\overline{\lambda}_{k}(t) - \lambda _{j}|\) and \(h(t) = \min_{j\in {J^{c}}} |\overline{\lambda}_{k}(t) - \lambda _{j}|\). Thus,

    $$\begin{aligned} g(0) = \min_{j\in J} \bigl\vert \overline{ \lambda}_{k}(0) - \lambda _{j} \bigr\vert &= \min _{j\in J} \vert \lambda _{k} - \lambda _{j} \vert = 0 \ (\text{since }\lambda _{k} \in \lambda _{J}) \leq h(0). \end{aligned}$$

    Again,

    $$\begin{aligned} h(1) ={}& \min_{j\in {J^{c}}} \bigl\vert \overline{ \lambda}_{k}(1) - \lambda _{j} \bigr\vert \\ \leq {}& \min_{j\in J} \bigl\vert \overline{ \lambda}_{k}(1) - \lambda _{j} \bigr\vert \quad \Bigl(\text{since } \overline{\lambda}_{k}(1) \in \widetilde{ \lambda}_{{\widehat{J}^{c}}}, \text{ from the definition of }{\widehat{J}^{c}}, \\ & \min_{j\in {J^{c}}} \bigl\vert \overline{\lambda}_{k}(1) - \lambda _{j} \bigr\vert = \min_{j\in N} \bigl\vert \overline{\lambda}_{k}(1) - \lambda _{j} \bigr\vert \Bigr) \\ ={}& g(1). \end{aligned}$$

    Thus, by the intermediate value theorem applied to the continuous function \(g - h\) (which is nonpositive at \(t=0\) and nonnegative at \(t=1\)), there exists \(t'\in [0,1]\) such that \(g(t') = h(t')\). That is, \(\min_{j\in J} |\overline{\lambda}_{k}(t') - \lambda _{j}| = \min_{j \in {J^{c}}} |\overline{\lambda}_{k}(t') - \lambda _{j}|\). Equivalently,

    $$\begin{aligned} \operatorname{sep} \bigl(\lambda _{J}, \bigl\{ \overline{\lambda}_{k} \bigl(t' \bigr) \bigr\} \bigr) = \operatorname{sep} \bigl(\lambda _{J^{c}}, \bigl\{ \overline{ \lambda}_{k} \bigl(t' \bigr) \bigr\} \bigr)\quad \text{for some } t'\in [0,1]. \end{aligned}$$
    (34)

    Now,

    $$\begin{aligned} \bigl\Vert \overline{M} \bigl(t' \bigr) - M \bigr\Vert _{2} & \geq \min_{j\in N} \bigl\vert \overline{ \lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert \quad \text{(Corollary 3.1)} \\ & = \min \Bigl( \min_{j\in J} \bigl\vert \overline{ \lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert , \min_{j\in {J^{c}}} \bigl\vert \overline{ \lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert \Bigr) \\ & = \frac{1}{2} \Bigl( \min_{j\in J} \bigl\vert \overline{\lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert + \min_{j\in {J^{c}}} \bigl\vert \overline{\lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert \Bigr) \\ & \quad \Bigl(\text{since from (34), } \min_{j\in J} \bigl\vert \overline{\lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert = \min_{j \in {J^{c}}} \bigl\vert \overline{\lambda}_{k} \bigl(t' \bigr) - \lambda _{j} \bigr\vert \Bigr) \\ & = \frac{1}{2} \bigl( \operatorname{sep} \bigl(\lambda _{J}, \bigl\{ \overline{\lambda}_{k} \bigl(t' \bigr) \bigr\} \bigr) + \operatorname{sep} \bigl(\lambda _{J^{c}}, \bigl\{ \overline{\lambda}_{k} \bigl(t' \bigr) \bigr\} \bigr) + \operatorname{diam} \bigl( \bigl\{ \overline{\lambda}_{k} \bigl(t' \bigr) \bigr\} \bigr) \bigr) \\ & \quad \text{(since the diameter of a point is zero)} \\ & \geq \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) \quad \text{(using Lemma 3)}. \end{aligned}$$
    (35)

    However, \(\|\overline{M}(t') - M\|_{2} = t' \|\widetilde{M} - M\|_{2} \leq \|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J},\lambda _{J^{c}})\), which contradicts (35).

 □
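The statement of Lemma 7 can be checked numerically on a small example. The following is a minimal sketch, not part of the original argument: it assumes a real symmetric (hence normal) matrix, takes \(\operatorname{sep}\) to be the smallest distance between elements of two sets, and uses the illustrative names `sep`, `hausdorff`, `J_hat`, and `J_hat_c`. Ties in (32) are ignored.

```python
import numpy as np

def sep(a, b):
    """Smallest distance between an element of a and an element of b."""
    return np.abs(a[:, None] - b[None, :]).min()

def hausdorff(a, b):
    """Hausdorff distance between two finite sets of numbers, as in (33)."""
    d = np.abs(a[:, None] - b[None, :])
    return max(d.min(axis=1).max(), d.min(axis=0).max())

rng = np.random.default_rng(0)
n, q = 6, 2
lam = np.array([0.0, 0.1, 3.0, 3.2, 3.5, 4.0])   # lambda_J = first q entries
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
M = Q @ np.diag(lam) @ Q.T                        # symmetric, hence normal

gap = sep(lam[:q], lam[q:])                       # sep(lambda_J, lambda_Jc)
E = rng.standard_normal((n, n))
E = (E + E.T) / 2                                 # symmetric perturbation
E *= 0.4 * gap / np.linalg.norm(E, 2)             # ||E||_2 = 0.4 * gap < gap / 2
norm_E = np.linalg.norm(E, 2)

lam_t = np.linalg.eigvalsh(M + E)                 # perturbed eigenvalues

# (32): j' belongs to J_hat when its nearest unperturbed eigenvalue lies in lambda_J.
nearest = np.abs(lam_t[:, None] - lam[None, :]).argmin(axis=1)
J_hat = np.flatnonzero(nearest < q)
J_hat_c = np.flatnonzero(nearest >= q)

print("d_H of spectra:", hausdorff(lam, lam_t), " ||E||_2:", norm_E)
print("|J_hat|:", len(J_hat), " q:", q)
print("sep after:", sep(lam_t[J_hat], lam_t[J_hat_c]),
      " lower bound:", gap - 2 * norm_E)
```

The three printed lines correspond to (33), part 2 of the lemma, and the separation bound in part 1, respectively.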

In the following propositions, we express the upper bounds on \(d_{\mathrm{{sp}}} ( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) )\) in terms of \((\widetilde{M}-M)\) and nontilde terms only.

Proposition 2

For any \(J \subseteq N\) such that \(|J|=q\), define \({J^{c}} = N - J\). If \(\|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\), then:

  1. 1.
    $$\begin{aligned} & d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) \bigr) \\ &\quad \leq \frac{1}{\sqrt{q}} \min \biggl( \sqrt{ \sum _{j\in J} \biggl( \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} }{ \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} } \biggr)^{ 2} } , \\ & \qquad \sqrt{ \sum_{j\in {J^{c}}} \biggl( \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert _{2} }{ \min_{k\in J} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} } \biggr)^{ 2} } \biggr) \end{aligned}$$
    (36)
    $$\begin{aligned} & \quad \leq \min \biggl( 1, \sqrt{\frac{n-q}{q}} \biggr) \frac{ \Vert \widetilde{M} - M \Vert _{2} }{ \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} }. \end{aligned}$$
    (37)
  2. 2.
    $$\begin{aligned} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) \bigr) & \leq \frac{\frac{1}{\sqrt{2q}} \Vert \widetilde{M} - M \Vert _{F}}{\operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2}}, \end{aligned}$$
    (38)

where Ĵ and \({\widehat{J}^{c}}\) are as defined in (32).

Proof

For any \(j\in J\),

$$\begin{aligned} \min_{j'\in {\widehat{J}^{c}}} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert & = \operatorname{sep} \bigl(\{\lambda _{j}\}, \widetilde{\lambda}_{{ \widehat{J}^{c}}} \bigr) \\ &\geq \operatorname{sep} \bigl(\{\lambda _{j}\}, \lambda _{J^{c}} \bigr) - d_{H}( \lambda _{J^{c}}, \widetilde{\lambda}_{{\widehat{J}^{c}}}) \quad \text{(due to Lemma 4)} \\ & \geq \operatorname{sep} \bigl(\{\lambda _{j}\}, \lambda _{J^{c}} \bigr) - d_{H}( \lambda _{N}, \widetilde{\lambda}_{N}) \\ & \quad \bigl(\text{due to Lemma 5.4, } d_{H}(\lambda _{J^{c}}, \widetilde{\lambda}_{{\widehat{J}^{c}}}) \leq d_{H}(\lambda _{N}, \widetilde{\lambda}_{N}) \bigr) \\ & \geq \operatorname{sep} \bigl(\{\lambda _{j}\}, \lambda _{J^{c}} \bigr) - \Vert \widetilde{M} - M \Vert _{2} \quad \text{(using (33))} \\ & = \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} . \end{aligned}$$
(39)

Thus, in Proposition 1 choosing \(\kappa _{j}=0\), \(\forall j\in J\), we get

$$\begin{aligned} \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widehat{J}}) \bigr) \bigr)^{2} & \leq \frac{1}{q} \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} }{ \min_{j'\in {\widehat{J}^{c}}} \vert \widetilde{\lambda}_{{j'}} - \lambda _{j} \vert ^{2} } \\ & \leq \frac{1}{q} \sum_{j\in J} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} }{ ( \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} )^{2} } \end{aligned}$$
(40)
$$\begin{aligned} & \leq \frac{ \frac{1}{q} \sum_{j\in J} \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} }{ \min_{j\in J} ( \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} )^{2} } \\ & \quad \biggl(\text{since } \sum_{k\in S} \frac{c_{k}}{d_{k}} \leq \frac{\sum_{k\in S} c_{k}}{\min_{k\in S} d_{k}} \biggr) \\ & \leq \frac{ \Vert \widetilde{M} - M \Vert ^{2}_{2} }{ ( \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} )^{2} } \\ & \quad \Bigl(\text{since } \Vert \widetilde{M} - M \Vert _{2} \geq \bigl\Vert ( \widetilde{M} - M) \mathbf{u}_{{j}} \bigr\Vert _{2}\text{ and} \\ & \quad \min_{k\in J}(c_{k} - \alpha )^{2} = \Bigl(\min_{k\in J} c_{k} - \alpha \Bigr)^{2} \Bigr). \end{aligned}$$
(41)

In the above, switching the roles of J and \({J^{c}}\) (likewise, Ĵ and \({\widehat{J}^{c}}\)) and noting that \(\operatorname{span}(\mathbf{u}_{J^{c}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{{\widehat{J}^{c}}})\) are \((n-q)\)-dimensional subspaces of \(\mathbb{C}^{n}\), we get

$$\begin{aligned} \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J^{c}}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{{ \widehat{J}^{c}}}) \bigr) \bigr)^{2} & \leq \frac{1}{n-q} \sum_{j\in {J^{c}}} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} }{ ( \min_{k\in J} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} )^{2} } \\ & \leq \frac{ \Vert \widetilde{M} - M \Vert ^{2}_{2} }{ ( \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} )^{2} }. \end{aligned}$$

But since \(\operatorname{span}(\mathbf{u}_{J^{c}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{{\widehat{J}^{c}}})\) are orthogonal complements of \(\operatorname{span}(\mathbf{u}_{J})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}})\) respectively, from Lemma 2 we have \((n-q) ( d_{\mathrm{{sp}}} ( \operatorname{span}(\mathbf{u}_{J^{c}}) , \operatorname{span}(\widetilde{\mathbf{u}}_{{\widehat{J}^{c}}}) ) )^{2} = q ( d_{\mathrm{{sp}}} ( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{ \widehat{J}}) ) )^{2}\). This gives us from the above

$$\begin{aligned} \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widehat{J}}) \bigr) \bigr)^{2} & \leq \frac{1}{q} \sum_{j\in {J^{c}}} \frac{ \Vert (\widetilde{M} - M) \mathbf{u}_{{j}} \Vert ^{2}_{2} }{ ( \min_{k\in J} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{M} - M \Vert _{2} )^{2} } \end{aligned}$$
(42)
$$\begin{aligned} & \leq \frac{n-q}{q} \frac{ \Vert \widetilde{M} - M \Vert ^{2}_{2} }{ ( \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} )^{2} }. \end{aligned}$$
(43)

Combining (40) and (42) gives (36), and combining (41) and (43) gives (37); together these give the first result of the proposition.

The second result can be obtained directly using Corollary 5.2 and observing that due to (39), \(\operatorname{sep}(\lambda _{J}, \widetilde{\lambda}_{{\widehat{J}^{c}}}) \geq \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \|\widetilde{M} - M \|_{2}\) (and analogously \(\operatorname{sep}(\lambda _{J^{c}}, \widetilde{\lambda}_{\widehat{J}}) \geq \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \|\widetilde{M} - M \|_{2}\)). □

Assuming \(q\leq n/2\), it is worth noting that, defining \(\epsilon = \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \|\widetilde{M} - M\|_{2}\), the second inequality of the first result in the above proposition becomes \(d_{\mathrm{{sp}}} ( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) ) \leq \frac{\|\widetilde{M} - M\|_{2}}{\|\widetilde{M} - M\|_{2} + 2 \epsilon} \). Thus, as \(\epsilon \rightarrow 0\), this inequality becomes \(d_{\mathrm{{sp}}} ( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) ) < 1\), rendering the result uninformative. Conversely, the larger the separation between \(\lambda _{J}\) and \(\lambda _{J^{c}}\) (relative to \(\|\widetilde{M} - M\|_{2}\)), the tighter the upper bound in the proposition.

An interpretation of the result in the above proposition is that a perturbation \(\widetilde{M} - M\) of the matrix M results in a perturbation of the invariant subspace \(\operatorname{span}(\mathbf{u}_{J})\) such that the distance between the subspace and its perturbed counterpart is bounded by the upper bounds given in the proposition. One key feature of the proposition, however, is that the upper bound in the inequality does not depend on Ĵ. As a consequence, any other size-q subset \(\widetilde{J}\) of N such that \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}})\) is closer to \(\operatorname{span}(\mathbf{u}_{J})\) than \(\operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}})\) still satisfies the same upper bound. That is, if \(\|\widetilde{M} - M\|_{2} < \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\), then

$$\begin{aligned} \begin{aligned} &\min_{\widetilde{J} \in S_{q,n}} d_{\mathrm{sp}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widetilde{J}}) \bigr) \leq \min \biggl( 1, \sqrt{ \frac{n-q}{q}} \biggr) \frac{ \Vert \widetilde{M} - M \Vert _{2} }{ \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2} } \\ & \min_{\widetilde{J} \in S_{q,n}} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}( \widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) \leq \frac{\frac{1}{\sqrt{2q}} \Vert \widetilde{M} - M \Vert _{F}}{\operatorname{sep}(\lambda _{J}, \lambda _{J^{c}}) - \Vert \widetilde{M} - M \Vert _{2}}, \end{aligned} \end{aligned}$$
(44)

where \(S_{q,n}\) is the set of all q-element subsets of \(N = \{1,2,\ldots ,n\}\).
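The bounds (37) and (38) can be compared against a computed distance on a small example. The sketch below rests on a loudly stated assumption: this section does not restate the definition of \(d_{\mathrm{sp}}\), so the code takes it to be the normalized projection distance \(\sqrt{1 - \Vert U^{*} V\Vert _{F}^{2}/q}\) between subspaces with orthonormal bases U and V, which is consistent with the scaling relation quoted from Lemma 2 but may differ from the metric defined earlier in the paper. The matrix, the seed, and all names (`d_sp`, `bound_37`, `bound_38`) are illustrative.

```python
import numpy as np

def d_sp(U, V):
    """ASSUMED subspace metric: sqrt(1 - ||U^* V||_F^2 / q) for orthonormal n x q bases."""
    q = U.shape[1]
    val = 1.0 - np.linalg.norm(U.conj().T @ V, 'fro') ** 2 / q
    return np.sqrt(max(0.0, val))                 # clip tiny negative round-off

rng = np.random.default_rng(1)
n, q = 8, 3
lam = np.concatenate([np.linspace(0.0, 0.2, q), np.linspace(5.0, 6.0, n - q)])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
M = Q @ np.diag(lam) @ Q.T                        # symmetric, hence normal
U_J = Q[:, :q]                                    # orthonormal basis of span(u_J)

gap = lam[q] - lam[q - 1]                         # sep(lambda_J, lambda_Jc)
E = rng.standard_normal((n, n))
E = (E + E.T) / 2
E *= 0.25 * gap / np.linalg.norm(E, 2)            # ||E||_2 = gap / 4 < gap / 2
e2 = np.linalg.norm(E, 2)
eF = np.linalg.norm(E, 'fro')

lam_t, U_t = np.linalg.eigh(M + E)
# J_hat from (32): perturbed eigenvalues whose nearest unperturbed one is in lambda_J.
nearest = np.abs(lam_t[:, None] - lam[None, :]).argmin(axis=1)
J_hat = np.flatnonzero(nearest < q)

dist = d_sp(U_J, U_t[:, J_hat])
bound_37 = min(1.0, np.sqrt((n - q) / q)) * e2 / (gap - e2)
bound_38 = eF / (np.sqrt(2 * q) * (gap - e2))
print("distance:", dist, " bound (37):", bound_37, " bound (38):", bound_38)
```

Neither bound uses the perturbed spectrum: only `gap`, `e2`, and `eF` enter, which is the point of the proposition.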

4 Application to null-space perturbation in the context of a graph connection problem

We consider a simple application of the above results in the context of a graph theory problem. Some definitions and basic properties of weighted, undirected, simple graphs are listed below [10].

  1. 1

    A graph G consists of a set of n vertices \(\mathcal{V}(G) = \{v_{1}, v_{2}, \ldots , v_{n}\}\) and an edge set \(\mathcal{E}(G) \subseteq \mathcal{V}(G) \times _{\text{sym}} \mathcal{V}(G)\) (where ‘×sym’ represents the symmetric Cartesian product so that for the undirected graph the order of the vertices in an edge is irrelevant, making \((v_{k}, v_{l}) = (v_{l}, v_{k})\)). Each edge \((v_{k}, v_{l})\in \mathcal{E}(G)\) is assigned a positive real weight \(A_{kl} (= A_{lk})\). Nonexistent edges are implicitly assumed to have zero edge weight so that \(A_{kl} = 0, \forall (v_{k}, v_{l})\notin \mathcal{E}(G)\). The matrix \(A\in \mathbb{R}^{n\times n}\) is called the weighted adjacency matrix of the graph G and is a symmetric matrix with zero diagonal for an undirected, simple graph.

  2. 2

    The weighted degree matrix D is an \(n\times n\) diagonal matrix in which the kth diagonal element is the sum of the elements in the kth row (equivalently, kth column) of A. Thus \(D_{kk}\) is the sum of the weights of the edges emanating from \(v_{k}\) (also called the degree of the vertex).

  3. 3

    The weighted Laplacian matrix of the graph is defined as \(L = D - A\). An eigenvector of L is an n-dimensional real vector and can be interpreted as a distribution over the vertices (with the kth element of the vector being the value associated with \(v_{k}\in \mathcal{V}(G)\)).

  4. 4

    The eigenvalues of L are nonnegative. The null-space of L for a graph with q disjoint components is q-dimensional, with the null-space spanned by vectors corresponding to distributions that are uniform over the vertices of each of those components. Without loss of generality we index the eigenvalues in increasing order of their magnitudes so that \(0 = \lambda _{1} = \lambda _{2} = \cdots = \lambda _{q} \leq \lambda _{q+1} \leq \lambda _{q+2} \leq \cdots \leq \lambda _{n}\). The corresponding unit eigenvectors are \(\mathbf{u}_{1}, \mathbf{u}_{2}, \ldots , \mathbf{u}_{n}\). Note that since a graph has at least one connected component, \(\lambda _{1} = 0\) for any graph. Furthermore, without loss of generality, for \(j \leq q\) we choose \(\mathbf{u}_{j}\) to be a distribution that is uniformly positive over the vertices of the jth component \(G_{j}\) (defined below) and zero over the rest of the vertices in the graph.

  5. 5

    Define \(J = \{1,2,\ldots ,q\}\), so that \(\operatorname{span}(\mathbf{u}_{J})\) is the null-space of L.
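The properties listed above can be illustrated on a small graph. The following is a minimal sketch on a hypothetical 5-vertex graph with two components (vertices and weights chosen only for illustration): it builds A, D, and L, counts the zero eigenvalues of L, and checks that the normalized indicator vectors of the components lie in the null-space.

```python
import numpy as np

# Hypothetical weighted graph: component {v1, v2, v3} and component {v4, v5}.
n = 5
edges = [(0, 1, 1.0), (1, 2, 2.0), (3, 4, 0.5)]   # (k, l, weight), 0-based indices
A = np.zeros((n, n))
for k, l, w in edges:
    A[k, l] = A[l, k] = w                          # symmetric, zero diagonal

D = np.diag(A.sum(axis=1))                         # weighted degree matrix
L = D - A                                          # weighted Laplacian

lam = np.linalg.eigvalsh(L)                        # eigenvalues in ascending order
q = int(np.sum(lam < 1e-10))                       # dimension of the null-space

# Unit vectors that are uniform on one component and zero elsewhere.
u1 = np.array([1, 1, 1, 0, 0]) / np.sqrt(3)
u2 = np.array([0, 0, 0, 1, 1]) / np.sqrt(2)

print("eigenvalues:", np.round(lam, 6))
print("null-space dimension q =", q)
print("||L u1||, ||L u2|| =", np.linalg.norm(L @ u1), np.linalg.norm(L @ u2))
```

Here the tolerance `1e-10` used to decide which eigenvalues count as zero is an arbitrary choice for this example.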

If G has q disjoint components, we define \(G_{j}, j=1,2,\ldots ,q\), to be the subgraph consisting of the vertices and edges in the jth component only. Thus, \(\mathcal{V}(G) = \bigcup_{j=1}^{q} \mathcal{V}(G_{j})\) and \(\mathcal{E}(G) = \bigcup_{j=1}^{q} \mathcal{E}(G_{j})\) (more compactly, we write \(G = \bigcup_{j=1}^{q} G_{j}\)). We also define the collection of these subgraphs as

$$ \text{\textsf{G}}=\{G_{1},G_{2},\ldots , G_{q}\}. $$

We are interested in understanding the perturbation of the invariant subspace \(\operatorname{span}(\mathbf{u}_{J})\) (the null-space) of L as new edges are established between the different disjoint components (henceforth also referred to as “clusters”) of the graph. Let the graph constructed by establishing the inter-cluster edges be \(\widetilde{G}\), with \(\widetilde{A}\), \(\widetilde{D}\), and \(\widetilde{L}\) its adjacency, degree, and Laplacian matrices respectively. Note that since \(\widetilde{G}\) is constructed by just adding edges between the subgraphs \(\{G_{j}\}_{j=1,2,\ldots ,q}\) of G, all of these subgraphs are induced subgraphs of \(\widetilde{G}\).

4.1 Computation of \(\|(\widetilde{L} - L) \mathbf{u}_{j}\|_{2}\)

For any induced subgraph \(H \subseteq \widetilde{G}\), we consider the edges that connect vertices in H to vertices not in H (inter-cluster edges). These are edges of the form \((v_{k},v_{l})\) such that \(v_{k}\in \mathcal{V}(H), v_{l}\notin \mathcal{V}(H)\). We define a few quantities involving the weights on such edges.

Definition 5

  1. 1.

    External degree of a vertex relative to a subgraph: Given a subgraph \(H\subseteq \widetilde{G}\) and a vertex \(v_{k}\in \mathcal{V}(H)\), the external degree of \(v_{k}\) relative to H in \(\widetilde{G}\) is defined as the sum of the weights on edges connecting \(v_{k}\) to vertices outside H:

    $$\begin{aligned} {\mathcal{ED}}_{H,\widetilde{G}}(v_{k}) = \sum _{\{l| v_{l} \notin \mathcal{V}(H)\}} \widetilde{A}_{kl}. \end{aligned}$$
    (45)
  2. 2.

    Coupling of a subgraph in a graph: Given an induced subgraph \(H\subseteq \widetilde{G}\), we define the coupling of H in \(\widetilde{G}\) as

    $$\begin{aligned} & {\mathcal{CP}}_{\widetilde{G}}(H) \\ &\quad = { \frac{1}{ \vert \mathcal{V}(H) \vert } \biggl( \sum_{\{k|v_{k}\in \mathcal{V}(H)\}} \bigl( {\mathcal{ED}}_{H,\widetilde{G}}(v_{k}) \bigr)^{2} + \sum_{\{l|v_{l}\notin \mathcal{V}(H)\}} \bigl( {\mathcal{ED}}_{(\widetilde{G}-H),\widetilde{G}}(v_{l}) \bigr)^{2} \biggr) } \\ &\quad = { \frac{1}{ \vert \mathcal{V}(H) \vert } \biggl( \sum_{\{k|v_{k}\in \mathcal{V}(H)\}} \biggl( \sum_{\{l|v_{l}\notin \mathcal{V}(H)\}} \widetilde{A}_{kl} \biggr)^{2} + \sum_{ \{l|v_{l}\notin \mathcal{V}(H)\}} \biggl( \sum _{\{k | v_{k}\in \mathcal{V}(H)\}} \widetilde{A}_{kl} \biggr)^{2} \biggr), } \end{aligned}$$
    (46)

    where \((\widetilde{G}-H)\) is the induced subgraph of \(\widetilde{G}\) consisting of all the vertices not in H. That is, \(\mathcal{V}(\widetilde{G}-H) = \{v\in \mathcal{V}(\widetilde{G}) | v\notin \mathcal{V}(H)\}\) and \(\mathcal{E}(\widetilde{G}-H) = \{(v,w)\in \mathcal{E}(\widetilde{G}) |v,w\notin \mathcal{V}(H)\}\).

  3. 3.

    Maximum external degree of vertices in a subgraph: Given a subgraph \(H\subseteq \widetilde{G}\), the maximum external degree of vertices in H in \(\widetilde{G}\) is defined as the maximum value of the external degrees of vertices in H relative to H in \(\widetilde{G}\):

    $$\begin{aligned} {\mathcal{MED}}_{\widetilde{G}}(H) = \max_{v\in \mathcal{V}(H)} { \mathcal{ED}}_{H,\widetilde{G}}(v) = \max_{\{k|v_{k}\in \mathcal{V}(H)\}} \sum _{\{l| v_{l}\notin \mathcal{V}(H)\}} \widetilde{A}_{kl}. \end{aligned}$$
    (47)

Note that the computation of the above quantities requires the knowledge of only the weights on edges connecting vertices in H to vertices outside H in \(\widetilde{G}\) (see Fig. 5 for an example).

Figure 5: An example graph and induced subgraph H. Weight values on the inter-cluster edges are written symbolically. In this example, \({\mathcal{ED}}_{H,\widetilde{G}}(v) = w_{1} + w_{2}\), \({ \mathcal{CP}}_{\widetilde{G}}(H) = \frac{1}{5} ( ( (w_{1} + w_{2})^{2} + w_{3}^{2} + w_{4}^{2} ) + ( w_{1}^{2} + w_{2}^{2} + (w_{3} + w_{4})^{2} ) )\), and \({\mathcal{MED}}_{ \widetilde{G}}(H) = \max (w_{1} + w_{2}, w_{3}, w_{4} )\)

In the definition of \({\mathcal{CP}}_{\widetilde{G}}\), referring to H as a cluster and considering the rest of the graph another cluster, the quantity within the innermost brackets is the sum of the weights on inter-cluster edges connected to a vertex, which is squared and summed over all the vertices that have at least one inter-cluster edge connected to it. This quantity is then divided by the number of vertices in H. Thus a large subgraph which is weakly connected to the rest of the graph will have a lower coupling value.
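The three quantities of Definition 5 translate directly into operations on the inter-cluster block of the adjacency matrix. The sketch below is illustrative only: the function names are hypothetical, and the graph is constructed so that its inter-cluster edges reproduce the symbolic formulas in the caption of Fig. 5 (the actual topology of the figure is not assumed); intra-cluster edges are omitted because they do not enter any of the three quantities.

```python
import numpy as np

def inter_block(A_t, H):
    """Rows: vertices in H; columns: vertices outside H; entries: edge weights."""
    outside = np.setdiff1d(np.arange(A_t.shape[0]), H)
    return A_t[np.ix_(H, outside)]

def external_degree(A_t, H, k):
    """ED of vertex k (assumed to be in H): total weight of its edges leaving H, as in (45)."""
    outside = np.setdiff1d(np.arange(A_t.shape[0]), H)
    return A_t[k, outside].sum()

def coupling(A_t, H):
    """CP of H, as in (46): squared row sums plus squared column sums, over |V(H)|."""
    B = inter_block(A_t, H)
    return (np.sum(B.sum(axis=1) ** 2) + np.sum(B.sum(axis=0) ** 2)) / len(H)

def max_external_degree(A_t, H):
    """MED of H, as in (47): largest external degree among vertices of H."""
    return inter_block(A_t, H).sum(axis=1).max()

# H has 5 vertices (0..4); vertices 5, 6, 7 lie outside H.
w1, w2, w3, w4 = 1.0, 2.0, 3.0, 0.5
A_t = np.zeros((8, 8))
for k, l, w in [(0, 5, w1), (0, 6, w2), (1, 7, w3), (2, 7, w4)]:
    A_t[k, l] = A_t[l, k] = w
H = np.arange(5)

print("ED(v0) =", external_degree(A_t, H, 0))     # w1 + w2
print("CP(H)  =", coupling(A_t, H))
print("MED(H) =", max_external_degree(A_t, H))    # max(w1 + w2, w3, w4)
```

Working with the inter-cluster block makes the symmetry of (46) explicit: the first sum runs over its rows and the second over its columns.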

The following lemma provides bounds on \({\mathcal{CP}}_{\widetilde{G}}(H)\) in terms of a simpler summation over the inter-cluster edge weights (or square thereof).

Lemma 8

$$\begin{aligned} \frac{2}{ \vert \mathcal{V}(H) \vert } \sum_{\{k,l | v_{k}\in \mathcal{V}(H), \atop v_{l}\notin \mathcal{V}(H)\}} \widetilde{A}_{kl}^{2} \leq { \mathcal{CP}}_{\widetilde{G}}(H) \leq \frac{2}{ \vert \mathcal{V}(H) \vert } \biggl( \sum_{ \{k,l | v_{k}\in \mathcal{V}(H), \atop v_{l}\notin \mathcal{V}(H)\}} \widetilde{A}_{kl} \biggr)^{2}. \end{aligned}$$
(48)

Proof

The proof follows directly from the fact that for a set of nonnegative numbers \(\alpha _{h}, h\in S\), \(\sum_{h\in S} \alpha _{h}^{2} \leq (\sum_{h \in S} \alpha _{h})^{2}\), applied to the inner sums in (46) for the lower bound and to the outer sums for the upper bound. □
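As a quick numerical sanity check of (48), the sketch below draws a random nonnegative inter-cluster weight block (rows indexed by vertices in H, columns by vertices outside H) and compares the coupling with the two bounds. The block size, sparsity, and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random sparse nonnegative inter-cluster weights: 4 vertices in H, 6 outside.
B = rng.uniform(0.0, 1.0, size=(4, 6)) * (rng.random((4, 6)) < 0.5)
size_H = B.shape[0]

# Coupling as in (46): squared row sums plus squared column sums, over |V(H)|.
cp = (np.sum(B.sum(axis=1) ** 2) + np.sum(B.sum(axis=0) ** 2)) / size_H

lower = 2.0 * np.sum(B ** 2) / size_H             # left-hand side of (48)
upper = 2.0 * B.sum() ** 2 / size_H               # right-hand side of (48)

print("lower:", lower, " CP:", cp, " upper:", upper)
```

The lower bound is attained when every vertex touches at most one inter-cluster edge, and the upper bound when all inter-cluster edges share a single vertex on each side, i.e. when there is a single inter-cluster edge.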

Notations and assumptions for the rest of the paper

In the rest of the paper we assume that G is a graph with q disjoint components \(\text{\textsf{G}} = \{G_{1},G_{2},\ldots ,G_{q}\}\) and \(\widetilde{G}\) is the graph obtained by establishing edges between the components (so that each \(G_{j}\) is an induced subgraph of both G and \(\widetilde{G}\)). The Laplacian matrices of the two graphs are L and \(\widetilde{L}\) respectively. Since G has q connected components, the null-space of L is q-dimensional (with corresponding eigenvalues \(\lambda _{1}=\lambda _{2}=\cdots =\lambda _{q} = 0\)), for which we choose a basis \(\{\mathbf{u}_{j}\}_{j=1,2,\ldots ,q}\) such that the distribution corresponding to \(\mathbf{u}_{j}\) is uniform and positive on the vertices in \(G_{j}\) and zero everywhere else.

A weaker version of the following lemma appears in the author’s prior work [11, 12] and expresses the quantity \(\|(\widetilde{L} - L) \mathbf{u}_{j}\|_{2}\) in terms of the weights on edges connecting vertices in \(G_{j}\) to vertices outside \(G_{j}\) in \(\widetilde{G}\).

Lemma 9

For all \(j\in \{1,2,\ldots ,q\}\),

$$\begin{aligned} \bigl\Vert (\widetilde{L} - L) \mathbf{u}_{j} \bigr\Vert _{2}^{2} &= {\mathcal{CP}}_{ \widetilde{G}}(G_{j}). \end{aligned}$$
(49)

Proof

Suppose \(v_{k}\in \mathcal{V}(G_{j}) \subseteq \mathcal{V}(G)\). Since \(\widetilde{D}_{kk}\) and \(D_{kk}\) are the degrees of the vertex \(v_{k}\) in the graphs \(\widetilde{G}\) and G respectively, they are equal iff all the neighbors of \(v_{k}\) in \(\widetilde{G}\) are in \(G_{j}\). Otherwise \(\widetilde{D}_{kk} - D_{kk}\) is the net outgoing degree of the vertex \(v_{k}\) from the subgraph \(G_{j}\). That is, if \(v_{k}\in \mathcal{V}(G_{j})\), then

$$\begin{aligned} \widetilde{D}_{kk} - D_{kk} & = \sum _{\{l| v_{l}\notin \mathcal{V}(G_{j})\}} \widetilde{A}_{kl}. \end{aligned}$$
(50)

An edge \((v_{k},v_{l})\) exists in both \(\widetilde{G}\) and G (and has the same weight, i.e., \(\widetilde{A}_{kl} = A_{kl}\)) iff \(v_{k}\) and \(v_{l}\) belong to the same subgraph \(G_{j}\). Otherwise \(A_{kl}=0\) (the edge is nonexistent in G). Thus,

$$ \widetilde{A}_{kl} - A_{kl} = \textstyle\begin{cases} \widetilde{A}_{kl},& \text{if } v_{k} \in \mathcal{V}(G_{j}), v_{l} \notin \mathcal{V}(G_{j}), \\ 0, &\text{otherwise.} \end{cases} $$
(51)

Next we consider the vector \(\mathbf{u}_{j}\) (for \(j=1,\ldots ,q\)), which by definition is nonzero and uniform only on vertices in the subgraph \(G_{j}\). Let \(u_{lj}\) be the lth element of the unit vector \(\mathbf{u}_{j}\). Since \(|\mathcal{V}(G_{j})|\) of the elements of the vector are nonzero and uniform, we have

$$ u_{lj} = \textstyle\begin{cases} \frac{1}{\sqrt{ \vert \mathcal{V}(G_{j}) \vert }},& \text{if }v_{l} \in \mathcal{V}(G_{j}), \\ 0, &\text{otherwise.} \end{cases} $$
(52)

Thus the kth element of the vector \((\widetilde{L}-L) \mathbf{u}_{j}\),

$$\begin{aligned} \bigl[(\widetilde{L}-{L}) {\mathbf{u}}_{j} \bigr]_{k} ={} & \sum_{l} (\widetilde{D}_{kl} - \widetilde{A}_{kl} - {D}_{kl} + {A}_{kl}) {u}_{lj} \\ = {}& (\widetilde{D}_{kk} - {D}_{kk}) {u}_{kj} - \sum_{l} ( \widetilde{A}_{kl} - {A}_{kl}) {u}_{lj} \\ & (\text{since }\widetilde{D}\text{ and }{D}\text{ are diagonal matrices}) \\ ={} & \left( \frac{1}{\sqrt{ \vert \mathcal{V}(G_{j}) \vert }} \textstyle\begin{cases} (\widetilde{D}_{kk} - {D}_{kk}),& \text{if }v_{k} \in \mathcal{V}(G_{j}) \\ 0,& \text{otherwise} \end{cases}\displaystyle \right) \\ & {}- \biggl( \frac{1}{\sqrt{ \vert \mathcal{V}(G_{j}) \vert }} \sum_{\{l|v_{l} \in \mathcal{V}(G_{j})\}} (\widetilde{A}_{kl} - {A}_{kl}) \biggr) \quad \text{(using (52))} \\ ={} & \frac{1}{\sqrt{ \vert \mathcal{V}(G_{j}) \vert }} \left( \textstyle\begin{cases} \sum_{\{l| v_{l}\notin \mathcal{V}(G_{j})\}} \widetilde{A}_{kl}, &\text{if }v_{k} \in \mathcal{V}(G_{j}) \\ 0, & \text{otherwise} \end{cases}\displaystyle \right. \\ & {} - \left.\sum_{\{l|v_{l} \in \mathcal{V}(G_{j})\}} \textstyle\begin{cases} \widetilde{A}_{kl},&\text{if $v_{k}\notin \mathcal{V}(G_{j})$} \\ 0, &\text{otherwise} \end{cases}\displaystyle \right) \quad \text{(using (50) and (51))} \\ ={} & \frac{1}{\sqrt{ \vert \mathcal{V}(G_{j}) \vert }} \textstyle\begin{cases} \sum_{\{l| v_{l}\notin \mathcal{V}(G_{j})\}} \widetilde{A}_{kl},& \text{if }v_{k} \in \mathcal{V}(G_{j}) \\ -\sum_{\{l|v_{l} \in \mathcal{V}(G_{j})\}} \widetilde{A}_{kl}, &\text{if }v_{k}\notin \mathcal{V}(G_{j}). \end{cases}\displaystyle \end{aligned}$$
(53)

Thus,

$$\begin{aligned} & \bigl\Vert (\widetilde{L}-{L}) {\mathbf{u}}_{j} \bigr\Vert _{2}^{2} \\ & = \frac{1}{ \vert \mathcal{V}(G_{j}) \vert } \biggl( \sum_{\{k|v_{k} \in \mathcal{V}(G_{j})\}} \biggl( \sum_{\{l| v_{l}\notin \mathcal{V}(G_{j})\}} \widetilde{A}_{kl} \biggr)^{2} + \sum_{\{k|v_{k} \notin \mathcal{V}(G_{j})\}} \biggl( \sum _{\{l|v_{l} \in \mathcal{V}(G_{j})\}} \widetilde{A}_{kl} \biggr)^{2} \biggr). \end{aligned}$$
(54)

 □
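The identity (54) is easy to check numerically on a small graph. The following NumPy sketch is only an illustration under stated assumptions: the toy graph (two unit-weight triangles with two added inter-cluster edges) and the helper names `laplacian`, `coupling_lhs` and `coupling_rhs` are hypothetical and do not come from the paper.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian D - A of a symmetric weighted adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def coupling_rhs(A_tilde, members):
    """Right-hand side of (54) for the cluster with vertex indices `members`."""
    inside = np.zeros(A_tilde.shape[0], dtype=bool)
    inside[members] = True
    # Vertices inside the cluster: total weight of their edges leaving it.
    leaving = A_tilde[np.ix_(inside, ~inside)].sum(axis=1)
    # Vertices outside the cluster: total weight of their edges entering it.
    entering = A_tilde[np.ix_(~inside, inside)].sum(axis=1)
    return (np.sum(leaving**2) + np.sum(entering**2)) / inside.sum()

def coupling_lhs(L, L_tilde, members):
    """Left-hand side of (54): squared 2-norm of (L_tilde - L) u_j, u_j as in (52)."""
    u = np.zeros(L.shape[0])
    u[members] = 1.0 / np.sqrt(len(members))
    return np.linalg.norm((L_tilde - L) @ u) ** 2

# Hypothetical toy graph: two unit-weight triangles (the graph G) ...
clusters = [[0, 1, 2], [3, 4, 5]]
A = np.zeros((6, 6))
for c in clusters:
    A[np.ix_(c, c)] = 1.0
np.fill_diagonal(A, 0.0)
# ... plus two inter-cluster edges (the graph G-tilde).
A_tilde = A.copy()
for a, b in [(2, 3), (0, 4)]:
    A_tilde[a, b] = A_tilde[b, a] = 1.0

L, L_tilde = laplacian(A), laplacian(A_tilde)
for c in clusters:
    print(c, coupling_lhs(L, L_tilde, c), coupling_rhs(A_tilde, c))
```

For this graph both sides evaluate to 4/3 for each cluster, since four vertices each carry one inter-cluster edge and each cluster has three vertices.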

Lemma 10

$$\begin{aligned} \Vert \widetilde{L} - L \Vert _{2} \leq 2 \max _{j\in \{1,\ldots ,q\}} { \mathcal{MED}}_{\widetilde{G}}(G_{j}). \end{aligned}$$
(55)

Proof

Suppose \(v_{k} \in \mathcal{V}(G_{\text{\textsf{j}}(k)})\) (where \(\text{\textsf{j}}: \{1,2,\ldots ,|\mathcal{V}(G)|\} \rightarrow \{1,2, \ldots ,q\}\) maps the index of a vertex to the index of the subgraph in G that the vertex belongs to). The sum of the elements of the kth row of \((\widetilde{A} - A)\) is

$$\begin{aligned} \sum_{l} (\widetilde{A}_{kl} - {A}_{kl}) &= \sum_{l} \textstyle\begin{cases} \widetilde{A}_{kl}, & \text{if }v_{l} \notin \mathcal{V}(G_{\text{\textsf{j}}(k)}) \\ 0, &\text{otherwise} \end{cases}\displaystyle \quad \text{(using (51))} \\ &= \sum_{\{l| v_{l}\notin \mathcal{V}(G_{ \text{\textsf{j}}(k)})\}} \widetilde{A}_{kl} \\ &= {\mathcal{ED}}_{G_{\text{\textsf{j}}(k)},\widetilde{G}} (v_{k}) \quad \text{(Definition 5)}. \end{aligned}$$
(56)

Since \((\widetilde{A} - A)\) is a symmetric matrix, its 2-norm is equal to its spectral radius \(\rho (\widetilde{A} - A)\). Furthermore, since all elements of \((\widetilde{A} - A)\) are nonnegative, using the Perron–Frobenius theorem [13], we get

$$\begin{aligned} \Vert \widetilde{A} - A \Vert _{2} ={}& \rho (\widetilde{A} - A) \leq \max_{k\in N} {\mathcal{ED}}_{G_{\text{\textsf{j}}(k)},\widetilde{G}} (v_{k}) \\ ={}& \max_{j\in \{1,\ldots ,q\}} \max_{\{k |\atop v_{k}\in \mathcal{V}(G_{j})\}} { \mathcal{ED}}_{G_{\text{\textsf{j}}(k)}, \widetilde{G}} (v_{k}) \\ & (\text{since maximizing over all vertices in }\widetilde{G}\text{ is same as } \\ &\text{maximizing over the subgraphs }G_{j}\text{ and for each} \\ &\text{subgraph maximizing over the vertices in the subgraph}) \\ ={}& \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}). \end{aligned}$$
(57)

Again, since \((\widetilde{D} - D)\) is a diagonal matrix with nonnegative diagonal elements (due to (50)), its 2-norm is the largest of its diagonal elements. That is,

$$\begin{aligned} \Vert \widetilde{D} - D \Vert _{2} ={}& \max _{k\in N} (\widetilde{D}_{kk} - {D}_{kk}) = \max_{k\in N} \sum _{\{l| v_{l} \notin \mathcal{V}(G_{\text{\textsf{j}}(k)})\}} \widetilde{A}_{kl} \quad \text{(using (50))} \\ ={}& \max_{k\in N} {\mathcal{ED}}_{G_{\text{\textsf{j}}(k)},\widetilde{G}} (v_{k}) \\ ={}& \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) \quad \text{(following similar steps as in (56) and (57)).} \end{aligned}$$
(58)

Thus,

$$\begin{aligned} \Vert \widetilde{L} - L \Vert _{2}& = \bigl\Vert ( \widetilde{D} - D) - ( \widetilde{A} - A) \bigr\Vert _{2} \leq \Vert \widetilde{D} - D \Vert _{2} + \Vert \widetilde{A} - A \Vert _{2} \\ & \leq 2 \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{ \widetilde{G}}(G_{j}). \end{aligned}$$

 □
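Lemma 10 can be sanity-checked on a randomly generated clustered graph. The sketch below is a hedged illustration: the random graph model, the seed, and the helper `max_external_degree` are assumptions made here, and the unperturbed adjacency matrix is taken to be the perturbed one with all inter-cluster entries set to zero, consistent with how (51) is used in (56).

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian D - A of a symmetric weighted adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def max_external_degree(A_tilde, labels):
    """max_j MED(G_j): largest total inter-cluster edge weight at any single vertex."""
    crosses = labels[:, None] != labels[None, :]
    return (A_tilde * crosses).sum(axis=1).max()

# Hypothetical random graph: 3 clusters of 10 vertices, dense inside, sparse across.
rng = np.random.default_rng(0)
n, q = 30, 3
labels = np.repeat(np.arange(q), n // q)
same = labels[:, None] == labels[None, :]
edge_prob = np.where(same, 0.6, 0.05)
upper = np.triu(rng.random((n, n)) < edge_prob, k=1).astype(float)
A_tilde = upper + upper.T   # perturbed graph, unit weights
A = A_tilde * same          # unperturbed graph: intra-cluster edges only

delta_norm = np.linalg.norm(laplacian(A_tilde) - laplacian(A), 2)  # spectral norm
bound = 2.0 * max_external_degree(A_tilde, labels)
print(delta_norm, bound)
```

The inequality (55) holds for any such graph, so the printed norm should never exceed the printed bound regardless of the seed.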

In the following discussion, without loss of generality, we assume that the eigenvalues of \(\widetilde{L}\) are indexed in increasing order, \(0 = \widetilde{\lambda}_{1} \leq \widetilde{\lambda}_{2} \leq \cdots \leq \widetilde{\lambda}_{n}\). The corresponding eigenvectors are \(\widetilde{\mathbf{u}}_{1},\widetilde{\mathbf{u}}_{2},\ldots , \widetilde{\mathbf{u}}_{n}\).

4.2 Bounds on null-space perturbation with known spectrum of \(\widetilde{L}\)

The following proposition gives a bound on the perturbation of the null-space of L upon introducing edges between the subgraphs in \(\text{\textsf{G}} = \{G_{1},G_{2},\ldots ,G_{q}\}\) by considering the subspace distance between the null-space of L and a specific invariant subspace of \(\widetilde{L}\).

Proposition 3

Choose \(\widetilde{J} = \{1,2,\ldots ,q\}\). Then

$$\begin{aligned} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widetilde{J}}) \bigr) & \leq \frac{1}{\widetilde{\lambda}_{q+1}} \sqrt{ \frac{1}{q} \sum _{j=1}^{q} {\mathcal{CP}}_{\widetilde{G}}(G_{j}) }. \end{aligned}$$
(59)

Proof

We first note that, due to Lemma 9, \(\sqrt{{\mathcal{CP}}_{\widetilde{G}}(G_{j})} = \|(\widetilde{L} - L) \mathbf{u}_{{j}}\|_{2}\), \(\forall j\in \{1,2,\ldots ,q\}\). The proof then follows from Proposition 1 by setting \(\kappa _{j} = 0\), \(\forall j=1,2,\ldots ,q\), and noting that \(\min_{j\in \{1,2,\ldots ,q\} , j'\in \{q+1,q+2, \ldots ,n\}} | \widetilde{\lambda}_{j'} - \lambda _{j}| = \widetilde{\lambda}_{q+1}\). □
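The bound (59) can be illustrated numerically. The sketch below rests on two assumptions that should be read as such: the graph (three 5-vertex cliques joined by three unit-weight edges) is hypothetical, and, because this section does not restate the definition of \(d_{\mathrm{sp}}\), the function `subspace_distance` uses the normalized Frobenius norm of the projection residual as a stand-in metric, which may differ from the paper's definition.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian D - A of a symmetric weighted adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def subspace_distance(U, V):
    """Assumed stand-in for d_sp: ||(I - V V^T) U||_F / sqrt(q), U and V orthonormal."""
    return np.linalg.norm(U - V @ (V.T @ U)) / np.sqrt(U.shape[1])

# Hypothetical graph: q = 3 cliques of 5 vertices, joined by three unit-weight edges.
q, size = 3, 5
n = q * size
labels = np.repeat(np.arange(q), size)
A = (labels[:, None] == labels[None, :]).astype(float)
np.fill_diagonal(A, 0.0)
A_tilde = A.copy()
for a, b in [(4, 5), (9, 10), (0, 14)]:
    A_tilde[a, b] = A_tilde[b, a] = 1.0
L, L_tilde = laplacian(A), laplacian(A_tilde)

# Columns u_1, ..., u_q as in (52): uniform on one cluster, zero elsewhere.
U = np.zeros((n, q))
U[np.arange(n), labels] = 1.0 / np.sqrt(size)

evals_tilde, evecs_tilde = np.linalg.eigh(L_tilde)  # eigenvalues in ascending order
U_tilde = evecs_tilde[:, :q]                        # eigenvectors for J-tilde = {1, ..., q}
lam_tilde_q1 = evals_tilde[q]                       # lambda-tilde_{q+1}

# Lemma 9: CP(G_j) = ||(L_tilde - L) u_j||_2^2, one value per column of U.
cp = np.sum(((L_tilde - L) @ U) ** 2, axis=0)
lhs = subspace_distance(U, U_tilde)
rhs = np.sqrt(cp.mean()) / lam_tilde_q1
print(lhs, rhs)
```

With the stand-in metric, the printed distance should not exceed the printed right-hand side of (59).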

The results of Proposition 3 can be re-interpreted by considering G to be the graph obtained by cutting \(\widetilde{G}\) into q subgraphs. We call the set of subgraphs constructed by performing the cut, \(\text{\textsf{G}} = \{G_{1},G_{2},\ldots , G_{q}\}\), a q-cut of \(\widetilde{G}\). Given a graph \(\widetilde{G}\), we consider all possible q-cuts of \(\widetilde{G}\). A q-cut \(\text{\textsf{G}} = \{G_{1},G_{2},\ldots , G_{q}\}\) results in a graph \(G = \bigcup_{j=1}^{q} G_{j}\) with q disjoint components. The following corollary is then a direct consequence of the proposition.

Corollary 6

Given a graph \(\widetilde{G}\) (with Laplacian \(\widetilde{L}\) with eigenvalues \(0 = \widetilde{\lambda}_{1} \leq \cdots \leq \widetilde{\lambda}_{n}\) and corresponding eigenvectors \(\widetilde{\mathbf{u}}_{1},\ldots ,\widetilde{\mathbf{u}}_{n}\)), let \(\mathscr{G}\) be the set of all q-cuts of \(\widetilde{G}\). We consider a q-cut such that the sum of the couplings of the resultant q subgraphs in \(\widetilde{G}\) is minimum. That is,

$$ \textit{\textsf{G}}^{*} \in {\arg \min}_{\textit{\textsf{G}} \in \mathscr{G}} \sum_{G' \in \textit{\textsf{G}}} {\mathcal{CP}}_{\widetilde{G}} \bigl(G' \bigr). $$
(60)

Let the corresponding graph \(G^{*} = \bigcup_{G'\in \textit{\textsf{G}}^{*}} G'\) have eigenvalues \(0 = \lambda _{1}^{*} = \lambda _{2}^{*} = \cdots = \lambda _{q}^{*} \leq \lambda _{q+1}^{*} \leq \cdots \leq \lambda _{n}^{*}\) and corresponding eigenvectors \(\mathbf{u}^{*}_{1}, \mathbf{u}^{*}_{2}, \ldots , \mathbf{u}^{*}_{n}\). Then

$$\begin{aligned} d_{\mathrm{{sp}}} \bigl( \operatorname{span} \bigl(\{\widetilde{ \mathbf{u}}_{1}, \ldots ,\widetilde{\mathbf{u}}_{q}\} \bigr) , \operatorname{span} \bigl( \bigl\{ \mathbf{u}^{*}_{1}, \ldots ,\mathbf{u}^{*}_{q} \bigr\} \bigr) \bigr) & \leq \frac{1}{\widetilde{\lambda}_{q+1}} \sqrt{ \frac{1}{q} \sum _{G' \in \textit{\textsf{G}}^{*}} {\mathcal{CP}}_{\widetilde{G}} \bigl(G' \bigr) }. \end{aligned}$$
(61)

The interpretation of the above corollary is that the “best” q-cut of a graph \(\widetilde{G}\) (minimizing total inter-cluster coupling, as defined by (60)) results in a graph such that the distance between the null-space of the cut graph’s Laplacian and the space spanned by the first q eigenvectors of the Laplacian of \(\widetilde{G}\) is bounded above by a quantity proportional to the square root of the total inter-cluster coupling (which was minimized in the first place).
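For a very small graph, the minimization in (60) can be carried out by exhaustive search. The sketch below is an illustration only: the graph is hypothetical, the function names are ours, each coupling is evaluated through the expression in (54), and the search does not check that every part is connected, which a faithful q-cut may require.

```python
import itertools
import numpy as np

def total_coupling(A_tilde, labels, q):
    """Objective of (60): sum over the q parts of the coupling, evaluated via (54)."""
    total = 0.0
    for j in range(q):
        inside = labels == j
        leaving = A_tilde[np.ix_(inside, ~inside)].sum(axis=1)
        entering = A_tilde[np.ix_(~inside, inside)].sum(axis=1)
        total += (np.sum(leaving**2) + np.sum(entering**2)) / inside.sum()
    return total

def best_q_cut(A_tilde, q):
    """Exhaustive search over all labelings with q non-empty parts (tiny graphs only)."""
    n = A_tilde.shape[0]
    best_labels, best_value = None, np.inf
    for assignment in itertools.product(range(q), repeat=n):
        if len(set(assignment)) < q:
            continue  # at least one part is empty
        labels = np.array(assignment)
        value = total_coupling(A_tilde, labels, q)
        if value < best_value - 1e-12:
            best_labels, best_value = labels, value
    return best_labels, best_value

# Hypothetical graph: two unit-weight triangles joined by the single edge (2, 3).
A_tilde = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A_tilde[a, b] = A_tilde[b, a] = 1.0

labels, value = best_q_cut(A_tilde, q=2)
print(labels, value)
```

On this graph the search separates the two triangles by cutting the bridging edge, with a total coupling of 4/3. The cost grows as \(q^{n}\), so the sketch is not a practical clustering method.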

4.3 Bounds on null-space perturbation with known spectrum of L

Proposition 4

If \(\max_{j \in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) < \frac{\lambda _{q+1}}{4}\), then

$$\begin{aligned} d_{\mathrm{{sp}}} \bigl( \operatorname{span}(\mathbf{u}_{J}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\widehat{J}}) \bigr) & \leq \frac{ \sqrt{ \frac{1}{q} \sum_{j=1}^{q} {\mathcal{CP}}_{\widetilde{G}}(G_{j}) } }{ \lambda _{q+1} - 2 \max_{k\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{k}) } \\ & \leq \frac{ 2 \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) }{ \lambda _{q+1} - 2 \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) }, \end{aligned}$$
(62)

where \(\widehat{J} = \{1,2,\ldots ,q\} = \{j' | \min_{j\in N} | \widetilde{\lambda}_{j'} - \lambda _{j}| = \widetilde{\lambda}_{j'} \} \).

Proof

Recall that the eigenvalues of the Laplacian L of G are \((0=)\lambda _{1}=\lambda _{2}=\cdots =\lambda _{q} \leq \lambda _{q+1} \leq \cdots \leq \lambda _{n}\). Let \(J = \{1,2,\ldots ,q\}\) so that \({J^{c}} = \{q+1,q+2,\ldots ,n\}\) and \(\operatorname{sep}(\lambda _{J},\lambda _{J^{c}}) = \lambda _{q+1}\).

Using Lemma 10, we have

$$\begin{aligned} \Vert \widetilde{L} - L \Vert _{2} \leq 2 \max _{j\in \{1,\ldots ,q\}} { \mathcal{MED}}_{\widetilde{G}}(G_{j}) < \frac{\lambda _{q+1}}{2} = \frac{\operatorname{sep}(\lambda _{J},\lambda _{J^{c}})}{2}. \end{aligned}$$
(63)

Thus the conditions for Lemma 7 and Proposition 2 hold, and \(\widetilde{L}\) is a separation-preserving perturbation of L. Hence, by Lemma 7, there exists a separation-preserving partition \(\{\widetilde{\lambda}_{\widehat{J}}, \widetilde{\lambda}_{ \widehat{{J^{c}}}}\}\) of \(\widetilde{\lambda}_{N}\) such that

$$ \widehat{J} = \Bigl\{ j' \big| \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert = \min _{j\in J} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert = \widetilde{\lambda}_{j'} \Bigr\} \quad (\text{since }\lambda _{j} = 0, \forall j\in J). $$

Thus, for any \(j'\in \widehat{J}\),

$$\begin{aligned} \widetilde{\lambda}_{j'} &= \min_{j\in N} \vert \widetilde{\lambda}_{j'} - \lambda _{j} \vert \leq \Vert \widetilde{L} - L \Vert _{2} \quad \text{(due to Corollary 3)} \\ & \leq \frac{\lambda _{q+1}}{2} \quad \text{(from (63)).} \end{aligned}$$
(64)

This implies that the elements of \(\widetilde{\lambda}_{\widehat{J}}\) are closer to \(0 (=\lambda _{1}=\lambda _{2}=\cdots =\lambda _{q})\) than they are to \(\lambda _{q+1}\). Since \(\widehat{J}\) has q elements (due to Lemma 7.2) and is a unique set (by definition), \(\widetilde{\lambda}_{\widehat{J}} = \{\widetilde{\lambda}_{1}, \widetilde{\lambda}_{2}, \ldots , \widetilde{\lambda}_{q}\}\) is the set consisting of the lowest q eigenvalues of \(\widetilde{L}\). Thus, \(\widehat{J} = \{1,2,\ldots ,q\}\).

Since we showed that \(\|\widetilde{L} - L\|_{2} \leq \frac{1}{2} \operatorname{sep}(\lambda _{J}, \lambda _{J^{c}})\), as a direct consequence of Proposition 2, we have the following:

$$\begin{aligned} & \bigl( d_{\mathrm{{sp}}} \bigl( \operatorname{span}( \mathbf{u}_{J}) , \operatorname{span}(\widetilde{ \mathbf{u}}_{\widehat{J}}) \bigr) \bigr)^{2} \\ &\quad \leq \frac{1}{q} \sum_{j\in J} \biggl( \frac{ \Vert (\widetilde{L} - L) \mathbf{u}_{{j}} \Vert _{2} }{ \min_{k\in {J^{c}}} \vert {\lambda}_{k} - \lambda _{j} \vert - \Vert \widetilde{L} - L \Vert _{2} } \biggr)^{ 2} \\ & \quad \leq \frac{ \frac{1}{q} \sum_{j\in J} {\mathcal{CP}}_{\widetilde{G}}(G_{j}) }{ ( \lambda _{q+1} - 2 \max_{k\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{k}) )^{ 2} } \\ & \qquad \Bigl(\text{using Lemma 9 and Lemma 10 and the} \\ & \qquad \text{fact that } \min_{k\in {J^{c}}} \vert { \lambda}_{k} - \lambda _{j} \vert = \lambda _{q+1}, \forall j\in \{1,2,\ldots ,q\} \Bigr) \\ & \quad \leq \biggl( \frac{ 2 \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) }{ \lambda _{q+1} - 2 \max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) } \biggr)^{ 2} \\ & \qquad \biggl(\text{using Lemma 9 and Lemma 10, } \sum_{j\in J} { \mathcal{CP}}_{\widetilde{G}}(G_{j}) = \sum _{j\in J} \bigl\Vert (\widetilde{L} - L) \mathbf{u}_{{j}} \bigr\Vert _{2}^{2} \\ & \qquad \leq q \Vert \widetilde{L} - L \Vert _{2}^{2} \leq q \Bigl(2 \max_{j\in \{1, \ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) \Bigr)^{2} \biggr). \end{aligned}$$

 □
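Both bounds in (62) depend only on the unperturbed spectrum and on the inter-cluster edges, so they can be evaluated without diagonalizing \(\widetilde{L}\). The sketch below illustrates this on a hypothetical graph (three 8-vertex cliques joined in a chain by two unit-weight edges). As an assumption, because \(d_{\mathrm{sp}}\) is not restated in this section, the distance is computed as the normalized Frobenius norm of the projection residual, which may differ from the paper's definition.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian D - A of a symmetric weighted adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

# Hypothetical graph: q = 3 cliques of 8 vertices, chained by two unit-weight edges.
q, size = 3, 8
n = q * size
labels = np.repeat(np.arange(q), size)
same = labels[:, None] == labels[None, :]
A = same.astype(float)
np.fill_diagonal(A, 0.0)
A_tilde = A.copy()
for a, b in [(7, 8), (15, 16)]:
    A_tilde[a, b] = A_tilde[b, a] = 1.0
L, L_tilde = laplacian(A), laplacian(A_tilde)

lam_q1 = np.linalg.eigvalsh(L)[q]              # lambda_{q+1} of the unperturbed L
med = (A_tilde * ~same).sum(axis=1).max()      # max_j MED(G_j)
assert med < lam_q1 / 4                        # hypothesis of Proposition 4

U = np.zeros((n, q))
U[np.arange(n), labels] = 1.0 / np.sqrt(size)  # u_j as in (52)
cp = np.sum(((L_tilde - L) @ U) ** 2, axis=0)  # CP(G_j) via Lemma 9

bound_cp = np.sqrt(cp.mean()) / (lam_q1 - 2 * med)  # first bound in (62)
bound_med = 2 * med / (lam_q1 - 2 * med)            # second bound in (62)

# Only needed to evaluate the left-hand side: eigenvectors for J-hat = {1, ..., q}.
U_tilde = np.linalg.eigh(L_tilde)[1][:, :q]
# Assumed stand-in for d_sp: normalized Frobenius norm of the projection residual.
dist = np.linalg.norm(U - U_tilde @ (U_tilde.T @ U)) / np.sqrt(q)
print(dist, bound_cp, bound_med)
```

Here \(\lambda _{q+1} = 8\) and the maximum external degree is 1, so the hypothesis of the proposition is satisfied and the second bound equals 1/3.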

4.4 Example

As an illustration, we consider the graph G shown in Fig. 6 with 12 disjoint components, thus \(q=12\). The graph is generated with \(n = 333\) vertices clustered into 12 components in a randomized manner with only intra-cluster edges. The weight on every edge is chosen to be 1. Figure 6(a) shows an immersion of the graph in \(\mathbb{R}^{2}\) just for the purpose of visualization (the exact coordinates of the vertices have no significance).

Figure 6

Graph G (immersed in \(\mathbb{R}^{2}\) for visualization) and its spectrum. Each individual cluster in the graph is \(G_{j}\), \(j=1,2,\ldots ,12\)

We then construct \(\widetilde{G}\) by establishing randomized edges between the components of G. The weight on every inter-cluster edge is also chosen to be 1. Figure 7(a) shows the immersion of the resultant graph.

Figure 7

The graph \(\widetilde{G}\), the spectrum of its Laplacian, and its first 12 eigenvectors (c–n) visualized as distributions over the vertices (red is positive, blue is negative)

Direct computation reveals that for these graphs, \(\widetilde{\lambda}_{q+1} = 18.436\) and \(\frac{1}{q} \sum_{j=1}^{q} {\mathcal{CP}}_{\widetilde{G}}(G_{j}) = 0.5417\). The L.H.S. of (59) is \(d_{\mathrm{{sp}}} ( \operatorname{span}(\mathbf{u}_{\{1,2, \ldots ,12\}}) , \operatorname{span}(\widetilde{\mathbf{u}}_{\{1,2, \ldots ,12\}}) ) = 2.516\times 10^{-2}\), while the R.H.S. is \(\frac{ \sqrt{\frac{1}{q} \sum_{j=1}^{q} {\mathcal{CP}}_{\widetilde{G}}(G_{j})} }{ \widetilde{\lambda}_{q+1} } = 3.992\times 10^{-2}\), thus validating the result of Proposition 3.
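The right-hand side of (59) can be reproduced from the two reported quantities alone. The following lines are a minimal arithmetic check; the variable names are ours, and the inputs are the rounded values quoted above, so agreement is only expected to the displayed precision.

```python
import math

lam_tilde_q1 = 18.436   # reported value of lambda-tilde_{q+1}
mean_coupling = 0.5417  # reported value of (1/q) * sum_j CP(G_j)
lhs_reported = 2.516e-2 # reported left-hand side of (59)

# Right-hand side of (59): sqrt of the mean coupling divided by lambda-tilde_{q+1}.
rhs = math.sqrt(mean_coupling) / lam_tilde_q1
print(f"{rhs:.3e}", lhs_reported <= rhs)
```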

Again, \(\max_{j\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{j}) = 3\) and \(\frac{\lambda _{q+1}}{4} = 4.6091\), thus satisfying the condition for Proposition 4. The R.H.S. in (62) is \(\frac{ \sqrt{ \frac{1}{q} \sum_{j=1}^{q} {\mathcal{CP}}_{\widetilde{G}}(G_{j}) } }{ \lambda _{q+1} - 2 \max_{k\in \{1,\ldots ,q\}} {\mathcal{MED}}_{\widetilde{G}}(G_{k}) } = 6.036 \times 10^{-2}\), thus validating the result of the proposition.

Since the chosen basis \(\{\mathbf{u}_{j}\}_{j=1,2,\ldots ,q}\) for the null-space of L consists of distributions such that \(\mathbf{u}_{j}\) is uniform and positive over vertices of \(G_{j}\) and zero everywhere else, this basis is not ideal for a visual comparison with \(\{\widetilde{\mathbf{u}}_{j}\}_{j=1,2,\ldots ,q}\). For a visual comparison between \(\operatorname{span}(\mathbf{u}_{\{1,2,\ldots ,q\}})\) and \(\operatorname{span}(\widetilde{\mathbf{u}}_{\{1,2,\ldots ,q\}})\), we choose a basis for the null-space of L that is closest to \(\{\widetilde{\mathbf{u}}_{j}\}_{j=1,2,\ldots ,q}\): define the \(q\times q\) matrix \(R = ([\mathbf{u}_{1}, \mathbf{u}_{2}, \ldots , \mathbf{u}_{q}] )^{+} [\widetilde{\mathbf{u}}_{1}, \widetilde{\mathbf{u}}_{2}, \ldots , \widetilde{\mathbf{u}}_{q}]\), where \((\cdot )^{+}\) indicates the Moore–Penrose pseudoinverse. We need to choose a unitary matrix that is close to R. This is given by taking the SVD \(R = V \Sigma W^{{\dagger }}\) and defining \(R' = V W^{{\dagger }}\). Then a basis for \(\operatorname{span}(\mathbf{u}_{\{1,2,\ldots ,q\}})\) is defined by the columns of \([\mathbf{u}_{1}, \mathbf{u}_{2}, \ldots , \mathbf{u}_{q}] R' =: [ \mathbf{u}'_{1}, \mathbf{u}'_{2}, \ldots , \mathbf{u}'_{q}]\). Figure 8 shows these vectors as distributions over the vertices of G.

Figure 8

The basis \(\{\mathbf{u}'_{j}\}_{j=1,2,\ldots ,q}\) of the null-space of L visualized as distributions over the vertices (red is positive, blue is negative). Compare this with Figs. 7(c–n)
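The construction of the aligned basis \(\{\mathbf{u}'_{j}\}\) from the pseudoinverse and the SVD can be sketched in a few lines of NumPy. This is an illustration under assumptions: the function name `align_basis` and the random test subspaces are ours, and real-valued orthonormal bases are assumed, so that the conjugate transpose reduces to the transpose.

```python
import numpy as np

def align_basis(U, U_tilde):
    """Rotate the orthonormal basis U within its own span to best match U_tilde."""
    R = np.linalg.pinv(U) @ U_tilde   # q x q matrix R = U^+ U_tilde
    V, _, W_h = np.linalg.svd(R)      # R = V diag(sigma) W_h, with W_h = W^dagger
    R_prime = V @ W_h                 # unitary matrix R' = V W^dagger
    return U @ R_prime, R_prime

# Hypothetical data: a random 3-dimensional subspace of R^10 and a small perturbation.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((10, 3)))
U_tilde, _ = np.linalg.qr(U + 0.05 * rng.standard_normal((10, 3)))

U_prime, R_prime = align_basis(U, U_tilde)
# Frobenius distance to U_tilde before and after the alignment.
print(np.linalg.norm(U - U_tilde), np.linalg.norm(U_prime - U_tilde))
```

Because \(R'\) is unitary, the columns of `U_prime` span the same subspace as those of `U`; only the choice of basis within that subspace changes. For orthonormal `U`, this choice of \(R'\) is the solution of the orthogonal Procrustes problem, so the distance to `U_tilde` cannot increase.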

Availability of data and materials

Not applicable.

Notes

  1. A preprint version of this article is posted on the arXiv preprint repository and can be accessed at https://arxiv.org/abs/2103.09413 [5].

References

  1. Davis, C., Kahan, W.M.: The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7(1), 1–46 (1970). https://doi.org/10.1137/0707001

  2. Bhatia, R.: Matrix Analysis. Graduate Texts in Mathematics. Springer, Berlin (1996)

  3. Stewart, G.W.: Error and perturbation bounds for subspaces associated with certain eigenvalue problems. SIAM Rev. 15(4), 727–764 (1973)

  4. Stewart, G.W., Sun, J.-g.: Matrix Perturbation Theory. Academic Press, San Diego (1990)

  5. Bhattacharya, S.: On some bounds on the perturbation of invariant subspaces of normal matrices with application to a graph connection problem (2021). Preprint at arXiv:2103.09413

  6. Andruchow, E.: Operators which are the difference of two projections. J. Math. Anal. Appl. 420(2), 1634–1653 (2014). https://doi.org/10.1016/j.jmaa.2014.06.022

  7. Golub, G.H., Van Loan, C.F.: Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore (1996)

  8. Damle, A., Sun, Y.: Uniform bounds for invariant subspace perturbations (2020). Preprint at arXiv:1905.07865

  9. Bauer, F.L., Fike, C.T.: Norms and exclusion theorems. Numer. Math. 2(1), 137–144 (1960)

  10. Godsil, C., Royle, G.F.: Algebraic Graph Theory. Graduate Texts in Mathematics. Springer, Berlin (2001)

  11. Zhang, L., Sadler, B.M., Blum, R.S., Bhattacharya, S.: Inter-cluster transmission control using graph modal barriers (2020). Preprint at arXiv:2010.04790 [cs.RO]

  12. Zhang, L., Sadler, B.M., Blum, R.S., Bhattacharya, S.: Inter-cluster transmission control using graph modal barriers. In: IEEE Transactions on Signal and Information Processing over Networks, pp. 1–1 (2021). https://doi.org/10.1109/TSIPN.2021.3071219

  13. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. Society for Industrial and Applied Mathematics, Philadelphia (1994). https://doi.org/10.1137/1.9781611971262

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information


Contributions

SB is the sole author of this paper. The author read and approved the final manuscript.

Corresponding author

Correspondence to Subhrajit Bhattacharya.

Ethics declarations

Competing interests

The author declares that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Bhattacharya, S. On some bounds on the perturbation of invariant subspaces of normal matrices with application to a graph connection problem. J Inequal Appl 2022, 75 (2022). https://doi.org/10.1186/s13660-022-02809-w


  • DOI: https://doi.org/10.1186/s13660-022-02809-w
