The alternate direction iterative methods for generalized saddle point systems

Abstract

This paper uses two splittings of the generalized saddle point matrix to derive two alternate direction iterative schemes for generalized saddle point systems. Convergence results are established for both schemes, and a numerical example is given to show that the proposed alternate direction iterative methods are more effective and efficient than the existing one.

Introduction

Consider the generalized saddle-point problem

$$ \mathscr{A}z:= \begin{bmatrix} A & B^{T} \\ -B & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}= \begin{bmatrix} f \\ g \end{bmatrix}:=b, $$
(1)

where \(A\in R^{n\times n}\), \(B\in R^{m\times n}\), \(f\in R^{n}\), \(g\in R^{m}\) and \(m\leq n\).

This class of linear systems arises in many scientific and engineering applications such as a mixed finite element approximation of elliptic partial differential equations, optimization, optimal control, structural analysis and electrical networks; see [1,2,3,4,5,6,7,8,9,10,11].

Recently, Benzi et al. [12, 13] studied the linear systems of the form (1) whose coefficient matrix

$$ \mathscr{A}= \begin{bmatrix} A & B^{T} \\ -B & 0 \end{bmatrix}= \begin{bmatrix} A_{1} & 0 & B_{1}^{T} \\ 0 & A_{2} & B_{2}^{T} \\ -B_{1} & -B_{2} & 0 \end{bmatrix} $$
(2)

satisfies all of the following assumptions:

  • \(A=\begin{bmatrix} A_{1} & 0 \\ 0 & A_{2} \end{bmatrix}\), \(B=[B_{1}, B_{2}]\), \(A_{i}\in R^{n_{i}\times n_{i}}\) for \(i=1,2\) and \(n_{1}+n_{2}=n\), and \(B_{i}\in R^{m\times n_{i}}\) for \(i=1,2\);

  • \(A_{i}\) is positive definite (i.e., it has positive definite symmetric part \(H_{i}=(A_{i}+A^{T}_{i})/{2}\)) for \(i=1,2\);

  • \(\operatorname{rank}(B)=m\).

They [12] split the coefficient matrix \(\mathscr{A}\) as

$$ \mathscr{A}=\mathscr{A}_{1}+\mathscr{A}_{2}, $$
(3)

where

$$ \mathscr{A}_{1}= \begin{bmatrix} A_{1} & 0 & B_{1}^{T} \\ 0 & 0 & 0 \\ -B_{1} & 0 & 0 \end{bmatrix},\quad \text{and} \quad \mathscr{A}_{2}= \begin{bmatrix} 0 & 0 & 0 \\ 0 & A_{2} & B_{2}^{T} \\ 0 & -B_{2} & 0 \end{bmatrix}, $$
(4)

which is called dimensional splitting of \(\mathscr{A}\), and proposed the following alternate direction iterative method:

$$ \textstyle\begin{cases} (\alpha I+\mathscr{A}_{1})x^{(k+\frac{1}{2})}=(\alpha I-\mathscr{A} _{2})x^{(k)}+b, \\ (\alpha I+\mathscr{A}_{2})x^{(k+1)}=(\alpha I-\mathscr{A}_{1})x^{(k+ \frac{1}{2})}+b, \end{cases} $$
(5)

which was proved to converge unconditionally for any \(\alpha >0\). Meanwhile, based on dimensional splitting of \(\mathscr{A}\), they [12, 13] proposed the dimensional splitting preconditioner for linear system (1), and applied a Krylov subspace method like restarted GMRES to the preconditioned linear system, and hence established some good results.
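As an illustration, the block matrices in (2) and (4) can be assembled numerically. The following is a minimal NumPy sketch; the function name `dimensional_splitting` and the small test blocks are hypothetical choices made here for illustration and are not taken from [12].

```python
import numpy as np

def dimensional_splitting(A1, A2, B1, B2):
    """Assemble the two summands of (4); their sum is the matrix in (2)."""
    n1, n2, m = A1.shape[0], A2.shape[0], B1.shape[0]
    Z = lambda r, c: np.zeros((r, c))
    S1 = np.block([[A1,        Z(n1, n2), B1.T    ],
                   [Z(n2, n1), Z(n2, n2), Z(n2, m)],
                   [-B1,       Z(m, n2),  Z(m, m) ]])
    S2 = np.block([[Z(n1, n1), Z(n1, n2), Z(n1, m)],
                   [Z(n2, n1), A2,        B2.T    ],
                   [Z(m, n1),  -B2,       Z(m, m) ]])
    return S1, S2

# Hypothetical data with n1 = n2 = 2 and m = 1.
A1 = np.array([[2.0, -1.0], [1.0, 2.0]])
A2 = np.array([[3.0, 0.5], [-0.5, 3.0]])
B1 = np.array([[1.0, 0.0]])
B2 = np.array([[0.0, 1.0]])
S1, S2 = dimensional_splitting(A1, A2, B1, B2)
A_full = S1 + S2  # coefficient matrix of (1)
```

Each summand contains only one of the diagonal blocks \(A_{i}\) together with its coupling blocks \(\pm B_{i}\), which is what makes the two half-steps of (5) cheaper than a solve with the full matrix.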

In this paper, we propose two types of alternate direction iterative methods. In the first, based on the dimensional splitting (3), the scalar matrix αI is replaced by two nonnegative diagonal matrices \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) to form a new alternate direction iterative scheme. The second is to propose a new splitting of \(\mathscr{A}\), i.e.,

$$ \mathscr{A}=\mathscr{B}_{1}+\mathscr{B}_{2}, $$
(6)

where

$$ \mathscr{B}_{1}= \begin{bmatrix} A_{1} & 0 & 0 \\ 0 & 0 & B_{2}^{T} \\ 0 & -B_{2} & 0 \end{bmatrix}\quad \mbox{and}\quad \mathscr{B}_{2}= \begin{bmatrix} 0 & 0 & B_{1}^{T} \\ 0 & A_{2} & 0 \\ -B_{1} & 0 & 0 \end{bmatrix}, $$
(7)

and apply the two nonnegative diagonal matrices \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) to the new splitting such that another new alternate direction iterative scheme is obtained. Then some convergence results are established for the two alternate direction iterative schemes and a numerical example is given to show that the proposed ADI methods are much more effective and efficient than the existing one.
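The difference between (4) and (7) is only where the coupling blocks are placed. A minimal NumPy sketch of the new splitting follows; the function name `new_splitting` and the sample blocks are hypothetical, chosen here only to check that both summands add up to the matrix in (2).

```python
import numpy as np

def new_splitting(A1, A2, B1, B2):
    """Assemble the two summands of (7): A1 is paired with B2, and A2 with B1."""
    n1, n2, m = A1.shape[0], A2.shape[0], B1.shape[0]
    Z = lambda r, c: np.zeros((r, c))
    T1 = np.block([[A1,        Z(n1, n2), Z(n1, m)],
                   [Z(n2, n1), Z(n2, n2), B2.T    ],
                   [Z(m, n1),  -B2,       Z(m, m) ]])
    T2 = np.block([[Z(n1, n1), Z(n1, n2), B1.T    ],
                   [Z(n2, n1), A2,        Z(n2, m)],
                   [-B1,       Z(m, n2),  Z(m, m) ]])
    return T1, T2

# Hypothetical data with n1 = n2 = 2 and m = 1.
A1 = np.array([[2.0, -1.0], [1.0, 2.0]])
A2 = np.array([[3.0, 0.5], [-0.5, 3.0]])
B1 = np.array([[1.0, 0.0]])
B2 = np.array([[0.0, 1.0]])
T1, T2 = new_splitting(A1, A2, B1, B2)
A_full = T1 + T2  # same coefficient matrix as in (2)
```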

The paper is organized as follows. Two alternate direction iterative schemes are proposed in Sect. 2. The main convergence results of these two schemes are given in Sect. 3. In Sect. 4, a numerical example is presented to demonstrate the effectiveness and efficiency of the proposed methods. A conclusion is given in Sect. 5.

The ADI methods

In this section, two alternate direction iterative schemes are proposed based on the previous two splittings (3) and (6). Let

$$ \mathscr{D}_{1}= \begin{bmatrix} 0 & 0 & 0 \\ 0 & \alpha I_{n_{2}} & 0 \\ 0 & 0 & \frac{\alpha }{2}I_{m} \end{bmatrix} \quad \mbox{and}\quad \mathscr{D}_{2}= \begin{bmatrix} \alpha I_{n_{1}} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \frac{\alpha }{2}I_{m} \end{bmatrix}, $$
(8)

where \(\alpha >0\) and \(I_{m}\) is the \(m\times m\) identity matrix. Then for the two splittings (3) and (6) one has

$$ \begin{aligned}[b] \mathscr{A}&=( \mathscr{D}_{1}+\mathscr{A}_{1})-(\mathscr{D}_{1}- \mathscr{A}_{2}) \\ &=(\mathscr{D}_{2}+\mathscr{A}_{2})-( \mathscr{D}_{2}-\mathscr{A}_{1}) \\ &=(\mathscr{D}_{1}+\mathscr{B}_{1})-( \mathscr{D}_{1}-\mathscr{B}_{2}) \\ &=(\mathscr{D}_{2}+\mathscr{B}_{2})-( \mathscr{D}_{2}-\mathscr{B}_{1}), \end{aligned} $$
(9)

which form the following two alternate direction iterative schemes.

Given an initial guess \(x^{(0)}\), for \(k=0,1,2,\ldots \), until \(\{x^{(k)}\}\) converges, compute

$$\begin{aligned}& \textstyle\begin{cases} (\mathscr{D}_{1}+\mathscr{A}_{1})x^{(k+\frac{1}{2})}=(\mathscr{D}_{1}- \mathscr{A}_{2})x^{(k)}+b, \\ (\mathscr{D}_{2}+\mathscr{A}_{2})x^{(k+1)}=(\mathscr{D}_{2}- \mathscr{A}_{1})x^{(k+\frac{1}{2})}+b, \end{cases}\displaystyle \quad \mbox{and}\quad \end{aligned}$$
(10)
$$\begin{aligned}& \textstyle\begin{cases} (\mathscr{D}_{1}+\mathscr{B}_{1})x^{(k+\frac{1}{2})}=(\mathscr{D}_{1}- \mathscr{B}_{2})x^{(k)}+b, \\ (\mathscr{D}_{2}+\mathscr{B}_{2})x^{(k+1)}=(\mathscr{D}_{2}- \mathscr{B}_{1})x^{(k+\frac{1}{2})}+b, \end{cases}\displaystyle \end{aligned}$$
(11)

where \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) are defined in (8).
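The two half-steps can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' Matlab code: the function name `adi_iterate`, the tolerance, the dense solver `np.linalg.solve` and the scalar test blocks (\(A_{1}=2\), \(A_{2}=3\), \(B_{1}=B_{2}=1\)) are all assumptions made here. Passing the summands of (4) gives scheme (10); passing those of (7) gives scheme (11).

```python
import numpy as np

def adi_iterate(S1, S2, D1, D2, b, tol=1e-10, max_iter=1000):
    """Solve (S1 + S2) x = b by the two half-steps
       (D1 + S1) x_half = (D1 - S2) x      + b,
       (D2 + S2) x_new  = (D2 - S1) x_half + b.
    Returns the last iterate and the number of iterations used."""
    x = np.zeros_like(b, dtype=float)
    for k in range(max_iter):
        x_half = np.linalg.solve(D1 + S1, (D1 - S2) @ x + b)
        x_new = np.linalg.solve(D2 + S2, (D2 - S1) @ x_half + b)
        # Stop when the relative change between iterates is small.
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x)):
            return x_new, k + 1
        x = x_new
    return x, max_iter

# Hypothetical scalar blocks (n1 = n2 = m = 1), with alpha = 2.
alpha = 2.0
S1 = np.array([[2.0, 0, 1], [0, 0, 0], [-1, 0, 0]])  # first summand of (4)
S2 = np.array([[0.0, 0, 0], [0, 3, 1], [0, -1, 0]])  # second summand of (4)
D1 = np.diag([0.0, alpha, alpha / 2])                # D_1 of (8)
D2 = np.diag([alpha, 0.0, alpha / 2])                # D_2 of (8)
b = np.ones(3)
x, iters = adi_iterate(S1, S2, D1, D2, b)
residual = np.linalg.norm((S1 + S2) @ x - b)
```

For large sparse problems one would presumably replace the dense solves by factorizations of \(\mathscr{D}_{1}+\mathscr{A}_{1}\) and \(\mathscr{D}_{2}+\mathscr{A}_{2}\) computed once and reused in every iteration.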

Eliminating \(x^{(k+\frac{1}{2})}\) in iterations (10) and (11), we obtain the stationary schemes

$$\begin{aligned}& x^{(k+1)}=\mathscr{L}x^{(k)}+f, \quad k=0,1,2,\ldots , \quad \mbox{and} \end{aligned}$$
(12)
$$\begin{aligned}& x^{(k+1)}=\mathscr{T}x^{(k)}+g,\quad k=0,1,2, \ldots , \end{aligned}$$
(13)

where

$$ \mathscr{L}=(\mathscr{D}_{2}+\mathscr{A}_{2})^{-1}( \mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}(\mathscr{D} _{1}- \mathscr{A}_{2}) $$
(14)

and

$$ \mathscr{T}=(\mathscr{D}_{2}+\mathscr{B}_{2})^{-1}( \mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}+ \mathscr{B}_{1})^{-1}(\mathscr{D} _{1}- \mathscr{B}_{2}) $$
(15)

are the iteration matrices of the ADI iterations (12) and (13), respectively. It is easy to see that (14) and (15), respectively, are similar to the matrices

$$ \hat{\mathscr{L}}=(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{A}_{2}) (\mathscr{D} _{2}+\mathscr{A}_{2})^{-1} $$
(16)

and

$$ \hat{\mathscr{T}}=(\mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}+ \mathscr{B}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{B}_{2}) (\mathscr{D} _{2}+\mathscr{B}_{2})^{-1}. $$
(17)

As is shown in [8], the iteration matrix \(\mathscr{L}\) is induced by the unique splitting \(\mathscr{A}=\mathscr{P}-\mathscr{Q}\) with \(\mathscr{P}\) nonsingular, i.e., \(\mathscr{L}=\mathscr{P}^{-1} \mathscr{Q}=I-\mathscr{P}^{-1}\mathscr{A}\). Furthermore, \(f= \mathscr{P}^{-1}b\). The matrices \(\mathscr{P}\) and \(\mathscr{Q}\) are given by

$$ \mathscr{P}=\frac{1}{\alpha }(\mathscr{D}_{1}+ \mathscr{A}_{1}) ( \mathscr{D}_{2}+\mathscr{A}_{2}), \qquad \mathscr{Q}=\frac{1}{\alpha }(\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}-\mathscr{A}_{2}). $$
(18)

Also, the iteration matrix \(\mathscr{T}\) is induced by the unique splitting \(\mathscr{A}=\mathscr{M}-\mathscr{N}\) with

$$ \mathscr{M}=\frac{1}{\alpha }(\mathscr{D}_{1}+ \mathscr{B}_{1}) ( \mathscr{D}_{2}+\mathscr{B}_{2}) \quad \mbox{nonsingular},\qquad \mathscr{N}=\frac{1}{\alpha }( \mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}- \mathscr{B}_{2}), $$
(19)

i.e., \(\mathscr{T}=\mathscr{M}^{-1}\mathscr{N}=I-\mathscr{M}^{-1} \mathscr{A}\). Furthermore, \(g=\mathscr{M}^{-1}b\). We often refer to \(\mathscr{P}\) or \(\mathscr{M}\) as the preconditioner.
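The relations \(\mathscr{A}=\mathscr{P}-\mathscr{Q}\) and \(\mathscr{L}=I-\mathscr{P}^{-1}\mathscr{A}\) rely on \(\mathscr{D}_{1}+\mathscr{D}_{2}=\alpha I\), which makes \(\mathscr{D}_{2}-\mathscr{A}_{1}=\alpha I-(\mathscr{D}_{1}+\mathscr{A}_{1})\) commute with \((\mathscr{D}_{1}+\mathscr{A}_{1})^{-1}\). The sketch below checks both identities numerically on hypothetical scalar blocks (\(A_{1}=2\), \(A_{2}=3\), \(B_{1}=B_{2}=1\), \(\alpha =2\)); the variable names are illustrative only.

```python
import numpy as np

alpha = 2.0
S1 = np.array([[2.0, 0, 1], [0, 0, 0], [-1, 0, 0]])  # first summand of (4)
S2 = np.array([[0.0, 0, 0], [0, 3, 1], [0, -1, 0]])  # second summand of (4)
D1 = np.diag([0.0, alpha, alpha / 2])                # D_1 of (8)
D2 = np.diag([alpha, 0.0, alpha / 2])                # D_2 of (8)
A = S1 + S2
I = np.eye(3)

# P and Q as in (18).
P = (D1 + S1) @ (D2 + S2) / alpha
Q = (D2 - S1) @ (D1 - S2) / alpha

# Iteration matrix as in (14), built with solves instead of explicit inverses.
L = np.linalg.solve(D2 + S2, (D2 - S1) @ np.linalg.solve(D1 + S1, D1 - S2))

gap_split = np.abs(P - Q - A).max()                      # checks A = P - Q
gap_iter = np.abs(L - (I - np.linalg.solve(P, A))).max() # checks L = I - P^{-1} A
```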

The convergence of the ADI methods

In this section, some convergence results on the ADI methods are established. First, we state the following lemmas, which will be used in this section.

Lemma 1

Let \(A=M-N\in C^{n\times n}\) with A and M nonsingular and let \(T=NM^{-1}\). Then \(A-TAT^{*}=(I-T)(AA^{-*}M^{*}+N)(I-T^{*})\).

The proof is similar to the proof of Lemma 5.30 in [1].
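For the reader's convenience, the identity can also be checked by direct expansion; the following is a sketch of one possible verification and is not claimed to be the argument used in [1]. Since \(N=M-A\), we have \(T=I-AM^{-1}\), hence \(I-T=AM^{-1}\) and \(I-T^{*}=M^{-*}A^{*}\). Therefore

$$ \begin{aligned} (I-T) \bigl(AA^{-*}M^{*}+N\bigr) \bigl(I-T^{*}\bigr)&=AM^{-1}\bigl(AA^{-*}M^{*}+M-A\bigr)M^{-*}A^{*} \\ &=AM^{-1}A+AM^{-*}A^{*}-AM^{-1}AM^{-*}A^{*}, \end{aligned} $$

while

$$ \begin{aligned} A-TAT^{*}&=A-\bigl(I-AM^{-1}\bigr)A\bigl(I-M^{-*}A^{*}\bigr) \\ &=AM^{-1}A+AM^{-*}A^{*}-AM^{-1}AM^{-*}A^{*}. \end{aligned} $$

The two right-hand sides coincide.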

Lemma 2

Let \(A\in R^{n\times n}\) be symmetric and positive definite. If \(A=M-N\) with M nonsingular is a splitting such that \(M+N\) has a nonnegative definite symmetric part, then \(\|T\|_{A}=\|A^{-1/2}TA^{1/2}\|_{2}\leq 1\), where \(T=NM^{-1}\).

Proof

It follows from Lemma 1 that

$$ \begin{aligned}[b] A-TAT^{T}&=(I-T) \bigl(AA^{-T}M^{T}+N\bigr) \bigl(I-T^{T}\bigr) \\ &=(I-T) \bigl(M^{T}+N\bigr) \bigl(I-T^{T}\bigr). \end{aligned} $$
(20)

Since

$$ \begin{aligned}[b] 2\bigl(M^{T}+N \bigr)&=2\bigl(M^{T}+M-A\bigr) \\ &=\bigl(M+N^{T}\bigr)+\bigl(M^{T}+N\bigr) \\ &=(M+N)+(M+N)^{T} \\ &\succeq 0, \end{aligned} $$
(21)

it follows from (20) that \(A-TAT^{T}\succeq 0\) and thus

$$ A\succeq TAT^{T}. $$
(22)

From (22), we have \(I\succeq (A^{-1/2}TA^{1/2})(A^{-1/2}TA ^{1/2})^{T}\succeq 0\). Therefore,

$$ \Vert T \Vert _{A}= \bigl\Vert A^{-1/2}TA^{1/2} \bigr\Vert _{2}=\sqrt{\rho \bigl[\bigl(A^{-1/2}TA^{1/2} \bigr) \bigl(A ^{-1/2}TA^{1/2}\bigr)^{T}\bigr]} \leq 1. $$

This completes the proof. □

Lemma 3

Let \(\mathscr{A}_{i}\), \(\mathscr{B}_{i}\) and \(\mathscr{D}_{i}\) be defined in (4), (7) and (8) for \(i=1,2\). If \({A}_{i}\) has positive definite symmetric part \(H_{i}\) and \(0<\alpha \leq 2\lambda _{\mathrm{min}}(H_{i})\), with \(\lambda _{\mathrm{min}}(H_{i})\) the smallest eigenvalue of \(H_{i}\), then

$$ \bigl\Vert (\mathscr{D}_{j}- \mathscr{A}_{i}) (\mathscr{D}_{i}+\mathscr{A}_{i})^{-1} \bigr\Vert _{2}\leq 1 \quad \textit{and}\quad \bigl\Vert ( \mathscr{D}_{j}-\mathscr{B}_{i}) (\mathscr{D}_{i}+ \mathscr{B}_{i})^{-1} \bigr\Vert _{2}\leq 1, $$
(23)

where \(j=2\) if \(i=1\) and \(j=1\) if \(i=2\).

Proof

We only prove the former inequality in (23) and the same method can yield the latter one. Let \(M_{i}=\mathscr{D}_{i}+\mathscr{A}_{i}\) and \(N_{i}=-\mathscr{D}_{j}+\mathscr{A}_{i}\). Then we have

$$ C_{i}:=M_{i}-N_{i}=\mathscr{D}_{i}+ \mathscr{D}_{j}=\alpha \operatorname{diag}(I_{n _{1}},I_{n_{2}},I_{m})= \alpha I\succ 0, $$

where I is the \((n_{1}+n_{2}+m)\times (n_{1}+n_{2}+m)\) identity matrix, and

$$ M_{i}+N_{i}=2\mathscr{A}_{i}+ \mathscr{D}_{i}-\mathscr{D}_{j}. $$

When \(i=1\) and \(j=2\)

$$ M_{1}+N_{1}=2\mathscr{A}_{1}+ \mathscr{D}_{1}-\mathscr{D}_{2}= \begin{bmatrix} 2A_{1}-\alpha I_{n_{1}} & 0 & 2B_{1}^{T} \\ 0 & \alpha I_{n_{2}} & 0 \\ -2B_{1} & 0 & 0 \end{bmatrix}. $$

Since \(0<\alpha \leq 2\lambda _{\mathrm{min}}(H_{i})\), we have \(2H_{i}-\alpha I_{n_{i}}=(A^{T}_{i}+A_{i})-\alpha I_{n_{i}}\succeq 0\). Thus

$$ \bigl[(M_{1}+N_{1})^{T}+(M_{1}+N_{1}) \bigr]/2= \begin{bmatrix} (A^{T}_{1}+A_{1})-\alpha I_{n_{1}} & 0 & 0 \\ 0 & \alpha I_{n_{2}} & 0 \\ 0 & 0 & 0 \end{bmatrix}\succeq 0, $$

which shows that \(M_{1}+N_{1}\) has a nonnegative definite symmetric part. Similarly, \(M_{2}+N_{2}\) also has a nonnegative definite symmetric part. Thus, \(M_{i}+N_{i}\) has a nonnegative definite symmetric part for \(i=1,2\). Let \(T_{i}=N_{i}M_{i}^{-1}\). Then it follows from Lemma 2 that

$$ \Vert T_{i} \Vert _{C_{i}}= \bigl\Vert C_{i}^{-1/2}T_{i}C_{i}^{1/2} \bigr\Vert _{2}= \Vert T_{i} \Vert _{2}\leq 1. $$

Consequently, \(\|T_{i}\|_{2}=\|N_{i}M_{i}^{-1}\|_{2}=\|(\mathscr{D} _{j}-\mathscr{A}_{i})(\mathscr{D}_{i}+\mathscr{A}_{i})^{-1}\|_{2} \leq 1\) for \(i=1,2\). This completes the proof. □
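The bounds in (23) are easy to probe numerically. The sketch below evaluates the four spectral norms for hypothetical scalar blocks (\(A_{1}=2\), \(A_{2}=3\), \(B_{1}=B_{2}=1\)) and \(\alpha =2\), which satisfies \(\alpha \leq 2\min \{2,3\}\); the helper name `half_step_norms` is an assumption introduced here.

```python
import numpy as np

def half_step_norms(S1, S2, D1, D2):
    """Spectral norms of (D2 - S1)(D1 + S1)^{-1} and (D1 - S2)(D2 + S2)^{-1}."""
    n12 = np.linalg.norm((D2 - S1) @ np.linalg.inv(D1 + S1), 2)
    n21 = np.linalg.norm((D1 - S2) @ np.linalg.inv(D2 + S2), 2)
    return n12, n21

alpha = 2.0
D1 = np.diag([0.0, alpha, alpha / 2])                # D_1 of (8)
D2 = np.diag([alpha, 0.0, alpha / 2])                # D_2 of (8)

# Dimensional splitting (4) and new splitting (7) for the scalar blocks.
S1 = np.array([[2.0, 0, 1], [0, 0, 0], [-1, 0, 0]])
S2 = np.array([[0.0, 0, 0], [0, 3, 1], [0, -1, 0]])
T1 = np.array([[2.0, 0, 0], [0, 0, 1], [0, -1, 0]])
T2 = np.array([[0.0, 0, 1], [0, 3, 0], [-1, 0, 0]])

norms_A = half_step_norms(S1, S2, D1, D2)
norms_B = half_step_norms(T1, T2, D1, D2)
```

In this small example the norms sit at the bound rather than strictly below it, so a floating-point comparison with 1 should allow a small tolerance. This is consistent with Lemma 3, which only gives a non-strict inequality; the strict inequality for the spectral radius is the subject of Theorem 2.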

Theorem 1

Consider problem (1) and assume that \(\mathscr{A}\) satisfies the assumptions above. Then \(\mathscr{A}\) is nonsingular. Further, if \(0<\alpha \leq 2\delta \) with \(\delta =\min \{ \lambda _{\mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H_{2})\}\), then \(\|\hat{\mathscr{L}}\|_{2}\leq 1\) and \(\|\hat{\mathscr{T}}\|_{2}\leq 1\).

Proof

The proof of the nonsingularity of \(\mathscr{A}\) can be found in [10]. Since \(0<\alpha \leq 2\delta =2\min \{\lambda _{ \mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H_{2})\}\), Lemma 3 shows that (23) holds for \(i=1\), \(j=2\) and for \(i=2\), \(j=1\). As a result,

$$\begin{aligned}& \begin{aligned} &\begin{aligned} \Vert \hat{\mathscr{L}} \Vert _{2}&= \bigl\Vert (\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1} \bigr\Vert _{2} \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigr\Vert _{2} \bigl\Vert (\mathscr{D}_{1}- \mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1} \bigr\Vert _{2} \\ &\leq 1, \end{aligned} \\ &\begin{aligned} \Vert \hat{\mathscr{T}} \Vert _{2}&= \bigl\Vert (\mathscr{D}_{2}-\mathscr{B}_{1}) ( \mathscr{D}_{1}+\mathscr{B}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{B}_{2}) ( \mathscr{D}_{2}+ \mathscr{B}_{2})^{-1} \bigr\Vert _{2} \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{B}_{1}) ( \mathscr{D}_{1}+ \mathscr{B}_{1})^{-1} \bigr\Vert _{2} \bigl\Vert (\mathscr{D}_{1}- \mathscr{B}_{2}) ( \mathscr{D}_{2}+\mathscr{B}_{2})^{-1} \bigr\Vert _{2} \\ &\leq 1. \end{aligned} \end{aligned} \end{aligned}$$
(24)

This completes the proof. □

Theorem 2

Consider problem (1) and assume that \(\mathscr{A}\) satisfies the assumptions above. If \(0<\alpha \leq 2\delta \) with \(\delta =\min \{\lambda _{\mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H _{2})\}\), then the iterations (10) and (11) are convergent; that is, \(\rho (\mathscr{L})<1\) and \(\rho (\mathscr{T})<1\).

Proof

Firstly, we prove \(\rho (\mathscr{L})<1\). Since \(\mathscr{L}(\alpha )\) is similar to \(\hat{\mathscr{L}}\), \(\rho (\mathscr{L})=\rho ( \hat{\mathscr{L}})\). Let λ be an eigenvalue of \(\hat{\mathscr{L}}(\alpha )\) satisfying \(|\lambda |=\rho ( \hat{\mathscr{L}})\) and let x be a corresponding eigenvector with \(\|x\|_{2}=1\) (so necessarily \(x\neq 0\)). Then \(\hat{\mathscr{L}}x=\lambda x\) and consequently,

$$ \begin{aligned}[b] \lambda &=x^{*}\hat{ \mathscr{L}}x=x^{*}(\mathscr{D}_{2}-\mathscr{A} _{1}) (\mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}- \mathscr{A}_{2}) (\mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}x \\ &=u^{*}v, \end{aligned} $$
(25)

where \(u=(\mathscr{D}_{1}+\mathscr{A}^{*}_{1})^{-1}(\mathscr{D}_{2}- \mathscr{A}^{*}_{1})x\) and \(v=(\mathscr{D}_{1}-\mathscr{A}_{2})( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x\). Using the Cauchy–Schwarz inequality,

$$ \vert \lambda \vert ^{2}\leq u^{*}u\cdot v^{*}v. $$
(26)

The equality in (26) holds if and only if \(u=kv\), where \(k\in \mathbb {C}\). Also, Lemma 3 yields

$$\begin{aligned}& \begin{aligned} &\begin{aligned} u^{*}u&=x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &\leq \max_{ \Vert x \Vert _{2}=1}x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*} _{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigr\Vert _{2}^{2} \\ &\leq 1, \end{aligned} \\ &\begin{aligned} v^{*}v&=x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D} _{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &\leq \max_{ \Vert x \Vert _{2}=1}x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*} _{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}- \mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &\leq \bigl\Vert (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1} \bigr\Vert _{2}^{2} \\ &\leq 1. \end{aligned} \end{aligned} \end{aligned}$$
(27)

As a result, if \(u\neq kv\), then it follows from (26) and (27) that

$$ \rho ^{2}\bigl[\mathscr{L}(\alpha )\bigr]=\rho ^{2}\bigl[\hat{\mathscr{{L}}}(\alpha )\bigr]= \vert \lambda \vert ^{2}< u^{*}u\cdot v^{*}v\leq 1; $$
(28)

if \(u=kv\) and \(u^{*}u\cdot v^{*}v<1\), then

$$ \rho ^{2}\bigl[\mathscr{L}(\alpha )\bigr]=\rho ^{2}\bigl[\hat{\mathscr{{L}}}(\alpha )\bigr]= \vert \lambda \vert ^{2}=u^{*}u\cdot v^{*}v< 1. $$
(29)

In what follows we will prove by contradiction that \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\) do not hold simultaneously.

Assume that \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\). Since \(u^{*}u\leq 1\) and \(v^{*}v\leq 1\), \(|k|=u^{*}u=v^{*}v=1\). Then it follows from (27) that

$$\begin{aligned}& \begin{aligned} &\begin{aligned} u^{*}u&=x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &=\rho \bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)\bigr] \\ &= 1, \end{aligned} \\ &\begin{aligned} v^{*}v&=x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D} _{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &=\rho \bigl[\bigl(\mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}- \mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}\bigr] \\ &= 1. \end{aligned} \end{aligned} \end{aligned}$$
(30)

Since \(\|x\|_{2}=1\) and both matrices below are Hermitian and nonnegative definite, (30) implies that x is an eigenvector of both \((\mathscr{D}_{2}-\mathscr{A}_{1})(\mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}+\mathscr{A}^{*}_{1})^{-1}(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1})\) and \((\mathscr{D}_{2}+\mathscr{A}^{*}_{2})^{-1}( \mathscr{D}_{1}-\mathscr{A}^{*}_{2})(\mathscr{D}_{1}-\mathscr{A}_{2})( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}\) corresponding to the eigenvalue 1, i.e.,

$$ \begin{aligned} &(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+\mathscr{A}_{1})^{-1} \bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1}\bigr)x=x, \\ &\bigl(\mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}- \mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}x=x. \end{aligned} $$
(31)

Since

$$ (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1}\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1}\bigr) = \begin{bmatrix} E & 0 & F^{T} \\ 0 & 0 & 0 \\ F & 0 & G \end{bmatrix}, $$
(32)

where E, F and G denote nonzero matrices, the former equation in (31) can be written as

$$ \begin{bmatrix} E & 0 & F^{T} \\ 0 & 0 & 0 \\ F & 0 & G \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}= \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}, $$

which indicates \(x_{2}=0\). Therefore, \(x=[x^{*}_{1},0,x_{3}^{*}]^{*}\). Let \(y=(\mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x\). Then \(x=(\mathscr{D} _{2}+\mathscr{A}_{2})y\). The latter equation in (31) becomes

$$ \bigl(\mathscr{D}_{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2})y=\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2}\bigr) ( \mathscr{D}_{2}+\mathscr{A}_{2})y, $$
(33)

and consequently

$$ \bigl[\mathscr{D}_{2}^{2}- \mathscr{D}_{1}^{2}+\alpha \bigl(\mathscr{A}^{*}_{2}+ \mathscr{A}_{2}\bigr)\bigr]y=0, $$
(34)

that is,

$$ \begin{bmatrix} \alpha ^{2}I_{n_{1}} & 0 & 0 \\ 0 & \alpha (A^{*}_{2}+{A}_{2}-\alpha I_{n_{2}}) & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \\ y_{3} \end{bmatrix}=0, $$
(35)

which indicates \(y_{1}=0\). Therefore, \(y=[0,y_{2}^{*},y_{3}^{*}]^{*}\). Also, \(x=(\mathscr{D}_{2}+\mathscr{A}_{2})y\). Then

$$ \begin{bmatrix} x_{1} \\ 0 \\ x_{3} \end{bmatrix}= \begin{bmatrix} \alpha I_{n_{1}} & 0 & 0 \\ 0 & {A}_{2} & B_{2}^{T} \\ 0 & -B_{2} & \frac{\alpha }{2} I_{m} \end{bmatrix} \begin{bmatrix} 0 \\ y_{2} \\ y_{3} \end{bmatrix}= \begin{bmatrix} 0 \\ {A}_{2}y_{2}+B_{2}^{T}y_{3} \\ -B_{2}y_{2}+\frac{\alpha }{2} y_{3} \end{bmatrix}, $$
(36)

which shows that

$$ x_{1}=y_{1}=0,\qquad {A}_{2}y_{2}+B_{2}^{T}y_{3}=0, \quad \mbox{and}\quad x_{3}=-B_{2}y_{2}+ \frac{\alpha }{2} y_{3}. $$
(37)

Since \(u=kv\),

$$ \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}- \mathscr{A}^{*}_{1} \bigr)x=k(\mathscr{D}_{1}-\mathscr{A}_{2}) (\mathscr{D} _{2}+\mathscr{A}_{2})^{-1}x, $$
(38)

which can be written as

$$ \bigl(\mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr) (\mathscr{D}_{2}+\mathscr{A}_{2})y=k\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1}\bigr) ( \mathscr{D}_{1}-\mathscr{A}_{2})y $$
(39)

for \(x=(\mathscr{D}_{2}+\mathscr{A}_{2})y\). Further, (39) becomes

$$ \bigl[\bigl(\mathscr{D}_{2}^{2}-k \mathscr{D}_{1}^{2}\bigr)+(\mathscr{D}_{2}+k \mathscr{D}_{1})\mathscr{A}_{2}-\mathscr{A}^{*}_{1}( \mathscr{D}_{2}+k \mathscr{D}_{1})-(1-k) \mathscr{A}^{*}_{1}\mathscr{A}_{2} \bigr]y=0, $$
(40)

i.e.,

$$ \begin{bmatrix} \alpha ^{2}I_{n_{1}}-\alpha A_{1}^{*} & (k-1)B_{1}^{T}B_{2} & \frac{(1+k) \alpha }{2}B_{1}^{T} \\ 0 & k\alpha (A_{2}-\alpha I_{n_{2}}) & k\alpha B_{2}^{T} \\ -\alpha B_{1} & -\frac{(1+k)\alpha }{2}B_{2} & \frac{(1-k)\alpha ^{2}}{4} I_{m} \end{bmatrix} \begin{bmatrix} 0 \\ y_{2} \\ y_{3} \end{bmatrix}= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, $$
(41)

i.e.,

$$ \textstyle\begin{cases} (k-1)B_{1}^{T}B_{2}y_{2}+\frac{(1+k)\alpha }{2}B_{1}^{T}y_{3}=0, \\ k\alpha (A_{2}-\alpha I_{n_{2}})y_{2}+k\alpha B_{2}^{T}y_{3}=0, \\ -\frac{(1+k)\alpha }{2}B_{2}y_{2}+\frac{(1-k)\alpha ^{2}}{4}y_{3}=0. \end{cases} $$
(42)

Here, we assert \(k\neq 1\). Otherwise, assume \(k=1\). Then (25) shows \(\lambda =u^{*}v=v^{*}v=1\), since \(u=kv\) and \(v^{*}v=1\). Note that λ is an eigenvalue of \(\hat{\mathscr{L}}\), which is similar to \(\mathscr{L}(\alpha )\). Thus \(\hat{\mathscr{L}}\) and \(\mathscr{L}=[( \mathscr{D}_{1}+\mathscr{A}_{1})(\mathscr{D}_{2}+\mathscr{A}_{2})]^{-1}[( \mathscr{D}_{2}-\mathscr{A}_{1})(\mathscr{D}_{1} -\mathscr{A}_{2})]\) share the eigenvalue 1. Let w be an eigenvector of \(\mathscr{L}\) corresponding to the eigenvalue 1 (so necessarily \(w\neq 0\)). One has

$$ \mathscr{L}w=\bigl[(\mathscr{D}_{1}+ \mathscr{A}_{1}) (\mathscr{D}_{2}+ \mathscr{A}_{2}) \bigr]^{-1}\bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) (\mathscr{D} _{1}-\mathscr{A}_{2})\bigr]w=w $$
(43)

and consequently

$$ \bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) (\mathscr{D}_{1} -\mathscr{A}_{2})-( \mathscr{D}_{1}+\mathscr{A}_{1}) (\mathscr{D}_{2}+ \mathscr{A}_{2})\bigr]w=- \alpha \mathscr{A}w=0. $$
(44)

Since \(\mathscr{A}\) is nonsingular, (44) yields \(w=0\), which contradicts that w is an eigenvector of \(\mathscr{L}(\alpha )\). Thus, \(k\neq 1\) and \(1-k\neq 0\). From the third equation in (42), one has

$$ y_{3}=\kappa B_{2}y_{2} $$
(45)

with \(\kappa :=\frac{2(1+k)}{\alpha (1-k)}\). Then it follows from the second equation in (37) that

$$ \mathscr{J}y_{2}=\bigl(A_{2}+\kappa B_{2}^{T}B_{2}\bigr)y_{2}=0, $$
(46)

where \(\mathscr{J}=A_{2}+\kappa B_{2}^{T}B_{2}\). Note \(|k|=1\) and \(k\neq 1\). Let \(k=\cos \theta +i\sin\theta \), where \(i=\sqrt{-1}\), \(\theta \in R\), \(\theta \neq 2t\pi \) and t is an integer. Then

$$ \kappa := \frac{2(1+k)}{\alpha (1-k)}= \frac{2[(1+\cos \theta )+i\sin\theta ]}{\alpha [(1-\cos \theta )-i\sin \theta ]}= \frac{2i}{\alpha }\cot \frac{\theta }{2} $$
(47)

is either pure imaginary or zero. As a result, \(\mathscr{J}^{*}+ \mathscr{J}=A_{2}^{T}+A_{2}\succ 0\), since \(A_{2}\) is positive definite. Thus, \(\mathscr{J}\) is positive definite and hence nonsingular. Equation (46) indicates \(y_{2}=0\) and thus (45) shows that \(y_{3}=0\). Then it follows from the third equation in (37) that \(x_{3}=0\). Therefore, \(x=[0,0,0]^{*}\), which contradicts that x is an eigenvector of \(\hat{\mathscr{L}}(\alpha )\) with \(\|x\|_{2}=1\). Hence \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\) do not hold simultaneously. Therefore, \(\rho [\mathscr{L}(\alpha )]=| \lambda |<1\) and consequently, the iteration (10) converges.

By the same method, we can obtain \(\rho (\mathscr{T})<1\). Therefore, iterations (10) and (11) are both convergent. This completes the proof. □
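As a numerical illustration of Theorem 2, the spectral radii of the two iteration matrices can be computed directly for a small case. The sketch below uses hypothetical scalar blocks (\(A_{1}=2\), \(A_{2}=3\), \(B_{1}=B_{2}=1\)), for which \(\delta =2\), and a few values of α in \((0,2\delta ]\); the function names are illustrative assumptions.

```python
import numpy as np

def adi_spectral_radius(S1, S2, D1, D2):
    """Spectral radius of (D2+S2)^{-1}(D2-S1)(D1+S1)^{-1}(D1-S2), cf. (14), (15)."""
    M = np.linalg.solve(D2 + S2, (D2 - S1) @ np.linalg.solve(D1 + S1, D1 - S2))
    return np.abs(np.linalg.eigvals(M)).max()

# Dimensional splitting (4) and new splitting (7) for the scalar blocks.
S1 = np.array([[2.0, 0, 1], [0, 0, 0], [-1, 0, 0]])
S2 = np.array([[0.0, 0, 0], [0, 3, 1], [0, -1, 0]])
T1 = np.array([[2.0, 0, 0], [0, 0, 1], [0, -1, 0]])
T2 = np.array([[0.0, 0, 1], [0, 3, 0], [-1, 0, 0]])

radii = {}
for alpha in (0.5, 1.0, 2.0, 4.0):
    D1 = np.diag([0.0, alpha, alpha / 2])   # D_1 of (8)
    D2 = np.diag([alpha, 0.0, alpha / 2])   # D_2 of (8)
    radii[alpha] = (adi_spectral_radius(S1, S2, D1, D2),
                    adi_spectral_radius(T1, T2, D1, D2))
```

Each entry of `radii` holds the pair \((\rho (\mathscr{L}),\rho (\mathscr{T}))\) for one α, so the dependence of the convergence rate on α can be inspected before choosing a parameter.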

A numerical example

A numerical example is given in this section to show that the proposed alternate direction iterative methods are very effective.

Example 1

Consider problem (1) and assume that \(\mathscr{A}\) is given by (2), where \(A_{1}=A_{2}=\operatorname{tri}(1,1,-1)\in R^{n \times n}\), \(B_{1}=B_{2}=I_{n}\in R^{n\times n}\) is the \(n\times n\) identity matrix, and \(b=(1,1,\ldots ,1)^{T}\in R^{3n}\).
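The blocks of Example 1 can be generated as follows. This is a sketch under one stated assumption: \(\operatorname{tri}(1,1,-1)\) is read here as the tridiagonal matrix with subdiagonal 1, diagonal 1 and superdiagonal −1. The helper name `tri` and the small size `n = 6` are illustrative choices.

```python
import numpy as np

def tri(sub, diag, sup, n):
    """n-by-n tridiagonal matrix; the (sub, diag, sup) ordering is an assumption."""
    return sub * np.eye(n, k=-1) + diag * np.eye(n) + sup * np.eye(n, k=1)

n = 6
A1 = A2 = tri(1.0, 1.0, -1.0, n)
B1 = B2 = np.eye(n)

# Symmetric part H = (A + A^T)/2 and its smallest eigenvalue.
H = (A1 + A1.T) / 2
delta = np.linalg.eigvalsh(H).min()
```

Because the off-diagonal entries cancel in the symmetric part, \(H_{1}=H_{2}=I_{n}\) under either ordering of the off-diagonals, so \(\delta =1\) and the condition of Theorem 2 reads \(0<\alpha \leq 2\).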

We conduct numerical experiments to compare the performance of the three alternate direction iterative schemes (5), (10) and (11) for problem (1). Scheme (5), denoted Algorithm 1 (A1), was proposed by Benzi et al. in [12, 13], while schemes (10) and (11), denoted Algorithm 2 (A2) and Algorithm 3 (A3), are proposed in this paper. These three algorithms were coded in Matlab, and all computations were performed on an HP dx7408 PC (Intel Core E4500 CPU, 2.2 GHz, 1 GB RAM) with Matlab 7.9 (R2009b).

The stopping criterion is defined as

$$ \mathrm{RE}=\frac{ \Vert x^{k+1}-x^{k} \Vert _{2}}{\max \{1, \Vert x^{k} \Vert _{2}\}}< 10^{-6}. $$
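The relative-change quantity in the stopping criterion above is straightforward to evaluate. A minimal sketch follows; the function name `relative_change` is a hypothetical label introduced here.

```python
import numpy as np

def relative_change(x_new, x_old):
    """||x_new - x_old||_2 / max(1, ||x_old||_2)."""
    x_new = np.asarray(x_new, dtype=float)
    x_old = np.asarray(x_old, dtype=float)
    return np.linalg.norm(x_new - x_old) / max(1.0, np.linalg.norm(x_old))

# Iterates of small norm are measured absolutely, larger ones relatively.
small = relative_change([3.0, 4.0], [0.0, 0.0])   # denominator is 1
large = relative_change([6.0, 8.5], [6.0, 8.0])   # denominator is ||x_old|| = 10
```

The `max` in the denominator prevents division by a tiny number when the iterates are close to zero.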

Numerical results are presented in Table 1. In particular, we report in Fig. 1 the change of RE of A1, A2 and A3 when \(n=1000\) with the iteration number increasing.

Figure 1 Change of RE of A1, A2 and A3 with increasing iteration number when \(n=1000\)

Table 1 Performance of A1, A2, and A3 with different n

From Table 1, we can make the following observations: (i) A2 (Algorithm 2) generally requires far fewer iterations than A1 and A3 (Algorithm 1 and Algorithm 3) for \(n=500\), \(n=1000\) and \(n=1500\); (ii) A3 requires much less computing time than A2 and A1. Thus, both A2 and A3 are generally superior to A1 in terms of iteration number and computing time. Therefore, the proposed methods are more effective and efficient than the existing method.

Figure 1 shows that, for \(n=1000\), the RE generated by A2 quickly converges to 0 as the iteration number increases. Therefore, A2 is superior to A1 and A3 in terms of iteration number.

Conclusions

In this paper we propose two alternate direction iterative methods for generalized saddle-point systems based on two splitting forms of generalized saddle-point matrix, and then establish some convergence theorems for these two iterative methods. Finally, we present a numerical example to demonstrate that the proposed alternate direction iterative methods are superior to the existing one.

References

  1. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. Academic Press, New York (1979)

  2. Bai, Z.-Z., Golub, G.H., Ng, M.K.: On successive-overrelaxation acceleration of the Hermitian and skew-Hermitian splitting iterations. Numer. Linear Algebra Appl. 17, 319–335 (2007)

  3. Bai, Z.-Z., Golub, G.H.: Accelerated Hermitian and skew-Hermitian splitting iteration methods for saddle-point problems. IMA J. Numer. Anal. 27, 1–23 (2007)

  4. Bai, Z.-Z., Golub, G.H., Lu, L.-Z., Yin, J.-F.: Block triangular and skew-Hermitian splitting methods for positive-definite linear systems. SIAM J. Sci. Comput. 26, 844–863 (2005)

  5. Bai, Z.-Z., Golub, G.H., Pan, J.-Y.: Preconditioned Hermitian and skew-Hermitian splitting methods for non-Hermitian positive semidefinite linear systems. Numer. Math. 98, 1–32 (2004)

  6. Bai, Z.-Z., Golub, G.H., Ng, M.K.: On inexact Hermitian and skew-Hermitian splitting methods for non-Hermitian positive definite linear systems. Linear Algebra Appl. 428, 413–440 (2008)

  7. Li, L., Huang, T.-Z., Liu, X.-P.: Modified Hermitian and skew-Hermitian splitting methods for non-Hermitian positive-definite linear systems. Numer. Linear Algebra Appl. 14, 217–235 (2007)

  8. Benzi, M., Szyld, D.B.: Existence and uniqueness of splittings for stationary iterative methods with applications to alternating methods. Numer. Math. 76, 309–321 (1997)

  9. Benzi, M., Gander, M., Golub, G.H.: Optimization of the Hermitian and skew-Hermitian splitting iteration for saddle-point problems. BIT Numer. Math. 43, 881–900 (2003)

  10. Benzi, M., Golub, G.H.: A preconditioner for generalized saddle point problems. SIAM J. Matrix Anal. Appl. 26, 20–41 (2004)

  11. Benzi, M., Golub, G.H., Liesen, J.: Numerical solution of saddle point problems. Acta Numer. 14, 1–137 (2005)

  12. Benzi, M., Ng, M., Niu, Q., Wang, Z.: A relaxed dimensional factorization preconditioner for the incompressible Navier–Stokes equations. J. Comput. Phys. 230, 6185–6202 (2011)

  13. Benzi, M., Guo, X.-P.: A dimensional split preconditioner for Stokes and linearized Navier–Stokes equations. Appl. Numer. Math. 61, 66–76 (2011)

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments and suggestions, which actually stimulated this work.

Availability of data and materials

Not applicable.

Funding

The work was supported by the National Natural Science Foundations of China (11601409, 11201362), the Natural Science Foundation of Shaanxi Province of China (2016JM1009), the Natural Science Foundation of Department of Shaanxi Province of China (2017JK0344), the Key Projects of Social Science Planning of Gansu Province (ZD007) and 2018 Strategic Research Projects of the Scientific Research Projects of Institutions of Higher Learning of Gansu Province (2018f-20).

Author information

All authors contributed equally to this work. All authors read and approved the final manuscript.

Correspondence to Cheng-yi Zhang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Luo, S., Cui, A. & Zhang, C. The alternate direction iterative methods for generalized saddle point systems. J Inequal Appl 2019, 285 (2019) doi:10.1186/s13660-019-2235-z

MSC

  • 65F10
  • 15A15
  • 15F10

Keywords

  • Alternate direction iterative method
  • Generalized saddle point system
  • Convergence