Additive Schwarz Methods

77. Additive Schwarz Methods#

In this lecture

we define additive and multiplicative sub-space correction methods
prove the Additive Schwarz Lemma as tool for the analysis of additive Schwarz methods

The method is named after Hermann Schwarz, who used this technique for proving the existence of solutions of PDEs.

Literature: Xu, Jinchao: Iterative methods by space decomposition and subspace correction. SIAM Rev. 34 (1992), no. 4, 581–613 https://epubs.siam.org/doi/10.1137/1034116

77.1. Block-Jacobi and general additive Schwarz preconditioners#

Let \(e_i\) be the \(i^{th}\) unit vector. The diagonal entry of the matrix \(A\) is \(a_{ii} = e_i^T A e_i\). We can represent the Jacobi-preconditioner as

\[ C_{Jac}^{-1} = (\operatorname{diag} A)^{-1} = \sum_{i=1}^N e_i \frac{1}{e_i^T A e_i} e_i^T \]

A block-Jacobi preconditioner is

\[\begin{split} C_{bJac} = \left( \begin{array}{cccc} A_{11} & 0 & \ldots & 0 \\ 0 & A_{22} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & A_{mm} \end{array} \right), \end{split}\]

where we have \(m\) diagonal blocks, of sizes \(N_i\).

For \(i = 1 \ldots m\) we define embedding matrices \(E_i\) consisting of the unit-vectors corresponding to the \(i^{th}\) block:

\[\begin{split} E_i = \left( \begin{array}{c} 0 \\ I_{N_i} \\ 0 \end{array} \right) = \left( e_{i,1}, \ldots, e_{i,N_i} \right) \in {\mathbb R}^{N \times N_i} \end{split}\]

We can write the diagonal blocks as

\[ A_{ii} = E_i^T A E_i \in {\mathbb R}^{N_i \times N_i} \]

and the whole Block-Jacobi preconditioner as

\[ C_{bJac} = \sum_{i=1}^m E_i A_{ii} E_i^T, \]

and the preconditioning action

\[ C_{bJac}^{-1} = \sum_{i=1}^m E_i A_{ii}^{-1} E_i^T \]

Next, we let the \(E_i\) be more general full rank, typically tall rectangular matrices. Let \(A_{i} = E_i^T A E_i \in {\mathbb R}^{N_i}\). Then we call the preconditioner defined by the action

\[ C_{ASM}^{-1} = \sum_{i=1}^m E_i A_{i}^{-1} E_i^T \]

an additive Schwarz preconditioner \(C_{ASM}\). In general we don’t have the simple representation as for the block-Jacobi preconditioner for \(C_{ASM}\) itself.

The action

\[ C_{ASM}^{-1} : {\mathbb R}^N \rightarrow {\mathbb R}^N : r \mapsto w \]

can be split into the following steps:

\(\qquad\) \(r_i = E_i^T r \qquad\) extract residual on sub-space
\(\qquad\) \(w_i = A_{i}^{-1}r_i \qquad\) solve problem on sub-space
\(\qquad\) \(w = \sum_{i=1}^m E_i w_i \qquad\) embed and combine sub-space solution

77.2. Sub-spaces of \({\mathbb R}^N\) and of Hilbert-spaces#

The definition and implementation of the ASM - preconditioner happens in linear algebra, i.e. in \({\mathbb R}^N\). But for the analysis of the preconditioner we reformulate it in finite element space \(V_h\) and sub-spaces.

We define the so called Galerkin isomorphismus

\[\begin{eqnarray*} G : {\mathbb R}^N & \rightarrow & V_h \\ \underline{u} & \mapsto & u_h = \sum_{i=1}^N u_i \varphi_i. \end{eqnarray*}\]

Every finite element function \(u_h\) can be identified with its coefficient vector \(\underline u\) with respect to the basis functions \(\{\varphi_i\}\).

For \(\underline u, \underline v \in {\mathbb R}^N\), there holds

\[ \underline v^T A \underline u = A(G \underline u, G \underline v), \]

where \(A\) is the Galerkin discretization matrix with entries \(A_{ji} = A(\varphi_i, \varphi_j)\).

The adjoint of the Galerkin isomorphism is a mapping from the dual of \(V_h\), i.e. from the linear forms on \(V_h\), into \({\mathbb R}^N\):

\[\begin{eqnarray*} G^\ast : V_h^\ast & \rightarrow & {\mathbb R}^N \\ r(.) & \mapsto & \big(r(\varphi_i) \big)_{i=1\ldots N} \end{eqnarray*}\]

The embedding matrices \(E_i \subset {\mathbb R}^{N \times N_i}\) define sub-spaces

\[ V_i = G \operatorname{range} \{ E_i \} = \{ G E \underline v : \underline v \in {\mathbb R}^{N_i} \} \subset V_h \]

The ASM preconditioner can be reformulated by means of variational problems on sub-spaces:

Lemma: Let \(r(.) \in V_h^\ast\). Then

\[ w = G C_{ASM}^{-1} G^* r \; \; \in V_h \]

is given as

\[ w = \sum_{i=1}^m w_i \quad \text{with} \quad w_i \in V_i : A(w_i,v_i) = r(v_i) \; \forall \, v_i \in V_i \]

Proof: Express \(w_i = G E_i \underline w_i\) and \(v_i = G E_i \underline v_i\) with \(\underline w_i, \underline v_i \in {\mathbb R}^{N_i}\). Then the defining equation of \(w_i\) reads as

\[ A(G E_i \underline w_i, G E_i \underline v_i) = r(G E_i \underline v_i) \qquad \forall \, \underline v_i \in {\mathbb R}^{N_i} \]

and thus as

\[ \underline v_i^T E_i^T A E_i \underline w_i = v_i^T E_i^T G^* r, \]

i.e. \(w_i = A_i^{-1} E_i^T (G^* r)\).

Finally, \(w = \sum w_i = \sum G E_i \underline w_i = G \sum E_i \underline w_i = G \sum E_i A_i^{-1} E_i^T (G^* r)\).

Lemma Let \(\underline{w}_i = E_i A_{i}^{-1} E_i^T A \underline{v}\), and \(w_i = G E_i \underline{w}_i\) and \(v = G \underline v\). Then

\[ w_i = P_i v \]

with \(P_i\) the \(A(.,.)\) orthogonal projector onto \(V_i\):

\[ P_i : V_h \rightarrow V_i : A(P_i v, z) = A(v,z) \quad \forall \, v \in V_h, z \in V_i \]

Proof: Follows from the proof of the Lemma above using \(r(\cdot) = A(v,\cdot)\).

Lemma: There holds

\[ G \, C_{ASM}^{-1} A = \sum_{i=1}^M P_i \, G \]

This implies that the condition number \(\kappa_A \{C_{ASM}^{-1} A\}\) equals \(\kappa_A \{ \sum P_i \}\). The ideal case is that all sub-spaces are \(A\)-orthogonal, then \(\sum P_i = Id\).

Lemma: For the error propagation matrix there holds

\[ G \; (I - \tau C_{ASM}^{-1} A) = (I - \tau \sum_{i=1}^M P_i) \; G \]

A block-Gauss-Seidel iteration leads to the so called multiplicative Schwarz preconditioner, with the error propagation operator

\[ M_{MSM} = (I-P_m) \, \ldots \, (I-P_2) (I-P_1) \]

As a product of \(A\)-orthogonal projectors it has \(||M_{MSM}||_A \leq 1\). Its \(\left<.,.\right>_A\)-adjoint is the product in reverse ordering.

77.3. The Additive Schwarz Lemma#

We have defined the action of the ASM preconditioner. To analyze the condition number via bounds of the Rayleigh quotient

\[ \gamma_1 \leq \frac{ u^T A u } { u^T C_{ASM} u } \leq \gamma_2 \]

it is of useful to have an representation of the quadratic form generated by \(C_{ASM}\) itself. This is given by the Additive Schwarz Lemma. Since it was reformulated and reproven by many authors it is also called Lemma of many parents. In some variants it is also know as Lions’s Lemma, or fictitious space lemma by Nepomnyashchikh.

We assume that \(E_i\) are embedding matrices as above such that

\[ \sum_{i=1}^m \operatorname{range} \{ E_i \} = {\mathbb R}^N \]

This means that every \(u \in {\mathbb R}^N\) can be decomposed into vectors in the range of \(E_i\), not necessarily unique.

Theorem: ASM-lemma, Linear Algebra version:

For all \(u \in {\mathbb R}^N\) there holds:

\[ u^T C_{ASM} u = \inf_{u_i \in {\mathbb R}^{N_i} \atop u = \sum E_i u_i} \sum_{i=1}^m u_i^T A_{i} u_i \]

Theorem ASM-lemma, sub-space version:

For all \(u \in V_h\) there holds:

\[ u^T C_{ASM} u = \inf_{u_i \in V_i \atop G u = \sum u_i} \sum_{i=1}^m \| u_i \|_A^2 \]

Proof: We prove the version in linear algebra notation, the other follows using the Galerkin isomorphism as above.

Fix \(u \in {\mathbb R}^N\). The form \(u^T C_{ASM} u\) is given as the minimum of the constraint minimization problem

\[ \inf_{u_i \in {\mathbb R}^{N_i} \atop u = \sum E_i u_i} \sum_{i=1}^m u_i^T A_{i} u_i. \]

The feasible set is non-empty and convex, and the goal function is convex. Thus, the unique minimizer can be found by the method of Lagrange multipliers. The Lagrange function is

\[ L( (u_i), \lambda) = \sum u_i^T A_i u_i + \lambda^T (u - \sum E_i u_i) \]

There exists a unique Lagrange multiplier \(\lambda\) such that \(L\) has a stationary point \(((u_i), \lambda)\).

Differentiating \(L\) with respect an individual \(u_i\) we get

\[ 2 A_i u_i - E_i^T \lambda = 0 \]

and thus

\[ u_i = \tfrac{1}{2} A_i^{-1} E_i^T \lambda. \]

Differentiating with respect to \(\lambda\) we recover the constraint

\[ u = \sum_i E_i u_i. \]

Inserting leads to

\[ u = \tfrac{1}{2} \sum_i E_i A_i^{-1} E_i^T \lambda \]

We find the definition of the preconditiong action

\[ u = \tfrac{1}{2} C_{ASM}^{-1} \lambda \]

Since for every \(u\) there exists a unique \(\lambda\), \(C_{ASM}^{-1}\) is indeed the inverse of a regular matrix \(C_{ASM}\), and we can write

\[ \lambda = 2 C_{ASM} u \]

The value of the minimum is

\[\begin{eqnarray*} \sum u_i^T A_i u_i & = & \sum (\tfrac{1}{2} A_i^{-1} E_i^T \lambda) A_i (\tfrac{1}{2} A_i^{-1} E_i^T \lambda) \\ & = & \tfrac{1}{4} \lambda^T \big( \sum E_i A_i^{-1} E_i^T \big) \lambda \\ & = & \tfrac{1}{4} \lambda^T C_{ASM}^{-1} \lambda \\ & = & u^T C_{ASM} C_{ASM}^{-1} C_{ASM} u \\ & = & u^T C_{ASM} u \end{eqnarray*}\]

\(\Box\)

77.4. The upper-bound by the overlap#

Typically, proving the estimate

\[ \| u \|_{C_{ASM}}^2 = \inf_{u_i \in V_i \atop u = \sum u_i} \sum \| u_i \|_A^2 \leq \frac{1}{\gamma_1} \| u \|_A^2 \]

is the hard part of the analysis, which has to be shown by explicitly constructing a decomposition. The other estimate often follows immediately from finite overlap of sub-spaces:

We define the overlap matrix \({\mathcal O} \in {\mathbb R}^{m \times m}\) with entries

\[ {\mathcal O}_{i,j} := \sup_{u_i \in V_i, v_j \in V_j} \frac{A(u_i,v_j)}{\| u_i \|_A \, \| v_j \|_A} \]

By the Cauchy-Schwarz inequality there is \(0 \leq {\mathcal O}_{i,j} \leq 1\). The value \({\mathcal O}_{i,j}\) leads to the so called strengthened Cauchy-Schwarz inequality

\[ | A(u_i, v_j) | \leq {\mathcal O}_{i,j} \, \| u_i \|_A \, \| v_j \|_A \]

If two sub-spaces are \(A\)-orthogonal, then \({\mathcal O}_{i,j} = 0\).

Lemma There holds

\[ \| u \|_A^2 \leq \rho({\mathcal O}) \, \| u \|_{C_{ASM}}^2, \]

where \(\rho({\mathcal O})\) is the spectral radius of \({\mathcal O}\).

Proof: Let \(u \in V_h\) be given, and let \(u = \sum u_i\) be an arbitrary sub-space decomposition. Then

\[\begin{eqnarray*} \| u \|_A^2 & = & A(\sum_i u_i, \sum_j u_i) \\ & = & \sum_{i,j} A(u_i, u_j) \\ & \leq & \sum_{i,j} {\mathcal O}_{i,j} \, \| u_i \|_A \, \| u_j \|_A \end{eqnarray*}\]

Next, we define \(z \in {\mathbb R}^N\) as \(z_i = \| u_i \|_A\). We use

\[ z^T {\mathcal O} z \leq \rho( {\mathcal O} ) \| z \|_2^2 \]

to bound

\[ \| u \|_A^2 \leq \rho( {\mathcal O} ) \, \sum_i \| u_i \|_A^2. \]

Since the estimate holds for any decomposition, it holds also for the one leading to the minimal right hand side, what is exactly \(\| u \|_{C_{ASM}}^2\).