94. Introduction to Non-overlapping Domain Decomposition#
We want to use a big parallel computer to solve discretized PDEs. The idea is to decompose the domain into sub-domain, and every processor has information only about its sub-domain, and the interfaces to the neighbor domains.
It is complicated to split a complex geometry into sub-domains. In particular if the mesh is locally refined, a well balanced decomposition of the work-load is hard to reach.
Graph-based mesh partitioning is a practical approach to domain decomposition. Such algorithms aim in partition the mesh into sub-domains of approximately equal number of elements, such that the interfaces between sub-domains are minimal.
A big parallel cluster is built such that every processor has access to its local memory. Data between processors is exchanged by a communication network. Modern processors have typically several cores which can work independently. In the following we usually mean core when talking about processors. Since a processor core has a similar amount of physical transistors as a fixed amount of memory, the last decades have shown approximately a constant amount of memory per processor core, in the order of 2-8 GB per core.
Thus we may think of a fixed number of unknowns per processor (=sub-domain), in the range of \(10^4\) to \(10^6\) variables per core. At least for the lower range, direct solvers for sub-domain problems are competitive.
For larger and larger clusters, the communication network becomes more and more the bottle-neck. For big computers, the costs for an efficient network is in the same order of magnitude as the costs for the individual nodes.
The goal of such domain decomposition methods is to reduce the communication effort between the sub-problems. The costs for sending information consists of latency time (i.e. how long it takes to get the communication started), and band-width (i.e. how much data per second one can send when the communication is established). When performing a multigrid preconditioner in parallel, one has to communicate on every level. Since the number of unknowns decrease geometrically for coarser levels, the amount of data exchange also decreases geometrically. Here, the latency becomes more and more relevant.
In domain decomposition methods, the data exchanges is blocked to one (or a few) communication steps per preconditioning step. This may be the advantage when compared to optimal order multi-grid methods. Multi-grid techniques may be used for the local operations.
94.1. Domain Decomposition with Lagrange parameters#
The domain \(\Omega\) is split into non-overlapping sub-domains \(\Omega_i\). Let \(\gamma_{ij} := \overline{\Omega_i} \cap \overline{\Omega_j}\) be the interface between sub-domains \(\Omega_i\) and \(\Omega_j\). We are posing local problems, and enforcing continuity of the field by a Lagrange parameter: find \(u_i \in H^1(\Omega_i)\) and \(\lambda_{ij} \in H^{-1/2} (\gamma_{ij})\) such that
When we know the Lagrange parameter, we can find the sub-domain solutions via the equation
The Lagrange parameter comes into the formulation exactly like Neumann boundary data. Thus, its physical meaning is exactly the Neumann data (normal derivative \(\tfrac{\partial u}{\partial n}\)) at the sub-domain boundaries. For the beginning, we have added the \(L_2\)-term \(\int_{\Omega_i} u_i v_i\) to avoid singular local problems. This local kernels must be taken into account when the \(L_2\)-term is missing, and even if it is small.
We have chosen the space for the Lagrange parameter \(H^{-1/2}\). To analyze the formulation and design preconditioners, we have to study such spaces.