101. Distributed Meshes and Spaces#

Setting up the client and the world-communicator:

from ipyparallel import Cluster
c = await Cluster(engines="mpi", usethreads=True).start_and_connect(n=4, activate=True)
c.ids
Starting 4 engines with <class 'ipyparallel.cluster.launcher.MPIEngineSetLauncher'>
[0, 1, 2, 3]
%%px
from mpi4py import MPI
comm = MPI.COMM_WORLD

The master generates a mesh, which is then distributed within the team of processes. The master process calls the graph partitioning library metis, which assigns a process id to each element. The elements are then sent to the processes of the corresponding ranks.

The master process does not keep any elements itself; it is kept free for special administrative work.

The distribution is performed on the Netgen mesh. Parallel uniform refinement of the Netgen mesh is also possible.

%%px
from ngsolve import *
import netgen.meshing   # needed for netgen.meshing.Mesh.Receive on the worker ranks

if comm.rank == 0:
    ngmesh = unit_square.GenerateMesh(maxh=0.1)
    print ("global num els =", len(ngmesh.Elements2D()))
    ngmesh.Distribute(comm)
else:
    ngmesh = netgen.meshing.Mesh.Receive(comm)

for l in range(2):
    ngmesh.Refine()
    
mesh = Mesh(ngmesh)
print ("process", comm.rank, "got elements:", mesh.GetNE(VOL))
[stdout:0] global num els = 232
process 0 got elements: 0
[stdout:3] process 3 got elements: 1184
[stdout:2] process 2 got elements: 1280
[stdout:1] process 1 got elements: 1248

The collective communication reduce combines data from each process into one global value. The default reduction operation is summation. Only the root process receives the result. Alternatively, use ‘allreduce’ to broadcast the result to all team members:

%%px
sumup = comm.reduce(mesh.GetNE(VOL))
print ("summing up num els: ", sumup)
[stdout:2] summing up num els:  None
[stdout:1] summing up num els:  None
[stdout:0] summing up num els:  3712
[stdout:3] summing up num els:  None
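
The same pattern works with ‘allreduce’ and with other reduction operations. The cell below is a minimal sketch, assuming the cluster, comm, and mesh from the cells above are still available; the variable names are chosen here only for illustration:

%%px
# allreduce: every rank receives the global sum, not just the root
sumup_all = comm.allreduce(mesh.GetNE(VOL))
# reduce with a non-default operation: rank 0 receives the largest local element count
maxels = comm.reduce(mesh.GetNE(VOL), op=MPI.MAX)
print ("rank", comm.rank, ": allreduce sum =", sumup_all, ", max local els =", maxels)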

We can retrieve the mesh variable from each engine of the cluster. The master process returns the global mesh; each worker only its local part. The list ‘meshes’ then contains one mesh per rank.

from ngsolve import *
from ngsolve.webgui import Draw
meshes = c[:]['mesh']
for i,m in enumerate(meshes):
    print ("mesh of rank", i)
    Draw (m)
mesh of rank 0
mesh of rank 1
mesh of rank 2
mesh of rank 3

101.1. Distributed finite element spaces#

We can define finite element spaces on the distributed mesh. Every process only defines dofs on its subset of elements:

%%px
fes = H1(mesh, order=2)
print ("ndof local =", fes.ndof, ", ndof global =", fes.ndofglobal)
[stdout:2] ndof local = 2657 , ndof global = 7585
[stdout:0] ndof local = 0 , ndof global = 7585
[stdout:3] ndof local = 2481 , ndof global = 7585
[stdout:1] ndof local = 2593 , ndof global = 7585
%%px
sumlocdofs = comm.reduce (fes.ndof)
if comm.rank == 0:
    print ("sum of local dofs:", sumlocdofs)
[stdout:0] sum of local dofs: 7731

The sum of local dofs is larger than the global number of dofs, since dofs at interface nodes are counted multiple times.
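
As a quick cross-check, the overcount can be computed directly. This is a small sketch reusing only the variables defined in the cells above:

%%px
# every interface dof contributes to fes.ndof on each process that touches it;
# subtracting the global count leaves the number of duplicate copies
duplicates = comm.allreduce(fes.ndof) - fes.ndofglobal
if comm.rank == 0:
    print ("duplicate interface-dof copies:", duplicates)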

We can define distributed grid-functions. Global operations like the Integrate function perform local integration and sum up the results:

%%px
gfu = GridFunction(fes)
gfu.Set(x*y)
print ("integrate:", Integrate(gfu, mesh))
[stdout:1] integrate: 0.24999999999999895
[stdout:2] integrate: 0.24999999999999895
[stdout:3] integrate: 0.24999999999999895
[stdout:0] integrate: 0.24999999999999895

We can retrieve the gridfunction to the local Python scope:

gfus = c[:]['gfu']
print ("integrate locally:", Integrate(gfus[0], gfus[0].space.mesh))

Draw (gfus[0], min=0,max=1)
Draw (gfus[1], min=0,max=1);
integrate locally: 0.24999999999999872

We can use a piece-wise constant space to visualize the mesh partitioning:

%%px
gfl2 = GridFunction(L2(mesh, order=0))
gfl2.vec[:] = comm.rank
Draw (c[:]['gfl2'][0]);

101.2. The ParallelDofs class#

The backbone of the dof connectivity is the ParallelDofs class. It is provided by a distributed finite element space, based on the connectivity of the mesh. The pardofs object knows with which other processes each dof is shared. We can ask for all dofs shared with a particular process and obtain a list of local dof numbers. The ordering of this list is consistent for both partners.

%%px
pardofs = fes.ParallelDofs()
for otherp in range(comm.size):
    print ("with process", otherp, "I share dofs", \
           list(pardofs.Proc2Dof(otherp)))
[stdout:0] with process 0 I share dofs []
with process 1 I share dofs []
with process 2 I share dofs []
with process 3 I share dofs []
[stdout:1] with process 0 I share dofs []
with process 1 I share dofs []
with process 2 I share dofs [12, 24, 34, 39, 45, 48, 50, 51, 87, 119, 144, 157, 172, 177, 180, 229, 293, 295, 349, 351, 376, 378, 408, 411, 423, 432, 434, 435, 438, 721, 785, 787, 841, 843, 868, 870, 900, 903, 915, 924, 926, 927, 930, 1115, 1116, 1303, 1304, 1449, 1450, 1523, 1524, 1611, 1612, 1637, 1638, 1653, 1654]
with process 3 I share dofs [1, 13, 25, 37, 47, 48, 55, 89, 121, 153, 175, 184, 230, 232, 296, 298, 364, 368, 418, 420, 422, 676, 722, 724, 788, 790, 856, 860, 910, 912, 914, 943, 944, 1125, 1126, 1313, 1314, 1501, 1502, 1627, 1628]
[stdout:3] with process 0 I share dofs []
with process 1 I share dofs [14, 29, 38, 45, 49, 50, 52, 53, 54, 55, 56, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 702, 781, 782, 831, 832, 869, 870, 888, 889, 892, 899, 900, 903, 904, 907, 908, 911, 912, 915, 916]
with process 2 I share dofs [15, 30, 40, 46, 48, 50, 51, 57, 58, 59, 60, 61, 62, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 704, 785, 786, 842, 843, 873, 874, 883, 884, 893, 896, 897, 919, 920, 923, 924, 927, 928, 931, 932, 935, 936, 939, 940]
with process 3 I share dofs []
[stdout:2] with process 0 I share dofs []
with process 1 I share dofs [1, 12, 23, 33, 43, 47, 49, 50, 53, 54, 55, 56, 57, 58, 59, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 692, 733, 734, 795, 796, 851, 852, 904, 905, 925, 934, 935, 937, 938, 953, 954, 957, 958, 961, 962, 965, 966, 969, 970, 973, 974, 977, 978]
with process 2 I share dofs []
with process 3 I share dofs [11, 22, 32, 34, 41, 47, 48, 91, 120, 144, 149, 166, 179, 241, 299, 301, 352, 354, 358, 360, 398, 400, 427, 431, 433, 732, 792, 794, 847, 849, 855, 857, 895, 897, 927, 931, 933, 1145, 1146, 1317, 1318, 1459, 1460, 1487, 1488, 1587, 1588, 1663, 1664]
c.shutdown(hub=True)