[Dune] Parallel AMG on up to 65536 cores on Hector

Eike Mueller E.Mueller at bath.ac.uk
Fri Feb 24 17:00:48 CET 2012


Dear Dune-list,

I have now started some highly parallel runs on Hector where my first 
goal is to get the solver to scale to 65536 cores (the maximal available 
core count in Phase 3 is ~90,000). So far I have done some weak scaling 
runs on 64, 512, 4096 and 32768 cores.

I have not tuned anything yet: I use the ISTL overlapping CG solver 
backend with the parallel AMG preconditioner (with an SSOR point 
smoother). I am not using SuperLU to solve the coarse level problem.
On the smaller machine I have used so far (up to 800 cores), this setup 
already gave quite good results.

Basically, as compute time on Hector is expensive, I would be interested 
in whether anybody already has experience with the ideal setup for the 
parallel AMG for very large core counts, which I could use as a starting 
point.

The two main questions are:

* Will using SuperLU help (or be essential)?
* Will using ParMETIS help (or be essential)? And do I need METIS in 
addition to ParMETIS, or will ParMETIS alone be enough?

The first three runs (on 64, 512 and 4096 cores) look OK: the time per 
iteration increases from 0.6s on 64 cores to 0.65s on 512 cores and 1.1s 
on 4096 cores (on 8 cores I get 0.59s). The 32768 run does not complete 
in 10 minutes, but it gets to the point where it has built the coarse 
grid matrices. This, however, takes 48.7s instead of the 6.5s on 4096 
cores. As 48.7/6.5 = 7.5 is not far from the factor of 8 in the core 
count, the setup phase has effectively stopped scaling.
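
To make the scaling numbers above concrete, here is a small Python sketch 
that computes the weak scaling efficiency from the timings quoted in this 
message. The helper name weak_efficiency and the choice of the 64 core run 
as the baseline are only illustrative assumptions; in a weak scaling run 
the ideal time stays constant, so an efficiency of 1.0 would be perfect.

```python
# Timings quoted in this message (weak scaling: fixed work per core).
time_per_iteration = {8: 0.59, 64: 0.6, 512: 0.65, 4096: 1.1}  # seconds
setup_time = {4096: 6.5, 32768: 48.7}  # seconds to build the coarse grid matrices


def weak_efficiency(times, base):
    """Weak scaling efficiency relative to the run on `base` cores (1.0 is ideal).

    Hypothetical helper for illustration; not part of Dune.
    """
    return {cores: times[base] / t for cores, t in times.items()}


iteration_efficiency = weak_efficiency(time_per_iteration, base=64)
for cores in sorted(iteration_efficiency):
    print(f"{cores:6d} cores: iteration efficiency {iteration_efficiency[cores]:.2f}")

# Compare the growth of the setup time with the growth of the core count.
core_ratio = 32768 / 4096
setup_ratio = setup_time[32768] / setup_time[4096]
print(f"setup time grew by {setup_ratio:.2f}x for {core_ratio:.0f}x the cores")
```

If the setup time grows almost as fast as the core count, the setup is 
behaving roughly like a serial step in that range.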

In the largest run I use 4096 x 4096 x 1024 = 1.7E10 degrees of freedom.
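
As a quick check of the problem size, the following Python sketch 
evaluates the product and the resulting load per core. It assumes one 
degree of freedom per grid cell and that the largest run is the one on 
32768 cores; both are assumptions for illustration.

```python
# Grid dimensions quoted for the largest run.
nx, ny, nz = 4096, 4096, 1024
total_dofs = nx * ny * nz

# Assumption: the largest run is the 32768 core run.
cores = 32768
dofs_per_core = total_dofs // cores

print(f"total degrees of freedom: {total_dofs} (about {total_dofs:.1e})")
print(f"degrees of freedom per core: {dofs_per_core}")
```

The product evaluates to 17179869184, i.e. about 1.7E10, or 524288 
degrees of freedom per core under the assumptions above.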

I observed that for the 4096 and 32768 core runs I get this warning message:
'Stopped coarsening because of rate breakdown 32768/32768=1<1.2
and the hierarchy is built up to 9 level only.'
I guess this is potentially a problem if I do not use SuperLU.
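
The numbers in the warning suggest a simple ratio test, which can be 
sketched in Python as below. This is only a reading of the message text, 
not the actual ISTL code: the function name coarsening_stalled and its 
parameter names are hypothetical, and only the values 32768 and 1.2 are 
taken from the warning.

```python
def coarsening_stalled(fine_unknowns, coarse_unknowns, min_coarsen_rate=1.2):
    """Return True if one coarsening step reduces the unknowns by less than
    the required factor (hypothetical helper; threshold taken from the warning)."""
    return fine_unknowns / coarse_unknowns < min_coarsen_rate


# Values from the warning: the next level would have as many unknowns as the current one.
stalled = coarsening_stalled(32768, 32768)
print(f"rate = {32768 / 32768:.1f}, coarsening stalled: {stalled}")

# A step that halves the number of unknowns would pass the same test.
print(coarsening_stalled(32768, 16384))
```

A rate of exactly 1 means the level could not be coarsened at all, so the 
coarsest level problem still has 32768 unknowns when the hierarchy stops.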

I have not compiled with ParMETIS support, which is why I get this 
message as well:
'Successive accumulation of data on coarse levels only works with 
ParMETIS installed.  Fell back to accumulation to one domain on coarsest 
level'

Thank you very much for any ideas,

Eike




