[dune-pdelab] Scalability issue in overlapping solvers

Peter Bastian Peter.Bastian at iwr.uni-heidelberg.de
Tue Jan 11 14:12:38 CET 2022


Dear Aswin,

Admittedly, your results are a bit strange.

- The speedup in assembly (the results in your previous mail) seems to be OK.

- Your processor (Intel Core™ i7-10700 CPU @ 2.90GHz) has only two memory
channels, so I would not expect a speedup much beyond 2, but this might
explain the numbers when you go from 2 to 4 processors. I am also not sure
about the Turbo mode. Can you switch it off? On the other hand, the
assembly seems to be fine.

- In your old mail with the inexact solver in additive Schwarz you had
also included one processor, but there was hardly any speedup from 1 to 2.

You might go back to that ISTLBackend_OVLP_BCGS_ILU0 solver and try
different problem sizes (e.g. such that the local problem in one process
fits into cache) and inner iterations (to increase the ratio of
computation to communication) and see what you get.
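
For example (an untested sketch from memory -- GFS, CC, GO and V stand for
your own grid function space, constraints container, grid operator and
coefficient vector type, and the exact header paths and constructor
arguments may differ between PDELab versions, so please check
ovlpistlsolverbackend.hh):

  #include <dune/pdelab/backend/istl.hh>
  #include <dune/pdelab/stationary/linearproblem.hh>

  // overlapping additive Schwarz: BiCGStab with ILU(0) as subdomain solver
  using LS = Dune::PDELab::ISTLBackend_OVLP_BCGS_ILU0<GFS,CC>;
  LS ls(gfs, cc, 5000 /*maxiter*/, 2 /*verbose*/);

  // the SSORk variant has an inner-iteration knob (number of SSOR sweeps
  // per preconditioner application), if I remember the signature correctly:
  //   using LS = Dune::PDELab::ISTLBackend_OVLP_BCGS_SSORk<GFS,CC>;
  //   LS ls(gfs, cc, 5000 /*maxiter*/, 5 /*steps*/, 2 /*verbose*/);

  Dune::PDELab::StationaryLinearProblemSolver<GO,LS,V> slp(go, ls, x, 1e-10);
  slp.apply();

The per-process problem size you then control simply via the number of
grid cells per rank.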

You might also want to try the ISTLBackend_BCGS_AMG_SSOR solver. This one
can also be used with only one subdomain.
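
Roughly like this (again untested; the backend is templated on the grid
operator, and as far as I remember the first constructor argument is the
trial grid function space):

  // BiCGStab preconditioned with AMG (SSOR smoother); this backend also
  // works on a single rank, which gives you a meaningful np=1 baseline.
  using LS = Dune::PDELab::ISTLBackend_BCGS_AMG_SSOR<GO>;
  LS ls(gfs, 5000 /*maxiter*/, 2 /*verbose*/);
  Dune::PDELab::StationaryLinearProblemSolver<GO,LS,V> slp(go, ls, x, 1e-10);
  slp.apply();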

Best regards,

Peter


On 11.01.22 at 13:09, Aswin vs wrote:
> Sir,
> As you suggested, I am currently looking into the GenEO examples given
> in dune-composite-master. Still, I could not see any scalability in the
> test example.
>
> In the test Example01b I have set Partitioning[0] = 2 and
> Partitioning[0] = 4 for 2 and 4 processors, respectively. If I change it
> to Partitioning[0] = 8, then I get the error "YaspGrid does not support
> degrees of freedom shared by more than immediately neighboring
> subdomains." I then set refineBaseGrid = 2, but got the same error.
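
One remark on the Partitioning error: with Partitioning[0] = 8 all eight
ranks are stacked along the x direction, so the slabs can become so thin
that a degree of freedom would be shared beyond the immediately
neighboring subdomain, which is exactly what the error message complains
about. If Example01b lets you set the other directions too, spreading the
ranks over several directions should avoid this. Untested sketch, assuming
Partitioning holds the per-direction processor counts that are handed to
YaspGrid:

  // 8 ranks as a 4 x 2 x 1 processor grid instead of 8 thin slabs in x;
  // 2 x 2 x 2 would also work if the geometry allows it
  Partitioning[0] = 4;
  Partitioning[1] = 2;
  Partitioning[2] = 1;

Each subdomain then keeps more elements per partitioned direction for the
overlap.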
>
> Presently, I am using a desktop with a single Intel Core™ i7-10700 CPU @
> 2.90GHz, which has 8 cores / 16 threads.
>
> The outputs of the simple runs are given below for your reference. In
> each case I found that the rate is not decreasing, the number of
> iterations is increasing, and the overall time is also increasing.
>
> Can you suggest how to run these examples efficiently? Also, I would
> like to know how to print dim(V_H), as mentioned in the paper.
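
On dim(V_H): in the output below, "Global basis size B=8" is exactly 2+6
picked eigenvectors and B=19 is 1+6+6+6, so the printed B should already
be the coarse-space dimension dim(V_H). If you want to print it yourself,
a sketch (numLocalEigenvectors is a hypothetical name for however your
GenEO setup exposes the number of eigenvectors picked on this rank):

  const int nLocalEV = numLocalEigenvectors;  // hypothetical per-rank count
  const int dimVH = gv.comm().sum(nLocalEV);  // gv: your GridView
  if (gv.comm().rank() == 0)
    std::cout << "dim(V_H) = " << dimVH << std::endl;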
>
> Thank you for your time and consideration,
>
> $ mpirun -np 2 ./Example01b
> *NumRegions
> 1
> Number of Regions = 1
> maxPly = 23
> periodic = 000
> === Building Geometry
> Elements in x direction
> Number of elements per processor: 1925
> Number of nodes per processor: 2592
> Grid transformation complete
> Grid view set up
> Piecewise quadratic serendipity elements
> Starting solve with Geneo Preconditioner
> Eigenvalue threshhold: 0.00769231
> Process 1 picked 6 eigenvectors
> Process 0 picked 2 eigenvectors
> Matrix setup
> Global basis size B=8
> Matrix setup finished: M=0.0490446
> Geneo setup time 23.4082
> === Dune::CGSolver
> Min eigv estimate: 0.0676291
> Max eigv estimate: 2.99961
> Condition estimate: 44.3538
> === rate=0.534884, T=2.99187, TIT=0.135994, IT=22
> Solver: CG
> Preconditioner: GenEO
> Subdomain Solver: UMFPack
> =================
> Solver Converged: 1Solution of my Problem = 0.000166984
>
>
> $ mpirun -np 4 ./Example01b
> *NumRegions
> 1
> Number of Regions = 1
> maxPly = 23
> periodic = 000
> === Building Geometry
> Elements in x direction
> Number of elements per processor: 1050
> Number of nodes per processor: 1512
> Grid transformation complete
> Grid view set up
> Piecewise quadratic serendipity elements
> Starting solve with Geneo Preconditioner
> Eigenvalue threshhold: 0.00909091
> Process 2 picked 6 eigenvectors
> Process 3 picked 6 eigenvectors
> Process 1 picked 6 eigenvectors
> Process 0 picked 1 eigenvectors
> Matrix setup
> Global basis size B=19
> Matrix setup finished: M=0.119121
> Geneo setup time 24.9791
> === Dune::CGSolver
> Min eigv estimate: 0.0249189
> Max eigv estimate: 2.9999
> Condition estimate: 120.386
> === rate=0.769396, T=5.08444, TIT=0.110531, IT=46
> Solver: CG
> Preconditioner: GenEO
> Subdomain Solver: UMFPack
> =================
> Solver Converged: 1Solution of my Problem = 0.000166984
>
>
>
>
> Thank you.
>
> On Mon, Jan 3, 2022 at 11:41 PM Linus Seelinger 
> <linus.seelinger at iwr.uni-heidelberg.de> wrote:
>
>     Hi Aswin,
>
>
>     first of all, I think you might be misled by how periodic
>     boundaries are handled in DUNE. Periodic boundaries (at least
>     using YASPGrid) require a parallel run (i.e. more than one MPI
>     rank), since they essentially use the same overlapping
>     communication framework that otherwise handles "regular" overlaps
>     between subdomains. Think of a 2D grid, gluing together the
>     periodic ends; in the resulting cylinder shape, subdomains at
>     periodic boundaries are just regular neighbors and can be handled
>     accordingly.
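
A periodic YaspGrid is set up roughly as follows (untested sketch,
assuming Dune::MPIHelper::instance(argc, argv) has been called and, as
Linus says, the program is run with more than one rank):

  #include <array>
  #include <bitset>
  #include <dune/grid/yaspgrid.hh>

  constexpr int dim = 2;
  Dune::FieldVector<double,dim> upperRight{1.0, 1.0};
  std::array<int,dim> cells{64, 64};
  std::bitset<dim> periodic;   // all directions non-periodic by default
  periodic[0] = true;          // make the x direction periodic
  Dune::YaspGrid<dim> grid(upperRight, cells, periodic, 1 /*overlap*/);

The periodic neighbors are then handled through the same overlap
communication described above.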
>
>
>     Long story short, I think the sequential (-np 1) run does not give
>     you a correct solution (write to VTK and check the output to
>     confirm) and is therefore not a good reference. The other runs do
>     not look as bad in terms of scalability.
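
For the VTK check, writing out the PDELab solution looks roughly like this
(untested; gv, gfs and x are your grid view, grid function space and
solution vector, and the header paths are from memory):

  #include <memory>
  #include <dune/grid/io/file/vtk/vtkwriter.hh>
  #include <dune/pdelab/gridfunctionspace/gridfunctionspaceutilities.hh>
  #include <dune/pdelab/common/vtkexport.hh>

  using DGF = Dune::PDELab::DiscreteGridFunction<GFS,V>;
  DGF dgf(gfs, x);
  Dune::VTKWriter<GV> vtkwriter(gv);
  vtkwriter.addVertexData(
      std::make_shared<Dune::PDELab::VTKGridFunctionAdapter<DGF>>(dgf, "solution"));
  vtkwriter.write("solution");   // one piece per rank plus a .pvtu in parallel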
>
>
>     If you still need better scalability, you might look into more
>     advanced methods. The appropriate choice then depends a lot on the
>     particular problem you want to solve, so more detail from you
>     would be helpful.
>
>
>     One example might be GenEO, which we could scale up to around
>     16000 cores (see https://doi.org/10.1007/978-3-030-43229-4_11 ).
>     Might be a bit overkill though depending on what you want to do.
>
>
>     Best,
>
>
>     Linus
>
>
>     On Monday, 3 January 2022 at 11:09:14 CET, Aswin vs wrote:
>
>     > Hello,
>     >
>     > Can somebody suggest to me how to get good scalability while using
>     > the overlapping solvers in DUNE? I tried the following test example
>     > in DUNE PDELab, but I am not getting good scalability.
>     > Thank you.
>     >
>     > $ mpirun -np 1 ./test-heat-instationary-periodic
>     > .
>     > .
>     > .
>     > STAGE 2 time (to):   1.5000e-02.
>     > === matrix setup skipped (matrix already allocated)
>     > === matrix assembly (max) 0.9944 s
>     > === residual assembly (max) 0.4861 s
>     > === solving (reduction: 1e-10) === Dune::BiCGSTABSolver
>     > === rate=0.8397, T=12.57, TIT=0.09524, IT=132
>     > 12.68 s
>     > ::: timesteps           2 (2)
>     > ::: nl iterations     565 (565)
>     > ::: lin iterations    565 (565)
>     > ::: assemble time    8.0477e+00 (8.0477e+00)
>     > ::: lin solve time   5.3414e+01 (5.3414e+01)
>     > ---------------------------------------------------------------------------------------
>     > $ mpirun -np 2 ./testheat-instationary-periodic
>     > .
>     > .
>     > .
>     > STAGE 2 time (to):   1.5000e-02.
>     > === matrix setup skipped (matrix already allocated)
>     > === matrix assembly (max) 0.5053 s
>     > === residual assembly (max) 0.2465 s
>     > === solving (reduction: 1e-10) === Dune::BiCGSTABSolver
>     > === rate=0.9268, T=26.95, TIT=0.08895, IT=303
>     > 27.05 s
>     > ::: timesteps           2 (2)
>     > ::: nl iterations    1254 (1254)
>     > ::: lin iterations   1254 (1254)
>     > ::: assemble time    4.0910e+00 (4.0910e+00)
>     > ::: lin solve time   1.1201e+02 (1.1201e+02)
>     > ---------------------------------------------------------------------------------------
>     > $ mpirun -np 4 ./testheat-instationary-periodic
>     > .
>     > .
>     > .
>     > STAGE 2 time (to):   1.5000e-02.
>     > === matrix setup skipped (matrix already allocated)
>     > === matrix assembly (max) 0.271 s
>     > === residual assembly (max) 0.1318 s
>     > === solving (reduction: 1e-10) === Dune::BiCGSTABSolver
>     > === rate=0.9232, T=26.02, TIT=0.0894, IT=291
>     > 26.11 s
>     > ::: timesteps           2 (2)
>     > ::: nl iterations    1249 (1249)
>     > ::: lin iterations   1249 (1249)
>     > ::: assemble time    2.1746e+00 (2.1746e+00)
>     > ::: lin solve time   1.1165e+02 (1.1165e+02)
>     > ---------------------------------------------------------------------------------------
>     > $ mpirun -np 8 ./testheat-instationary-periodic
>     > .
>     > .
>     > .
>     > STAGE 2 time (to):   1.5000e-02.
>     > === matrix setup skipped (matrix already allocated)
>     > === matrix assembly (max) 0.1772 s
>     > === residual assembly (max) 0.08259 s
>     > === solving (reduction: 1e-10) === Dune::BiCGSTABSolver
>     > === rate=0.9288, T=30.81, TIT=0.09751, IT=316
>     > 30.89 s
>     > ::: timesteps           2 (2)
>     > ::: nl iterations    1329 (1329)
>     > ::: lin iterations   1329 (1329)
>     > ::: assemble time    1.3485e+00 (1.3485e+00)
>     > ::: lin solve time   1.2796e+02 (1.2796e+02)
>
> _______________________________________________
> dune-pdelab mailing list
> dune-pdelab at lists.dune-project.org
> https://lists.dune-project.org/mailman/listinfo/dune-pdelab