[dune-pdelab] Fwd: Fwd: solver fails to reset correctly after FMatrixError (singular matrix)
Markus Blatt
markus at dr-blatt.de
Fri Jul 26 17:08:18 CEST 2019
On Thu, Jul 25, 2019 at 05:28:33PM +0200, Shubhangi Gupta wrote:
> Hi Markus,
>
> Thanks a lot for your advice.
>
> I corrected the implementation of the mpiguard as per your suggestion (both
> in the main time loop and in the ovlpistlsolverbackend). Two notable things
> I observe:
>
> 1. The mpiguard **seems** to work on my local machine... as in, I have run
> my simulations for a number of parameter sets, and my linear solver hasn't
> frozen *yet*. But the mpiguard doesn't work on the copy of the code on our
> university server!
>
That is a bit weird.
> 2. It seems that the mpiguard is making the code slower ... can this be?
>
> Also, yes, I agree that my linear system could be ill-conditioned (or weird,
> as you put it). I have a complicated setting with rather extreme properties
> taken from the Black Sea cores. But I think the linear/nonlinear solvers
> shouldn't fail partially, and communication failure between processes is
> certainly not a good sign for the solvers in general... or? I would expect
> the solver to simply not converge overall if the linear system is
> incorrect... not freeze halfway and stop communicating.
Well, as you experienced, sometimes you have the choice between making everything
slow and catering for every corner case.
Anyway, there is a better solution for your case that I came up with.
Use one of the AMG backends with ILU as the smoother, e.g. ISTLBackend_BCGS_AMG_ILU0.
Here the ILU factorization is computed while setting up the AMG, and the exception
will be thrown there. This happens in ISTLBackend_AMG::apply, where you change
// only construct a new AMG if the matrix changes
if (reuse==false || firstapply==true){
  amg.reset(new AMG(oop, criterion, smootherArgs, oocc));
  firstapply = false;
  stats.tsetup = watch.elapsed();
  stats.levels = amg->maxlevels();
  stats.directCoarseLevelSolver = amg->usesDirectCoarseLevelSolver();
}
to
// only construct a new AMG if the matrix changes
if (reuse==false || firstapply==true){
  MPIGuard guard; // watches this scope for exceptions until finalize()
  amg.reset(new AMG(oop, criterion, smootherArgs, oocc));
  guard.finalize(); // collective check: ranks that got here throw
                    // MPIGuardError if any rank failed before this point
  firstapply = false;
  stats.tsetup = watch.elapsed();
  stats.levels = amg->maxlevels();
  stats.directCoarseLevelSolver = amg->usesDirectCoarseLevelSolver();
}
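For completeness: MPIGuard lives in dune-common. Assuming a reasonably recent
version, the declarations you need at the top of the backend file are

  #include <dune/common/parallel/mpiguard.hh>
  using Dune::MPIGuard;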
Your code then does not need any MPIGuard itself; it only has to catch the
exception, which is now thrown on all processors. You can also undo the changes
to OverlappingPreconditioner::apply, as the exception is no longer thrown in
the preconditioner's apply method.
The performance penalty should be much smaller now.
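For illustration, here is a minimal sketch of what the catch site in your time
loop could look like. The one-step method osm, the vectors uold/unew, and the
halved time step are assumptions for the sketch, not code from your application:

  try {
    osm.apply(time, dt, uold, unew); // linear solve happens in here
  }
  catch (Dune::Exception& e) {
    // The rank whose AMG setup hit the singular matrix sees the original
    // Dune::FMatrixError; all other ranks get a Dune::MPIGuardError from
    // guard.finalize(). Both derive from Dune::Exception, so every rank
    // ends up here and a coordinated retry with a smaller step is possible.
    unew = uold;
    dt *= 0.5;
  }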
Markus
--
Dr. Markus Blatt - HPC-Simulation-Software & Services http://www.dr-blatt.de
Pedettistr. 38, 85072 Eichstätt, Germany
Tel.: +49 (0) 160 97590858