[dune-pdelab] Fwd: Fwd: solver fails to reset correctly after FMatrixError (singular matrix)
Markus Blatt
markus at dr-blatt.de
Wed Jul 24 11:25:55 CEST 2019
Please always reply to the list. Free consulting is only available there.
The solution to your problem is at the bottom. Please also read the rest,
as you seem to use MPIGuard the wrong way.
On Wed, Jul 24, 2019 at 09:42:01AM +0200, Shubhangi Gupta wrote:
> Hi Markus,
>
> Thanks a lot for your reply! I am answering your questions below...
>
> 1. Does at the highest level mean outside the try clause? That might be wrong as it will throw if something went wrong. It needs to be inside the try clause.
>
> By highest level, I meant **inside** the try clause.
I really have no experience with MPIGuard. Maybe someone else can tell us where
it throws, but I think you are using it wrong.
>
> Dune::MPIGuard guard;
>
This would be outside the try clause. But that might be right as MPIGuard
throws during finalize.
> bool exceptionCaught = false;
>
> while( time < t_END ){
>
> try{
>
Personally I would have initialized the MPIGuard here, but it seems like your
approach is valid too, as you reactivate the guard.
> // reactivate the guard for the next critical operation
> guard.reactivate();
>
> osm.apply( time, dt, uold, unew );
>
> exceptionCaught = false;
>
Here you definitely need to tell it that you passed the critical section:
guard.finalize();
> }catch ( Dune::Exception &e ) {
> exceptionCaught = true;
>
> // tell the guard that you successfully passed a critical operation
> guard.finalize();
This is too late! You have already experienced any exception there might be.
>
> unew = uold;
>
> dt *= 0.5;
>
> osm_tch.getPDESolver().discardMatrix();
>
> continue;
> }
>
> uold = unew;
> time += dt;
> }
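To make the ordering concrete, here is a minimal sketch of the retry loop with
the guard created inside the try block and finalized right after the critical
call. It is not Dune code: SketchGuard is a hypothetical single-process
stand-in for Dune::MPIGuard that only counts what a real guard would report to
the other ranks, and step() is a hypothetical stand-in for osm.apply() that
fails while dt is too large. With the real guard, the other ranks would
presumably see the failure as an exception thrown from their own finalize().

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical single-process stand-in for Dune::MPIGuard: it only records
// what the real guard would communicate to the other ranks.
struct SketchGuard {
  bool active = true;
  int& failuresReported;
  explicit SketchGuard(int& counter) : failuresReported(counter) {}
  // Called once the critical section has been passed.
  void finalize(bool success = true) {
    active = false;
    if (!success) ++failuresReported;
  }
  // Leaving the scope while still active means an exception is unwinding.
  ~SketchGuard() { if (active) ++failuresReported; }
};

// Hypothetical stand-in for osm.apply(): fails while the step is too large.
void step(double dt, double uold, double& unew) {
  if (dt > 0.3) throw std::runtime_error("singular matrix");
  unew = uold + dt;
}

struct LoopResult { double time; double dt; int retries; int failuresReported; };

LoopResult runLoop(double tEnd, double dt) {
  assert(dt > 0.0);
  double time = 0.0, uold = 0.0, unew = 0.0;
  int retries = 0, failures = 0;
  while (time < tEnd) {
    try {
      SketchGuard guard(failures);  // guard lives inside the try block
      step(dt, uold, unew);
      guard.finalize();             // critical section passed
    } catch (const std::exception&) {
      unew = uold;                  // roll back to the last accepted state
      dt *= 0.5;                    // retry with half the step
      ++retries;
      continue;
    }
    uold = unew;
    time += dt;
  }
  return {time, dt, retries, failures};
}
```

Calling runLoop(1.0, 1.0) fails twice (dt = 1 and dt = 0.5) and then proceeds
with dt = 0.25. The failure is reported while the exception unwinds, not in
the catch block.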
>
> 2. freezes means deadlock (stopping at an iteration and never finishing)? That will happen in your code if the MPIGuard is before the try clause.
>
> Yes, freezes means stopping at the iteration and never finishing it.
>
> So first, this was happening right after FMatrixError (singular matrix).
> The osm froze without initiating Newton solver... After I put the MPIGuard,
> this problem was solved... Newton solver restarts as it should... But now
> the freezing happens with the linear solver (BiCGStab, in this case). Nils
> said to solve this I will have to put the MPIGuard also on lower levels
> (inside newton and linear solver...). I, on the other hand, prefer to not
> touch the dune core code and risk introducing more errors along the way...
>
That is because in your case different processors will work with different
timesteps, and that cannot work, as the linear system is utterly wrong.
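As a toy illustration of this point, the sketch below simulates several ranks
inside a single process. No MPI or Dune is involved, and nextDt() and its
parameters are hypothetical names chosen for this example. Without a guard,
only the rank that saw the exception halves its timestep. With a guard, the
failure flags are combined first, so all ranks take the same decision. In a
real parallel run the unguarded ranks would typically not even get this far,
but block in the next collective call.

```cpp
#include <cassert>
#include <vector>

// One retry decision per simulated rank. localFailed[r] says whether the
// solve on rank r threw an exception.
std::vector<double> nextDt(const std::vector<bool>& localFailed, double dt,
                           bool useGuard) {
  assert(dt > 0.0);
  // With a guard, the flags are combined (a collective reduction in real
  // MPI code), so every rank learns whether any rank failed.
  bool anyFailed = false;
  for (bool f : localFailed) anyFailed = anyFailed || f;

  std::vector<double> result;
  for (bool f : localFailed) {
    // Without a guard each rank only knows about its own failure.
    const bool retry = useGuard ? anyFailed : f;
    result.push_back(retry ? 0.5 * dt : dt);
  }
  return result;
}
```

With flags {false, true, false} and dt = 1, the unguarded variant yields the
timesteps 1, 0.5, 1, while the guarded variant yields 0.5 on every rank.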
> 3. ....have you tried the poor-man's solution, below? ...
>
> Yes, I tried that, but the problem is if the apply step doesn't finish, then
> nothing really happens...
>
Finally I understand. You are using Dune::PDELab::ISTLBackend_BCGS_AMG_SSOR<IGO>.
You must have a very weird linear system, as this bug can only appear when
inverting the diagonal block in the application of one SSOR step. Personally
I would say that your linear system is incorrect/not very sane.
The bug is in PDELab, which does not expect an exception
during the application of the preconditioner. It has to be fixed there, in
file ovlpistlsolverbackend.hh, in OverlappingPreconditioner::apply:
MPIGuard guard;
prec.apply(Backend::native(v),Backend::native(dd));
guard.finalize(true);
and probably in many more places. In addition, this construct is also needed in the
constructor of AMG, as the same can happen if ILU is used as the smoother.
Please make your patch available afterwards.
HTH
Markus
--
Dr. Markus Blatt - HPC-Simulation-Software & Services http://www.dr-blatt.de
Pedettistr. 38, 85072 Eichstätt, Germany, USt-Id: DE279960836
Tel.: +49 (0) 160 97590858