[dune-pdelab] Fwd: Fwd: solver fails to reset correctly after FMatrixError (singular matrix)
Dmitry Mazilkin
dmm16 at tu-clausthal.de
Wed Jul 10 15:26:51 CEST 2019
Hello all Dune developers,
Regarding this explanation:
> I just talked to Jö. We guess that the problem is, that the exception
> is only thrown on one rank, say rank X. All other ranks do not know
> that rank X failed and proceed as usual, at some point all these ranks
> waiting for communication of rank X. That is the deadlock that you see
we see very similar behavior, which is described here:
https://gitlab.dune-project.org/pdelab/dune-pdelab/issues/130
We hit the bug with the following combination:
ISTLBackend_OVLP_GMRES_ILU0
Alexander3
Newton
Best regards,
Dmitry
On 10.07.19 15:21, Markus Blatt wrote:
> On Wed, Jul 10, 2019 at 02:39:09PM +0200, Nils-Arne Dreier wrote:
>> I just talked to Jö. We guess that the problem is, that the exception is
>> only thrown on one rank, say rank X. All other ranks do not know that
>> rank X failed and proceed as usual, at some point all these ranks
>> waiting for communication of rank X. That is the deadlock that you see.
>>
>> You may want to have a look at Dune::MPIGuard in
>> dune/common/parallel/mpiguard.hh. It makes it possible to propagate the
>> error state to all ranks.
>>
>
> One could also argue that if this happens in OneStepMethod of PDELab then
> PDELab (in the long run) should make sure that the behaviour is consistent
> across all processors...
>
> Just my 2 cents.
>
> Markus
>
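To make the idea of propagating the error state concrete, here is a minimal sketch of the mechanism as I understand it from the description above. It deliberately uses neither MPI nor Dune::MPIGuard: the ranks are simulated in a loop, and the collective reduction is replaced by a plain function over a vector of flags. The names local_solve, all_ranks_succeeded and guarded_step are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a collective reduction over per-rank success flags.
// With real MPI this would be an all-reduce; here it is a plain loop.
bool all_ranks_succeeded(const std::vector<bool>& local_success) {
    return std::all_of(local_success.begin(), local_success.end(),
                       [](bool ok) { return ok; });
}

// Hypothetical local solve: throws only on failing_rank, mimicking a
// singular-matrix error that is raised on a single rank.
void local_solve(int rank, int failing_rank) {
    if (rank == failing_rank)
        throw std::runtime_error("singular matrix on rank " +
                                 std::to_string(rank));
}

// Simulates one guarded step on num_ranks ranks. Each rank records its
// local outcome, then all ranks take the same decision based on the
// combined flag. Returns, per rank, whether that rank continues.
std::vector<bool> guarded_step(int num_ranks, int failing_rank) {
    assert(num_ranks >= 0);
    std::vector<bool> local_success(num_ranks, true);
    for (int rank = 0; rank < num_ranks; ++rank) {
        try {
            local_solve(rank, failing_rank);
        } catch (const std::runtime_error&) {
            local_success[rank] = false;  // only this rank knows so far
        }
    }
    // After the reduction every rank knows whether any rank failed.
    const bool global_ok = all_ranks_succeeded(local_success);
    return std::vector<bool>(num_ranks, global_ok);
}
```

With this pattern a failure on one rank makes every rank leave the step together, so no rank is left waiting for communication from the failed one. Without the reduction, only the failing rank would stop, which matches the deadlock described above.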
--
Dmitry Mazilkin
Institut für Mathematik, Raum 314
Erzstraße 1, 38678 Clausthal-Zellerfeld