[dune-pdelab] Fwd: Fwd: solver fails to reset correctly after FMatrixError (singular matrix)

Shubhangi Gupta sgupta at geomar.de
Wed Jul 24 10:36:41 CEST 2019


Hi Nils,

Thanks a lot! I managed to install blackchannel-ulfm. While building
Dune with the following CMake opts:

CMAKE_FLAGS="
-DCMAKE_C_COMPILER='/usr/bin/gcc'
-DCMAKE_CXX_COMPILER='/usr/bin/g++-7'
-DCMAKE_Fortran_COMPILER='/usr/bin/gfortran'
-DBLACKCHANNEL_INCLUDE_DIR='/usr/local/include'
-DBLACKCHANNEL_LIBRARIES='/usr/local/lib'
-DCMAKE_CXX_FLAGS_RELEASE='-O3 -DNDEBUG -g0 -Wno-deprecated-declarations -funroll-loops'
-DCMAKE_BUILD_TYPE=Release
-DDUNE_SYMLINK_TO_SOURCE_TREE=1
"

I get the following message:

   Manually-specified variables were not used by the project:

     BLACKCHANNEL_INCLUDE_DIR
     BLACKCHANNEL_LIBRARIES


How can I check (or force) whether Dune indeed finds the blackchannel library?

Thanks again, and warm wishes, Shubhangi


On 23.07.19 16:45, Jö Fahlke wrote:
> On Tue, 23 Jul 2019, 15:26:39 +0200, Shubhangi Gupta wrote:
>> Sorry, I am still struggling with this issue... and my BiCGStab solver is
>> freezing a lot more often, so I can't ignore this...
>>
>> About the ULFM... you sent me the following link:
>>
>> https://gitlab.dune-project.org/exadune/blackchannel-ulfm
> That is a (more-or-less) standard cmake buildsystem, i.e. it works outside of
> dune.  Try something like this (untested, replace the "..." as needed):
> ```sh
> git clone https://gitlab.dune-project.org/exadune/blackchannel-ulfm
> mkdir build
> ( cd build && cmake ../blackchannel-ulfm -DCMAKE_INSTALL_PREFIX=... )
> make -C build install
> ```
>
> Then, in your Dune opts file, you may need to set
> `-DBLACKCHANNEL_INCLUDE_DIR=.../include -DBLACKCHANNEL_LIBRARIES=.../lib` (see
> [1]) in the `CMAKE_FLAGS` and Dune should pick the library up when
> reconfiguring.
>
> [1]: https://gitlab.dune-project.org/core/dune-common/blob/edef55ec9ed40617d12648d6ec95cbfc7120c676/cmake/modules/FindBlackChannel.cmake
>
> Regards,
> Jö.
>
>> Sorry if this is a trivial question, but how should I compile this? With
>> dune-build? And how should I include this in my code?
>>
>> Thanks, and warm wishes, Shubhangi
>>
>>
>> On 12.07.19 13:38, Nils-Arne Dreier wrote:
>>> Hi Shubhangi,
>>>
>>> you have to call the MPIGuard::finalize() method after the point where
>>> the exception might be thrown and before the next communication is
>>> performed. From the information you provided, I guess that the
>>> exception is thrown in the smoother of the AMG, which makes things
>>> slightly complicated. Maybe AMG::mgc is a good starting point.
>>>
>>> By the way: If you use the ULFM things I described previously, you can
>>> use the MPIGuard on the coarsest level and don't need to call
>>> MPIGuard::finalize() after every critical section.
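>>>
>>> To make that concrete, here is an untested sketch of what I mean, using
>>> the time loop from your code outline. How exactly the error surfaces on
>>> the non-failing ranks depends on the ULFM-enabled MPIGuard from the
>>> merge request, so treat this only as the intended pattern, not as
>>> working code:
>>>
>>> ```c++
>>> #include <dune/common/parallel/mpiguard.hh>
>>>
>>> while (time < t_END - 1e-8) {
>>>     Dune::MPIGuard guard;               // one guard around the whole solve
>>>     try {
>>>         osm.apply(time, dt, uold, unew);
>>>         guard.finalize();               // collective: passes if no rank failed
>>>     } catch (Dune::Exception&) {
>>>         // FMatrixError on the failing rank, guard-propagated error elsewhere
>>>         unew = uold;
>>>         dt *= 0.5;
>>>         osm.getPDESolver().discardMatrix();
>>>         continue;
>>>     }
>>>     uold = unew;
>>>     time += dt;
>>> }
>>> ```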
>>>
>>> Regards
>>> Nils
>>>
>>> On 11.07.19 14:56, Shubhangi Gupta wrote:
>>>> Dear Jö and Nils,
>>>>
>>>> Thanks a lot for your replies.
>>>>
>>>> I actually tried putting the mpiguard within the time loop (at the
>>>> highest level) just to see what happens... Indeed, the one step method
>>>> now proceeds as it should, but the BiCGSTab freezes... So yeah, as Jö
>>>> mentioned, the mpiguard needs to be introduced inside the
>>>> ISTL-solver... I am not very sure how and where exactly though! Any
>>>> ideas?
>>>>
>>>> Thanks again, and warm wishes, Shubhangi
>>>>
>>>> On 10.07.19 14:52, Jö Fahlke wrote:
>>>>> On Wed, 10 Jul 2019, 14:39:09 +0200, Nils-Arne Dreier wrote:
>>>>>> Hi Shubhangi,
>>>>>>
>>>>>> I just talked to Jö. We guess that the problem is that the exception
>>>>>> is thrown on only one rank, say rank X. All other ranks do not know
>>>>>> that rank X failed and proceed as usual; at some point, all these
>>>>>> ranks end up waiting for communication from rank X. That is the
>>>>>> deadlock that you see.
>>>>>>
>>>>>> You may want to have a look at Dune::MPIGuard in
>>>>>> dune/common/parallel/mpiguard.hh. It makes it possible to propagate the
>>>>>> error state to all ranks.
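>>>>>>
>>>>>> The basic pattern looks roughly like this (a minimal sketch;
>>>>>> do_critical_work() is just a placeholder for whatever might throw on
>>>>>> a single rank, and as Jö points out below, the hard part in your case
>>>>>> is that this point sits somewhere inside the ISTL solver):
>>>>>>
>>>>>> ```c++
>>>>>> #include <dune/common/parallel/mpiguard.hh>
>>>>>>
>>>>>> {
>>>>>>     Dune::MPIGuard guard;   // uses the MPIHelper communicator by default
>>>>>>
>>>>>>     do_critical_work();     // may throw on one rank only
>>>>>>
>>>>>>     // Collective check: if any rank left the block above via an exception,
>>>>>>     // finalize() throws Dune::MPIGuardError on the remaining ranks as well,
>>>>>>     // so every rank ends up in its exception handler instead of waiting
>>>>>>     // forever for the failed one.
>>>>>>     guard.finalize();
>>>>>> }
>>>>>> ```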
>>>>> It should be mentioned that MPIGuard probably cannot be used at a
>>>>> high level; it would probably need to be introduced into the
>>>>> ISTL solver (BiCGStab, AMG, SSOR) and/or PDELab (the parallel scalar
>>>>> product, Newton) for this to work. Not sure where exactly.
>>>>>
>>>>> Regards,
>>>>> Jö.
>>>>>
>>>>>> There is also a merge request for dune-common, which adapts the
>>>>>> MPIGuard such that you don't need to check for an error state before
>>>>>> communicating, making use of the ULFM proposal for MPI. You can find
>>>>>> it here:
>>>>>> https://gitlab.dune-project.org/core/dune-common/merge_requests/517
>>>>>>
>>>>>> If you don't have an MPI implementation that provides a *working* ULFM
>>>>>> implementation, you may want to use the blackchannel-ulfm lib:
>>>>>> https://gitlab.dune-project.org/exadune/blackchannel-ulfm
>>>>>>
>>>>>> I hope that helps.
>>>>>>
>>>>>> Kind regards
>>>>>> Nils
>>>>>>
>>>>>> On 10.07.19 14:07, Shubhangi Gupta wrote:
>>>>>>> Hi Jö,
>>>>>>>
>>>>>>> So, since you asked about the number of ranks... I tried running the
>>>>>>> simulations again on 2 processes and 1 process. I get the same problem
>>>>>>> with 2, but not with 1.
>>>>>>>
>>>>>>> On 10.07.19 13:33, Shubhangi Gupta wrote:
>>>>>>>> Hi Jö,
>>>>>>>>
>>>>>>>> Yes, I am running it MPI-parallel, on 4 ranks.
>>>>>>>>
>>>>>>>> On 10.07.19 13:32, Jö Fahlke wrote:
>>>>>>>>> Are you running this MPI-parallel?  If yes, how many ranks?
>>>>>>>>>
>>>>>>>>> Regards, Jö.
>>>>>>>>>
>>>>>>>>> On Wed, 10 Jul 2019, 11:55:45 +0200, Shubhangi Gupta wrote:
>>>>>>>>>> Dear pdelab users,
>>>>>>>>>>
>>>>>>>>>> I am currently experiencing a rather strange problem during
>>>>>>>>>> parallel solution of my finite volume code. I have written a
>>>>>>>>>> short outline of my code below for reference.
>>>>>>>>>>
>>>>>>>>>> At some point during computation, if dune throws an error, the
>>>>>>>>>> code catches this error, resets the solution vector to the old
>>>>>>>>>> value, halves the time step size, and tries to redo the
>>>>>>>>>> calculation (osm.apply()).
>>>>>>>>>>
>>>>>>>>>> However, if I get the error "FMatrixError: matrix is singular",
>>>>>>>>>> the solver seems to freeze. Even the initial defect is not shown!
>>>>>>>>>> (See the terminal output below.) I am not sure why this is so,
>>>>>>>>>> and I have not experienced this issue before.
>>>>>>>>>>
>>>>>>>>>> I will be very thankful if someone can help me figure out a way
>>>>>>>>>> around this problem.
>>>>>>>>>>
>>>>>>>>>> Thanks, and warm wishes, Shubhangi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *// code layout*
>>>>>>>>>>
>>>>>>>>>>     ...UG grid, generated using gmsh, GV, ...
>>>>>>>>>>
>>>>>>>>>>     typedef Dune::PDELab::QkDGLocalFiniteElementMap<GV::Grid::ctype, double, 0, dim, Dune::PDELab::QkDGBasisPolynomial::lagrange> FEMP0;
>>>>>>>>>>     FEMP0 femp0;
>>>>>>>>>>     typedef Dune::PDELab::GridFunctionSpace<GV, FEMP0, Dune::PDELab::P0ParallelConstraints, Dune::PDELab::ISTL::VectorBackend<>> GFS0;
>>>>>>>>>>     GFS0 gfs0(gv, femp0);
>>>>>>>>>>     typedef Dune::PDELab::PowerGridFunctionSpace<GFS0, num_of_vars, Dune::PDELab::ISTL::VectorBackend<Dune::PDELab::ISTL::Blocking::fixed>, Dune::PDELab::EntityBlockedOrderingTag> GFS_TCH;
>>>>>>>>>>
>>>>>>>>>>     ... LocalOperator LOP lop, TimeLocalOperator TOP top, GridOperator GO go, InstationaryGridOperator IGO igo, ...
>>>>>>>>>>
>>>>>>>>>>     typedef Dune::PDELab::ISTLBackend_BCGS_AMG_SSOR<IGO> LS;
>>>>>>>>>>     LS ls(gfs, 50, 1, false, true);
>>>>>>>>>>     typedef Dune::PDELab::Newton<IGO, LS, U> PDESOLVER;
>>>>>>>>>>     PDESOLVER pdesolver(igo, ls);
>>>>>>>>>>     Dune::PDELab::ImplicitEulerParameter<double> method;
>>>>>>>>>>
>>>>>>>>>>     Dune::PDELab::OneStepMethod<double, IGO, PDESOLVER, U, U> osm(method, igo, pdesolver);
>>>>>>>>>>
>>>>>>>>>>     // TIME LOOP
>>>>>>>>>>     while (time < t_END - 1e-8) {
>>>>>>>>>>         try {
>>>>>>>>>>             // PDE SOLVE
>>>>>>>>>>             osm.apply(time, dt, uold, unew);
>>>>>>>>>>             exceptionCaught = false;
>>>>>>>>>>         } catch (Dune::Exception& e) {
>>>>>>>>>>             // RESET
>>>>>>>>>>             exceptionCaught = true;
>>>>>>>>>>             std::cout << "Catched Error, Dune reported error: " << e << std::endl;
>>>>>>>>>>             unew = uold;
>>>>>>>>>>             dt *= 0.5;
>>>>>>>>>>             osm.getPDESolver().discardMatrix();
>>>>>>>>>>             continue;
>>>>>>>>>>         }
>>>>>>>>>>         uold = unew;
>>>>>>>>>>         time += dt;
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *// terminal output showing FMatrixError...*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      time = 162.632 , time+dt = 164.603 , opTime = 180 , dt  : 1.97044
>>>>>>>>>>
>>>>>>>>>>      READY FOR NEXT ITERATION.
>>>>>>>>>> _____________________________________________________
>>>>>>>>>>      current opcount = 2
>>>>>>>>>> ****************************
>>>>>>>>>> TCH HYDRATE:
>>>>>>>>>> ****************************
>>>>>>>>>> TIME STEP [implicit Euler]     89 time (from):   1.6263e+02 dt:   1.9704e+00 time (to):   1.6460e+02
>>>>>>>>>> STAGE 1 time (to):   1.6460e+02.
>>>>>>>>>>       Initial defect:   2.1649e-01
>>>>>>>>>> Using a direct coarse solver (SuperLU)
>>>>>>>>>> Building hierarchy of 2 levels (inclusive coarse solver) took 0.2195 seconds.
>>>>>>>>>> === BiCGSTABSolver
>>>>>>>>>>      12.5        6.599e-11
>>>>>>>>>> === rate=0.1733, T=1.152, TIT=0.09217, IT=12.5
>>>>>>>>>>       Newton iteration  1.  New defect:   3.4239e-02.  Reduction (this):   1.5816e-01.  Reduction (total):   1.5816e-01
>>>>>>>>>> Using a direct coarse solver (SuperLU)
>>>>>>>>>> Building hierarchy of 2 levels (inclusive coarse solver) took 0.195 seconds.
>>>>>>>>>> === BiCGSTABSolver
>>>>>>>>>>        17        2.402e-11
>>>>>>>>>> === rate=0.2894, T=1.655, TIT=0.09738, IT=17
>>>>>>>>>>       Newton iteration  2.  New defect:   3.9906e+00.  Reduction (this):   1.1655e+02.  Reduction (total):   1.8434e+01
>>>>>>>>>> Using a direct coarse solver (SuperLU)
>>>>>>>>>> Building hierarchy of 2 levels (inclusive coarse solver) took 0.8697 seconds.
>>>>>>>>>> === BiCGSTABSolver
>>>>>>>>>> Catched Error, Dune reported error: FMatrixError [luDecomposition:/home/sgupta/dune_2_6/source/dune/dune-common/dune/common/densematrix.hh:909]: matrix is singular
>>>>>>>>>> _____________________________________________________
>>>>>>>>>>      current opcount = 2
>>>>>>>>>> ****************************
>>>>>>>>>> TCH HYDRATE:
>>>>>>>>>> ****************************
>>>>>>>>>> TIME STEP [implicit Euler]     89 time (from):   1.6263e+02 dt:   9.8522e-01 time (to):   1.6362e+02
>>>>>>>>>> STAGE 1 time (to):   1.6362e+02.
>>>>>>>>>>
>>>>>>>>>> *... nothing happens here... the terminal appears to freeze...*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>

-- 
Dr. Shubhangi Gupta
Marine Geosystems
GEOMAR Helmholtz Center for Ocean Research
Wischhofstraße 1-3,
D-24148 Kiel

Room: 12-206
Phone: +49 431 600-1402
Email: sgupta at geomar.de
