[Dune] GMSH reader fails in parallel case

Andreas Dedner a.s.dedner at warwick.ac.uk
Mon Jul 14 21:21:34 CEST 2014


As I said, an example would be nice to be able to help you figure this
out, and if you open a flyspray task then we keep an easy-to-find record
of this thread that can help others.
I can't check right now, but I think you need to pass an MPI communicator
for the grid construction. If you don't, you get a lot of serial alugrids
that are not connected - which you might in some cases actually want. Of
course, calling the load balancing should then work nonetheless, so there
is probably a bug here. So the empty constructor should take something
like MPI_COMM_WORLD as argument.
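Something like this (untested; the exact constructor signature may
differ in your ALUGrid version):

  typedef Dune::ALUGrid< 3, 3, Dune::simplex, Dune::nonconforming > HostGridType;
  // empty grid, but attached to the communicator, so that all ranks
  // belong to the same parallel grid rather than to disconnected serial ones
  HostGridType* gridPtr = new HostGridType( MPI_COMM_WORLD );
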
You also need to pass that communicator to the gmsh reader. I am not
quite sure what the correct procedure is, because I don't use it myself,
but perhaps somebody else can pass on that information. There seems to
be a read method taking a grid factory, and the grid factory from ALU
has a constructor taking a communicator, so perhaps that is the right
way of doing it?
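Roughly like this (again untested, and assuming the ALU grid factory
really accepts a communicator):

  Dune::GridFactory< HostGridType > factory( MPI_COMM_WORLD );
  if( rank == 0 )
    Dune::GmshReader< HostGridType >::read( factory, fileName );
  // only rank 0 inserts vertices and elements; the other ranks should
  // end up with empty parts of the same parallel grid
  HostGridType* gridPtr = factory.createGrid();
  gridPtr->loadBalance();
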
Andreas



On 14/07/14 17:13, Sacconi, Andrea wrote:
> Hi all,
>
> following Andreas's suggestions, I added these lines of code:
>
>     HostGridType* gridPtr(nullptr);
>     if(rank == 0)
>       gridPtr = Dune::GmshReader<HostGridType>::read(FileName);
>     else
>       gridPtr = new HostGridType();
>
> so only the process with rank 0 reads the file, while the others initialise an empty grid.
> Then I call:
>
> grid.loadBalance();
>
> but unfortunately this error message appears:
>
> as7211 at macomp000:~/dune-2.3.1/dune-bulk/src$ mpirun -n 2 dune_bulk
> Reading 3d Gmsh grid...
> version 2.2 Gmsh file detected
> file contains 3323 nodes
> file contains 19216 elements
> number of real vertices = 3322
> number of boundary elements = 3036
> number of elements = 15981
>
> Created parallel ALUGrid<3,3,simplex,nonconforming> from macro grid file ''. 
>
> [macomp000:5707] *** An error occurred in MPI_Allgather
> [macomp000:5707] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
> [macomp000:5707] *** MPI_ERR_TRUNCATE: message truncated
> [macomp000:5707] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
>
> So, it appears that the file has been read (by process 0) and the grid initialised correctly. The problem is that process 0 freezes at the end of the reading step. If you comment out the load-balancing line, nothing appears on the screen, because process 0 is frozen.
> Any ideas about this issue? I'm very confused.
>
> Thanks again!
> Andrea
> __________________________________________________________
>
> Andrea Sacconi
> PhD student, Applied Mathematics
> AMMP Section, Department of Mathematics, Imperial College London,
> London SW7 2AZ, UK
> a.sacconi11 at imperial.ac.uk
>
> ________________________________________
> From: dune-bounces+a.sacconi11=imperial.ac.uk at dune-project.org [dune-bounces+a.sacconi11=imperial.ac.uk at dune-project.org] on behalf of Oliver Sander [sander at igpm.rwth-aachen.de]
> Sent: 14 July 2014 15:37
> To: dune at dune-project.org
> Subject: Re: [Dune] GMSH reader fails in parallel case
>
> Am 14.07.2014 14:54, schrieb Andreas Dedner:
>> There has never been a clear decision, I think, on how the grid readers should work in the
>> case that the macro grid is not pre-distributed. In ALU the idea is that the grid is distributed,
>> but in a way that one process has all the elements and the others are empty. Consequently,
>> only process zero should read the gmsh file and the others should generate an empty grid.
>> Now, I remember that UG does it differently, requiring that all processes read the full macro grid.
>> As I said, a place where we need to fix the semantics. DGF does it the ALU way, which is why that
>> works, and the gmshreader does it the UG way...
>>
>> The simplest way to avoid the issue is to surround the call to the gmshreader with if (rank==0)
>> and to construct empty ALUGrids in the else branch - but then I assume UG would not be happy....
>>
> I don't think so.  UGGrid contains extra code to handle that case.  I don't really know how much
> testing it got, though.
> --
> Oliver
>
>> Andreas
>>
>> PS: it would help if you could open a flyspray task with a report and a test program; I could
>> then add my five cents from above. This would increase the chances that we actually discuss
>> this at the developer meeting in September.
>>
>>
>> On 14/07/14 13:00, Sacconi, Andrea wrote:
>>> Hi DUNErs,
>>>
>>> I would like to ask you a question about the GMSH reader for parallel computation (with Open MPI 1.6.5). I am using AlugridSimplex<3,3> for a standard Poisson problem.
>>> Everything is fine in the sequential case, while in the parallel case I get the error reported below.
>>>
>>> as7211 at macomp01:~/dune-2.3.1/dune-bulk/src$ mpirun -n 2 dune_bulk
>>> Reading 3d Gmsh grid...
>>> Reading 3d Gmsh grid...
>>> version 2.2 Gmsh file detected
>>> version 2.2 Gmsh file detected
>>> file contains 3323 nodes
>>> file contains 3323 nodes
>>> file contains 19216 elements
>>> file contains 19216 elements
>>> terminate called after throwing an instance of 'Dune::GridError'
>>> [macomp01:03890] *** Process received signal ***
>>> [macomp01:03890] Signal: Aborted (6)
>>> [macomp01:03890] Signal code:  (-6)
>>> [macomp01:03890] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7ff363b7b340]
>>> [macomp01:03890] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39) [0x7ff3637dbf79]
>>> [macomp01:03890] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7ff3637df388]
>>> [macomp01:03890] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7ff3643056b5]
>>> [macomp01:03890] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e836) [0x7ff364303836]
>>> [macomp01:03890] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e863) [0x7ff364303863]
>>> [macomp01:03890] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5eaa2) [0x7ff364303aa2]
>>> [macomp01:03890] [ 7]
>>> dune_bulk(_ZN4Dune16ALU3dGridFactoryINS_7ALUGridILi3ELi3ELNS_18ALUGridElementTypeE0ELNS_21ALUGridRefinementTypeE1EP19ompi_communicator_tEEE12insertVertexERKNS_11FieldVectorIdLi3EEE+0x1b2) [0x616c82]
>>>
>>> Any idea about how to make only the master process read the grid, and not all the processes? In any case, how can the issue be fixed?
>>> By the way, if I use the DGF reader everything runs fine, in both the sequential and the parallel case.
>>>
>>> Thanks in advance!
>>> Andrea
>>> __________________________________________________________
>>>
>>> Andrea Sacconi
>>> PhD student, Applied Mathematics
>>> AMMP Section, Department of Mathematics, Imperial College London,
>>> London SW7 2AZ, UK
>>> a.sacconi11 at imperial.ac.uk
>>> _______________________________________________
>>> Dune mailing list
>>> Dune at dune-project.org
>>> http://lists.dune-project.org/mailman/listinfo/dune
>>
>> _______________________________________________
>> Dune mailing list
>> Dune at dune-project.org
>> http://lists.dune-project.org/mailman/listinfo/dune
>
>
> _______________________________________________
> Dune mailing list
> Dune at dune-project.org
> http://lists.dune-project.org/mailman/listinfo/dune
