[Dune] UG grid problem: ID overflow DDD_HdrConstructor

Eike Mueller E.Mueller at bath.ac.uk
Thu Nov 29 19:36:55 CET 2012


Hi Oliver,

thank you very much for trying this out and for the patch. I have now rebuilt UG with the latest patch file you sent me this 
morning, and I still get the segfault. However, it only occurs if I specify the with_ddd_maxprocbits flag; if I do not set it 
(i.e. only use DDD_GID=long, as you do), it runs fine. I have only done a small run so far and haven't tried 7 refinement steps, 
so I cannot say anything about the other error you get.
I tried both with_ddd_maxprocbits=20 and 16, but it does not work in either case. The default is 9 bits, i.e. 2^9=512 
processes, and unfortunately that's not enough for me; I would need at least 2^16=65536.
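
To make the arithmetic concrete, here is a minimal sketch. It assumes that a global ID reserves a fixed number of bits for the 
process number and uses the remaining bits as a per-process counter; I have not checked that this is exactly what DDD does. The 
helper names maxProcs and maxLocalIds are made up for illustration and are not UG or DDD functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of UG or DDD.

// Number of processes that can be addressed with `procBits` bits.
inline std::uint64_t maxProcs(unsigned procBits)
{
  assert(procBits < 64);
  return std::uint64_t(1) << procBits;
}

// Number of distinct IDs left per process if a global ID of `gidBits` bits
// reserves `procBits` of them for the process number (assumed layout).
inline std::uint64_t maxLocalIds(unsigned gidBits, unsigned procBits)
{
  assert(gidBits <= 64 && procBits <= gidBits && gidBits - procBits < 64);
  return std::uint64_t(1) << (gidBits - procBits);
}
```

Under this assumption, a 4-byte ID with 16 process bits leaves maxLocalIds(32, 16) = 65536 IDs per process, whereas an 8-byte 
long leaves maxLocalIds(64, 16) = 2^48.
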
As for the error message you get when reading the .vtu files, could that be because they are written out in 
Dune::VTK::appendedbase64 format? I also get an error message when I open them in paraview:

ERROR: In /home/kitware/ParaView3/Utilities/BuildScripts/ParaView-3.6/ParaView3/VTK/IO/vtkXMLUnstructuredDataReader.cxx, line 522
vtkXMLUnstructuredGridReader (0xa16c918): Cannot read points array from Points in piece 0.  The data array in the element may be 
too short.

When I opened .vtu files produced by my main solver code on HECToR, paraview actually crashed, and this was before I modified 
any of the UG settings. I could fix this by writing the data out in Dune::VTK::ascii format instead. Could this be a big/little 
endian issue? HECToR is 64bit, but my local desktop, where I run paraview to look at the output, is 32bit; I am not sure whether 
that has any impact.
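
To compare the two machines, something like the following could be compiled on each of them. This is only a sketch: the function 
names are made up, and I am not claiming that byte order or word size is what actually breaks the appended base64 files.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not part of Dune, UG or VTK.

// True if this machine stores the least significant byte first.
inline bool hostIsLittleEndian()
{
  const std::uint32_t probe = 1;
  unsigned char firstByte = 0;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}

// Width of long in bits on this machine.
inline unsigned longBits()
{
  return static_cast<unsigned>(sizeof(long) * CHAR_BIT);
}

// Reverse the byte order of a 32-bit value, as a reader would have to do
// if the writing and reading machines disagreed on endianness.
inline std::uint32_t byteSwap32(std::uint32_t v)
{
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

If hostIsLittleEndian() returns the same value on HECToR and on my desktop, byte order can be ruled out, and only the difference 
reported by longBits() remains as a candidate.
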

Eike

Oliver Sander wrote:
> Hi Eike,
> I tried your example with DDD_GID==long, on my laptop where 
> sizeof(long)==8 and sizeof(uint)==4.
> I started the program with
> 
> mpirun -np 6 ./testsphericalgridgenerator sphericalshell_cube_6.dgf 6
> 
> Besides a few DDD warnings that I have never seen before, it works like 
> a charm.
> 
> What version of UG are you using?  I'll send you the very latest patch 
> file just
> to be sure.
> 
> The program runs, but paraview gives me an error message when trying to 
> open
> the output file.  Does that happen to you, too?
> 
> I didn't try the with_ddd_maxprocbits setting.  Does your program crash 
> if you
> do _not_ set this?
> 
> For the worst case: is it possible to get a temporary account on your 
> hector computer?
> 
> best,
> Oliver
> 
> Am 28.11.2012 10:28, schrieb Eike Mueller:
>> Hi Oliver,
>>
>> thanks a lot, that would be great. My desktop is a 32bit machine, 
>> where sizeof(long) = sizeof(int) = 4, so I'm not sure if recompiling 
>> everything with GID=long there will make a difference.
>>
>> Eike
>>
>> Oliver Sander wrote:
>>> Thanks for the backtrace.  I'll try and see whether I can reproduce 
>>> the crash
>>> on my machine. If that's not possible things will be a bit difficult :-)
>>> -- 
>>> Oliver
>>>
>>> Am 26.11.2012 18:39, schrieb Eike Mueller:
>>>> Hi Markus and Oliver,
>>>>
>>>> to get to the bottom of this I recompiled everything (UG+Dune+my 
>>>> code) with -O0 -g, and that way I was able to get some more 
>>>> information out of the core dump. On 1 processor it runs fine now, 
>>>> but when running on 6, this is what I get, looks like it crashes in 
>>>> loadBalance(), but I can't make sense of what's happening inside UG. 
>>>> It always seems to crash inside ifcreate.c, either in line 482 or 489:
>>>>
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>> (tmpcpl=0x1462ad0,
>>>>     ifId=1) at if/ifcreate.c:489
>>>> 489                     ifHead->nItems++;
>>>> (gdb) backtrace
>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>> (tmpcpl=0x1462ad0,
>>>>     ifId=1) at if/ifcreate.c:489
>>>> #1  0x0000000000a44dae in UG::D3::IFRebuildAll () at if/ifcreate.c:1059
>>>> #2  0x0000000000a44e71 in UG::D3::IFAllFromScratch () at 
>>>> if/ifcreate.c:1097
>>>> #3  0x0000000000a4bd89 in UG::D3::DDD_XferEnd () at xfer/cmds.c:869
>>>> #4  0x0000000000a65b5c in UG::D3::TransferGridFromLevel 
>>>> (theMG=0x1441880,
>>>>     level=0) at trans.c:835
>>>> #5  0x0000000000a5df4b in UG::D3::lbs (argv=0x7fffffffa390 "0",
>>>>     theMG=0x1441880) at lb.c:659
>>>> #6  0x0000000000a0bd83 in UG::D3::LBCommand (argc=4, 
>>>> argv=0x7fffffffab90)
>>>>     at commands.c:10658
>>>> #7  0x00000000004a1e2d in Dune::UG_NS<3>::LBCommand (argc=4,
>>>>     argv=0x7fffffffab90) at ../../../dune/grid/uggrid/ugwrapper.hh:979
>>>> #8  0x00000000004a8df9 in Dune::UGGrid<3>::loadBalance (this=0x134e490,
>>>>     strategy=0, minlevel=0, depth=2, maxLevel=32, minelement=1)
>>>>     at uggrid.cc:556
>>>> #9  0x00000000004077bf in Dune::UGGrid<3>::loadBalance (this=0x134e490)
>>>>     at 
>>>> /home/n02/n02/eike/work/Library/Dune2.2/include/dune/grid/uggrid.hh:738
>>>> #10 0x0000000000400928 in main (argc=3, argv=0x7fffffffb5a8)
>>>>     at testsphericalgridgenerator.cc:65
>>>>
>>>> but I sometimes also get:
>>>>
>>>> Core was generated by `./testsphericalgridgenerator 
>>>> sphericalshell_cube_6.dgf 4'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>> (tmpcpl=0x1462ad0,
>>>>     ifId=1) at if/ifcreate.c:482
>>>> 482                             ifAttr->nAB    = ifAttr->nBA   = 
>>>> ifAttr->nABA   = 0;
>>>> (gdb) backtrace
>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>> (tmpcpl=0x1462ad0,
>>>>     ifId=1) at if/ifcreate.c:482
>>>> #1  0x0000000000a44dae in UG::D3::IFRebuildAll () at if/ifcreate.c:1049
>>>> #2  0x0000000000a44e71 in UG::D3::IFRebuildAll () at if/ifcreate.c:1057
>>>> #3  0x0000000000a4bd89 in UG::D3::DDD_XferEnd () at xfer/cmds.c:850
>>>> #4  0x0000000000a65b5c in UG::D3::TransferGridFromLevel 
>>>> (theMG=0x1441880,
>>>>     level=0) at trans.c:824
>>>> #5  0x0000000000a5df4b in UG::D3::lbs (argv=0x7fffffffa390 "0",
>>>>     theMG=0x1441880) at lb.c:644
>>>> #6  0x0000000000a0bd83 in UG::D3::LBCommand (argc=4, 
>>>> argv=0x7fffffffab90)
>>>>     at commands.c:10644
>>>> #7  0x00000000004a1e2d in Dune::UG_NS<3>::LBCommand (argc=0,
>>>>     argv=0x7fffffffa420) at ../../../dune/grid/uggrid/ugwrapper.hh:977
>>>> #8  0x00000000004a8df9 in Dune::UGGrid<3>::loadBalance (this=0x134e490,
>>>>     strategy=0, minlevel=0, depth=2, maxLevel=32, minelement=1)
>>>>     at uggrid.cc:554
>>>> #9  0x00000000004077bf in 
>>>> std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count 
>>>> (this=0x134e490, __in_chrg=<optimized out>)
>>>>     at /opt/gcc/4.6.3/snos/include/g++/bits/shared_ptr_base.h:550
>>>> #10 0x0000000000400928 in main (argc=0, argv=0x520)
>>>>     at testsphericalgridgenerator.cc:66
>>>>
>>>> I've tried different load balancing strategies, but for all I get a 
>>>> segfault.
>>>>
>>>> Cheers,
>>>>
>>>> Eike
>>>>
>>>>
>>>> Markus Blatt wrote:
>>>>> On Mon, Nov 26, 2012 at 01:57:15PM +0000, Eike Mueller wrote:
>>>>>> thanks a lot for the patch, unfortunately I still get a segfault 
>>>>>> when I run on HECToR.
>>>>>>
>>>>>
>>>>> I feared that, but it was still worth a shot. The change probably
>>>>> interferes with the memory allocation in ddd.
>>>>>
>>>>> Markus
>>>>
>>>>
>>>
>>
>>
> 


-- 
Dr Eike Mueller
Research Officer

Department of Mathematical Sciences
University of Bath
Bath BA2 7AY, United Kingdom

+44 1225 38 5633
e.mueller at bath.ac.uk
http://people.bath.ac.uk/em459/



