[Dune] UG grid problem: ID overflow DDD_HdrConstructor

Oliver Sander sander at igpm.rwth-aachen.de
Fri Nov 30 11:53:01 CET 2012


Okay, with maxprocbits = 16 I do see a crash.  2 processors and 1 level 
is enough.
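
For readers following the thread, here is a rough sketch of why the number of processor bits and the width of the ID type interact. It assumes, as the option name with_ddd_maxprocbits suggests, that a fixed number of bits of every global ID is reserved for the processor rank and the remaining bits count objects created on that processor. The actual layout inside DDD has not been checked here, and the function names maxProcs and maxLocalIds are hypothetical, not part of UG or DDD.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION (not verified against the DDD sources): a global ID of
// gidBits bits reserves procBits bits for the processor rank and uses
// the remaining bits as a per-processor object counter.

// Number of distinct processor ranks representable with procBits bits.
constexpr std::uint64_t maxProcs(unsigned procBits)
{
    return std::uint64_t{1} << procBits;
}

// Number of distinct per-processor IDs left over in a gidBits-wide ID.
// Requires procBits < gidBits and gidBits - procBits < 64.
constexpr std::uint64_t maxLocalIds(unsigned gidBits, unsigned procBits)
{
    return std::uint64_t{1} << (gidBits - procBits);
}

// The two processor counts quoted in the thread.
static_assert(maxProcs(9) == 512, "default: 2^9 processors");
static_assert(maxProcs(16) == 65536, "requested: 2^16 processors");

// Under the assumed layout, a 32-bit ID with 16 processor bits leaves
// only 2^16 IDs per processor; a 64-bit ID leaves 2^48.
static_assert(maxLocalIds(32, 16) == 65536, "32-bit ID, 16 proc bits");
static_assert(maxLocalIds(64, 16) == (std::uint64_t{1} << 48), "64-bit ID");
```

Under this assumption, raising the processor bits without also widening the ID type shrinks the per-processor ID range quickly, which would be consistent with an "ID overflow" message; whether that is related to the segfault discussed here is not established.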

I'm in a bit of a hurry right now, but I'll have a look at it later.

best,
Oliver

On 29.11.2012 19:36, Eike Mueller wrote:
> Hi Oliver,
>
> thank you very much for trying this out and for the patch. I have now
> rebuilt UG with the latest patch file you sent me this morning, and I
> still get the segfault. However, it only occurs if I specify the
> with_ddd_maxprocbits flag. If I do not set it (i.e. only use
> DDD_GID=long, as you do), it runs fine. (I have only done a small
> run so far and haven't tried 7 refinement steps, so I cannot say
> anything about the other error you get.)
> I tried both with_ddd_maxprocbits=20 and 16, but it does not work in
> either case. The default is 2^9=512, and unfortunately that's
> not enough for me; I would need at least 2^16=65536.
> As for the error message you get when reading the .vtu files, could
> that be because they are written out in Dune::VTK::appendedbase64
> format? I also get an error message when I open them in paraview:
>
> ERROR: In 
> /home/kitware/ParaView3/Utilities/BuildScripts/ParaView-3.6/ParaView3/VTK/IO/vtkXMLUnstructuredDataReader.cxx, 
> line 522
> vtkXMLUnstructuredGridReader (0xa16c918): Cannot read points array 
> from Points in piece 0.  The data array in the element may be too short.
>
> When I opened .vtu files produced by my main solver code on HECToR,
> paraview actually crashed, and this was before I modified any of the
> UG settings. I could fix this by writing data out in Dune::VTK::ascii
> format instead. Could this be a big/little endian issue? HECToR is
> 64-bit, but my local desktop, where I run paraview to look at the
> output, is 32-bit. I am not sure whether that has any impact.
>
> Eike
>
> Oliver Sander wrote:
>> Hi Eike,
>> I tried your example with DDD_GID==long, on my laptop where 
>> sizeof(long)==8 and sizeof(uint)==4.
>> I started the program with
>>
>> mpirun -np 6 ./testsphericalgridgenerator sphericalshell_cube_6.dgf 6
>>
>> Besides a few DDD warnings that I have never seen before, it works 
>> like a charm.
>>
>> What version of UG are you using?  I'll send you the very latest 
>> patch file just
>> to be sure.
>>
>> The program runs, but paraview gives me an error message when trying
>> to open
>> the output file.  Does that happen to you, too?
>>
>> I didn't try the with_ddd_maxprocbits setting.  Does your program 
>> crash if you
>> do _not_ set this?
>>
>> In the worst case: is it possible to get a temporary account on your
>> HECToR machine?
>>
>> best,
>> Oliver
>>
>> On 28.11.2012 10:28, Eike Mueller wrote:
>>> Hi Oliver,
>>>
>>> thanks a lot, that would be great. My desktop is a 32-bit machine,
>>> where sizeof(long) = sizeof(int) = 4, so I'm not sure whether recompiling
>>> everything with GID=long there will make a difference.
>>>
>>> Eike
>>>
>>> Oliver Sander wrote:
>>>> Thanks for the backtrace.  I'll try and see whether I can reproduce 
>>>> the crash
>>>> on my machine. If that's not possible things will be a bit 
>>>> difficult :-)
>>>> -- 
>>>> Oliver
>>>>
>>>> On 26.11.2012 18:39, Eike Mueller wrote:
>>>>> Hi Markus and Oliver,
>>>>>
>>>>> to get to the bottom of this I recompiled everything (UG+Dune+my
>>>>> code) with -O0 -g, and that way I was able to get some more
>>>>> information out of the core dump. On 1 processor it runs fine now,
>>>>> but on 6 processors I get the output below. It looks like it crashes
>>>>> in loadBalance(), but I can't make sense of what's happening
>>>>> inside UG. It always seems to crash inside ifcreate.c, in either
>>>>> line 482 or line 489:
>>>>>
>>>>> Program terminated with signal 11, Segmentation fault.
>>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>>> (tmpcpl=0x1462ad0,
>>>>>     ifId=1) at if/ifcreate.c:489
>>>>> 489                     ifHead->nItems++;
>>>>> (gdb) backtrace
>>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>>> (tmpcpl=0x1462ad0,
>>>>>     ifId=1) at if/ifcreate.c:489
>>>>> #1  0x0000000000a44dae in UG::D3::IFRebuildAll () at 
>>>>> if/ifcreate.c:1059
>>>>> #2  0x0000000000a44e71 in UG::D3::IFAllFromScratch () at 
>>>>> if/ifcreate.c:1097
>>>>> #3  0x0000000000a4bd89 in UG::D3::DDD_XferEnd () at xfer/cmds.c:869
>>>>> #4  0x0000000000a65b5c in UG::D3::TransferGridFromLevel 
>>>>> (theMG=0x1441880,
>>>>>     level=0) at trans.c:835
>>>>> #5  0x0000000000a5df4b in UG::D3::lbs (argv=0x7fffffffa390 "0",
>>>>>     theMG=0x1441880) at lb.c:659
>>>>> #6  0x0000000000a0bd83 in UG::D3::LBCommand (argc=4, 
>>>>> argv=0x7fffffffab90)
>>>>>     at commands.c:10658
>>>>> #7  0x00000000004a1e2d in Dune::UG_NS<3>::LBCommand (argc=4,
>>>>>     argv=0x7fffffffab90) at 
>>>>> ../../../dune/grid/uggrid/ugwrapper.hh:979
>>>>> #8  0x00000000004a8df9 in Dune::UGGrid<3>::loadBalance 
>>>>> (this=0x134e490,
>>>>>     strategy=0, minlevel=0, depth=2, maxLevel=32, minelement=1)
>>>>>     at uggrid.cc:556
>>>>> #9  0x00000000004077bf in Dune::UGGrid<3>::loadBalance 
>>>>> (this=0x134e490)
>>>>>     at 
>>>>> /home/n02/n02/eike/work/Library/Dune2.2/include/dune/grid/uggrid.hh:738 
>>>>>
>>>>> #10 0x0000000000400928 in main (argc=3, argv=0x7fffffffb5a8)
>>>>>     at testsphericalgridgenerator.cc:65
>>>>>
>>>>> but I sometimes also get:
>>>>>
>>>>> Core was generated by `./testsphericalgridgenerator 
>>>>> sphericalshell_cube_6.dgf 4'.
>>>>> Program terminated with signal 11, Segmentation fault.
>>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>>> (tmpcpl=0x1462ad0,
>>>>>     ifId=1) at if/ifcreate.c:482
>>>>> 482                             ifAttr->nAB    = ifAttr->nBA   = 
>>>>> ifAttr->nABA   = 0;
>>>>> (gdb) backtrace
>>>>> #0  0x0000000000a438b0 in UG::D3::IFCreateFromScratch 
>>>>> (tmpcpl=0x1462ad0,
>>>>>     ifId=1) at if/ifcreate.c:482
>>>>> #1  0x0000000000a44dae in UG::D3::IFRebuildAll () at 
>>>>> if/ifcreate.c:1049
>>>>> #2  0x0000000000a44e71 in UG::D3::IFRebuildAll () at 
>>>>> if/ifcreate.c:1057
>>>>> #3  0x0000000000a4bd89 in UG::D3::DDD_XferEnd () at xfer/cmds.c:850
>>>>> #4  0x0000000000a65b5c in UG::D3::TransferGridFromLevel 
>>>>> (theMG=0x1441880,
>>>>>     level=0) at trans.c:824
>>>>> #5  0x0000000000a5df4b in UG::D3::lbs (argv=0x7fffffffa390 "0",
>>>>>     theMG=0x1441880) at lb.c:644
>>>>> #6  0x0000000000a0bd83 in UG::D3::LBCommand (argc=4, 
>>>>> argv=0x7fffffffab90)
>>>>>     at commands.c:10644
>>>>> #7  0x00000000004a1e2d in Dune::UG_NS<3>::LBCommand (argc=0,
>>>>>     argv=0x7fffffffa420) at 
>>>>> ../../../dune/grid/uggrid/ugwrapper.hh:977
>>>>> #8  0x00000000004a8df9 in Dune::UGGrid<3>::loadBalance 
>>>>> (this=0x134e490,
>>>>>     strategy=0, minlevel=0, depth=2, maxLevel=32, minelement=1)
>>>>>     at uggrid.cc:554
>>>>> #9  0x00000000004077bf in 
>>>>> std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count 
>>>>> (this=0x134e490, __in_chrg=<optimized out>)
>>>>>     at /opt/gcc/4.6.3/snos/include/g++/bits/shared_ptr_base.h:550
>>>>> #10 0x0000000000400928 in main (argc=0, argv=0x520)
>>>>>     at testsphericalgridgenerator.cc:66
>>>>>
>>>>> I've tried different load balancing strategies, but I get a
>>>>> segfault with all of them.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Eike
>>>>>
>>>>>
>>>>> Markus Blatt wrote:
>>>>>> On Mon, Nov 26, 2012 at 01:57:15PM +0000, Eike Mueller wrote:
>>>>>>> thanks a lot for the patch, unfortunately I still get a segfault 
>>>>>>> when I run on HECToR.
>>>>>>>
>>>>>>
>>>>>> I feared that, but it was still worth a shot. The change probably
>>>>>> interferes with the memory allocation in ddd.
>>>>>>
>>>>>> Markus
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>