[Dune] Segmentation Fault in loadBalance()
Marco Cisternino
marco.cisternino at optimad.it
Wed Oct 3 12:58:44 CEST 2012
Good morning,
I'm seeing strange behaviour when calling the loadBalance() method on an
ALUCubeGrid<3,3>.
I build my coarse grid by reading it from a file on rank 0, and then I call
loadBalance() a first time to distribute the grid among the other processors.
This first call to loadBalance() causes no problem.
Then I refine the grid locally: I mark the cells to be refined, call
preAdapt(), map my data to a persistent container, call adapt(),
map the data back from the persistent container and then call postAdapt().
At the end of the local refinement procedure I call loadBalance() again.
If the refined grid is still balanced (global refinement, or local
refinement that does not unbalance the grid), loadBalance() works fine.
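The refinement and rebalancing sequence described above can be sketched as follows. This is only an illustration of the call order, not the actual application code: RecordingGrid and RecordingData are hypothetical stand-ins that log each call, and the two data-mapping function names are invented for this sketch. Only preAdapt(), adapt(), postAdapt() and loadBalance() correspond to methods named in the text.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a parallel grid. It only records the order of
// calls; the method names mirror those mentioned in the text.
struct RecordingGrid {
    std::vector<std::string> calls;
    bool preAdapt()    { calls.push_back("preAdapt");    return false; }
    bool adapt()       { calls.push_back("adapt");       return true;  }
    void postAdapt()   { calls.push_back("postAdapt"); }
    bool loadBalance() { calls.push_back("loadBalance"); return true;  }
};

// Hypothetical user-data holder for the two mapping steps (names invented).
struct RecordingData {
    std::vector<std::string>* log;
    void saveToPersistentContainer()      { log->push_back("save"); }
    void restoreFromPersistentContainer() { log->push_back("restore"); }
};

// One adaptation cycle followed by a rebalancing step.
// Assumes the cells to refine have already been marked.
template <class Grid, class Data>
void adaptAndBalance(Grid& grid, Data& data) {
    grid.preAdapt();
    data.saveToPersistentContainer();       // keep data across the grid change
    grid.adapt();
    data.restoreFromPersistentContainer();  // map data back onto the new grid
    grid.postAdapt();
    grid.loadBalance();                     // the call that crashes in this report
}
```

With a real grid the same template could be instantiated with the grid type and a data class providing the two mapping functions; the stand-ins are only there so the sketch compiles on its own.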
Let me sketch a four-element example (each element shows the rank of the
processor owning it):
 ------ ------
|  0   |  1   |
 ------ ------                    Coarse grid
|  0   |  1   |
 ------ ------

 ------ ------ ------ ------
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------      Globally Refined
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------

 ------ ------ ------ ------
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------
|  0   |  0   |  1   |  1   |
 ------ ------ ------ ------      Locally Refined (balanced grid)
|             |             |
|      0      |      1      |
|             |             |
 ------ ------ ------ ------
But if I refine the grid so that it becomes unbalanced,
 ------ ------ ------ ------
|             |  1   |  1   |
|      0       ------ ------
|             |  1   |  1   |
 ------ ------ ------ ------      Locally Refined (unbalanced grid)
|             |  1   |  1   |
|      0       ------ ------
|             |  1   |  1   |
 ------ ------ ------ ------
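To make "balanced" and "unbalanced" concrete, here is a small sketch that computes the ratio of the largest per-rank element count to the mean count. The counts are read off the 2D sketches above (the real grid is 3D, so the actual numbers differ), and the imbalance() helper is a hypothetical name; it is not claimed to be the criterion ALUGrid uses internally.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Ratio of the largest per-rank leaf-element count to the mean count.
// 1.0 means perfectly balanced; larger values mean more imbalance.
double imbalance(const std::vector<int>& elementsPerRank) {
    if (elementsPerRank.empty()) return 1.0;
    const int maxLoad =
        *std::max_element(elementsPerRank.begin(), elementsPerRank.end());
    const double mean =
        std::accumulate(elementsPerRank.begin(), elementsPerRank.end(), 0.0) /
        static_cast<double>(elementsPerRank.size());
    return mean > 0.0 ? maxLoad / mean : 1.0;
}

// Leaf elements per rank {rank 0, rank 1}, counted from the sketches:
//   coarse grid:                  {2, 2}
//   globally refined:             {8, 8}
//   locally refined (balanced):   {5, 5}
//   locally refined (unbalanced): {2, 8}
```

For the first three cases the ratio is 1.0; for the unbalanced case rank 1 holds 8 of the 10 leaf elements, so the ratio is 8 / 5 = 1.6.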
loadBalance() crashes with a segmentation fault. The exact output is:
std::bad_alloc'
what(): std::bad_alloc
[marco-laptop:27837] *** Process received signal ***
[marco-laptop:27837] *** Process received signal ***
[marco-laptop:27837] Signal: Segmentation fault (11)
[marco-laptop:27837] Signal code: Address not mapped (1)
[marco-laptop:27837] Failing at address: 0x3b
[marco-laptop:27837] [ 0] [0xb77c9410]
[marco-laptop:27837] [ 1] [0xb77c9400]
[marco-laptop:27837] [ 2] /lib/tls/i686/cmov/libc.so.6(abort+0x182)
[0xb7443a82]
[marco-laptop:27837] [ 3]
/usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x14f)
[0xb768d52f]
[marco-laptop:27837] [ 4] /usr/lib/libstdc++.so.6(+0xbd465) [0xb768b465]
[marco-laptop:27837] [ 5] /usr/lib/libstdc++.so.6(+0xbd4a2) [0xb768b4a2]
[marco-laptop:27837] [ 6] /usr/lib/libstdc++.so.6(+0xbd5e1) [0xb768b5e1]
[marco-laptop:27837] [ 7] /usr/lib/libstdc++.so.6(_Znwj+0x7f) [0xb768bc5f]
[marco-laptop:27837] [ 8] /usr/lib/libstdc++.so.6(_Znaj+0x1d) [0xb768bd3d]
[marco-laptop:27837] [ 9]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodERSt6vectorIiSaIiEEi+0x537)
[0x864702d]
[marco-laptop:27837] [10]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodE+0x49)
[0x8646a11]
[marco-laptop:27837] [11]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll20repartitionMacroGridERNS_12LoadBalancer8DataBaseE+0x36)
[0x8648f58]
[marco-laptop:27837] [12]
./dune_foo(_ZN12ALUGridSpace9GitterPll29loadBalancerGridChangesNotifyEv+0x397)
[0x8634e27]
[marco-laptop:27837] [13]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll15duneLoadBalanceEv+0x18)
[0x864ab0a]
[marco-laptop:27837] [14]
./dune_foo(_ZN4Dune19ALU3dGridCommHelperILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceERNS_9ALU3dGridILS1_7ES3_EE+0x3d)
[0x84f7c48]
[marco-laptop:27837] [15]
./dune_foo(_ZN4Dune9ALU3dGridILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceEv+0x11)
[0x84f000d]
[marco-laptop:27837] [16] ./dune_foo(main+0x957) [0x84dd584]
[marco-laptop:27837] [17]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb742cbd6]
[marco-laptop:27837] [18] ./dune_foo() [0x84dc991]
[marco-laptop:27837] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 27837 on node marco-laptop
exited on signal 11 (Segmentation fault).
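The mangled C++ names in the backtrace above are easier to read once demangled. A minimal sketch, assuming a GCC- or Clang-compatible toolchain that provides <cxxabi.h>; the demangle() wrapper is a hypothetical helper name, while abi::__cxa_demangle is the standard Itanium C++ ABI entry point.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so it has to be released with free.
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) return mangled;
    std::string result(raw);
    std::free(raw);
    return result;
}
```

For example, demangle("_Znaj") gives "operator new[](unsigned int)", and demangle("_ZN12ALUGridSpace13GitterDunePll15duneLoadBalanceEv") gives "ALUGridSpace::GitterDunePll::duneLoadBalance()". Read this way, frames 8 and 9 of the first trace show the std::bad_alloc being thrown from an array allocation inside LoadBalancer::DataBase::repartition.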
If I ignore my data and skip the mapping to/from the persistent
container, the error is different:
[marco-laptop:28539] *** Process received signal ***
[marco-laptop:28539] Signal: Segmentation fault (11)
[marco-laptop:28539] Signal code: Address not mapped (1)
[marco-laptop:28539] Failing at address: 0xe64e57e4
[marco-laptop:28538] *** Process received signal ***
[marco-laptop:28538] Signal: Segmentation fault (11)
[marco-laptop:28538] Signal code: Address not mapped (1)
[marco-laptop:28538] Failing at address: 0x9b7b000
[marco-laptop:28539] [ 0] [0xb77df410]
[marco-laptop:28539] [ 1]
./dune_foo(libparmetis__Adaptive_Partition+0x46) [0x8721246]
[marco-laptop:28539] [ 2] ./dune_foo(ParMETIS_V3_AdaptiveRepart+0x273)
[0x8721993]
[marco-laptop:28539] [ 3]
./dune_foo(_ZN15ALUGridParMETIS31CALL_ParMETIS_V3_AdaptiveRepartEPiS0_S0_S0_S0_S0_S0_S0_S0_S0_PfS1_S1_S0_S0_S0_PP19ompi_communicator_t+0x81)
[0x8658d0f]
[marco-laptop:28539] [ 4]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodERSt6vectorIiSaIiEEi+0xaf9)
[0x8642e6f]
[marco-laptop:28539] [ 5]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodE+0x49)
[0x8642291]
[marco-laptop:28539] [ 6]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll20repartitionMacroGridERNS_12LoadBalancer8DataBaseE+0x36)
[0x86447d8]
[marco-laptop:28539] [ 7]
./dune_foo(_ZN12ALUGridSpace9GitterPll29loadBalancerGridChangesNotifyEv+0x397)
[0x86306a7]
[marco-laptop:28539] [ 8]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll15duneLoadBalanceEv+0x18)
[0x864638a]
[marco-laptop:28539] [ 9]
./dune_foo(_ZN4Dune19ALU3dGridCommHelperILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceERNS_9ALU3dGridILS1_7ES3_EE+0x3d)
[0x84f496d]
[marco-laptop:28539] [10]
./dune_foo(_ZN4Dune9ALU3dGridILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceEv+0x11)
[0x84ed1f1]
[marco-laptop:28539] [11] ./dune_foo(main+0x957) [0x84dafa4]
[marco-laptop:28539] [12]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb7442bd6]
[marco-laptop:28539] [13] ./dune_foo() [0x84da3b1]
[marco-laptop:28539] *** End of error message ***
[marco-laptop:28538] [ 0] [0xb7715410]
[marco-laptop:28538] [ 1]
./dune_foo(libparmetis__Adaptive_Partition+0x46) [0x8721246]
[marco-laptop:28538] [ 2] ./dune_foo(ParMETIS_V3_AdaptiveRepart+0x273)
[0x8721993]
[marco-laptop:28538] [ 3]
./dune_foo(_ZN15ALUGridParMETIS31CALL_ParMETIS_V3_AdaptiveRepartEPiS0_S0_S0_S0_S0_S0_S0_S0_S0_PfS1_S1_S0_S0_S0_PP19ompi_communicator_t+0x81)
[0x8658d0f]
[marco-laptop:28538] [ 4]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodERSt6vectorIiSaIiEEi+0xaf9)
[0x8642e6f]
[marco-laptop:28538] [ 5]
./dune_foo(_ZN12ALUGridSpace12LoadBalancer8DataBase11repartitionERNS_14MpAccessGlobalENS1_6methodE+0x49)
[0x8642291]
[marco-laptop:28538] [ 6]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll20repartitionMacroGridERNS_12LoadBalancer8DataBaseE+0x36)
[0x86447d8]
[marco-laptop:28538] [ 7]
./dune_foo(_ZN12ALUGridSpace9GitterPll29loadBalancerGridChangesNotifyEv+0x397)
[0x86306a7]
[marco-laptop:28538] [ 8]
./dune_foo(_ZN12ALUGridSpace13GitterDunePll15duneLoadBalanceEv+0x18)
[0x864638a]
[marco-laptop:28538] [ 9]
./dune_foo(_ZN4Dune19ALU3dGridCommHelperILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceERNS_9ALU3dGridILS1_7ES3_EE+0x3d)
[0x84f496d]
[marco-laptop:28538] [10]
./dune_foo(_ZN4Dune9ALU3dGridILNS_20ALU3dGridElementTypeE7EP19ompi_communicator_tE11loadBalanceEv+0x11)
[0x84ed1f1]
[marco-laptop:28538] [11] ./dune_foo(main+0x957) [0x84dafa4]
[marco-laptop:28538] [12]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb7378bd6]
[marco-laptop:28538] [13] ./dune_foo() [0x84da3b1]
[marco-laptop:28538] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 28539 on node marco-laptop
exited on signal 11 (Segmentation fault).
My parameters in alugrid.cfg are 0,1.2,14.
Could anyone help me understand what is happening, please? Honestly,
I have no idea.
Thanks a lot for any hint!
Best regards,
Marco