[Dune] Hades Parallel crash with parmetis
Jö Fahlke
jorrit at jorrit.de
Thu Jul 1 12:06:33 CEST 2010
On Wed, 30 Jun 2010, 13:21:24 +0200, Aleksejs Fomins wrote:
> I have compiled ALUGrid-1.22 with ParMetis-3.1.1, then compiled Dune
> and Hades.
> When I tried to launch a parallel job it crashed with anything more
> than 1 core.
Do you really mean core, or do you rather mean process?
> Then I recompiled ALUGrid with Metis-4.0, recompiled Dune and Hades and
> parallel computations were working for 8 nodes.
>
> What could be the reason?
I have no idea what the reason might be, and I suspect neither do any of the
other developers. Here are some things you can try to narrow the problem down
a bit:
* Don't give us logs with eight processes if the failure already happens
with two processes. The logs with two processes should be simpler.
* Which compiler flags were used for compilation? I suspect that
optimization was in effect, since the assertion failed in a function which
does not appear in the stack trace. If so, please try without optimization
(CXXFLAGS="-g -O0").
* You might want to try a different MPI implementation; it might provide more
helpful error messages.
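
As a rough sketch of the first two suggestions, assuming the modules are built
with dunecontrol and an options file: the file name debug.opts, the path to
dunecontrol, the executable name ./hades and the log file name are all
placeholders, and the launcher may be called mpiexec rather than mpirun on
your system.

```shell
#!/bin/sh
# Sketch only: every path and file name below is an assumption, adjust as needed.

# Options file that disables optimization and keeps debug symbols.
cat > debug.opts <<'EOF'
CONFIGURE_FLAGS="CXXFLAGS='-g -O0'"
EOF

# Hypothetical locations of dunecontrol and of the parallel executable.
DUNECONTROL=./dune-common/bin/dunecontrol
PROGRAM=./hades

# Rebuild all modules with the debug flags, if dunecontrol is where we assume.
if [ -x "$DUNECONTROL" ]; then
    "$DUNECONTROL" --opts=debug.opts all
fi

# Reproduce the failure with only two processes and keep the output in a log.
if command -v mpirun >/dev/null 2>&1 && [ -x "$PROGRAM" ]; then
    mpirun -np 2 "$PROGRAM" > log-np2.txt 2>&1
fi
```

Any other configure options you normally pass (for example the ones pointing
at ALUGrid and ParMetis) would have to go into CONFIGURE_FLAGS as well.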
> The error is at the end of log.txt file
> Also, when I run the job for the first time, I received this error2.txt
Is this with parmetis-3.1.1 or with metis-4.0?
Bye,
Jö.
--
This is the first age that's paid much attention to the future, which
is a little ironic since we may not have one.
-- Arthur C Clarke