[Dune] Hades Parallel crash with parmetis

Jö Fahlke jorrit at jorrit.de
Thu Jul 1 12:06:33 CEST 2010


On Wed, 30 Jun 2010, 13:21:24 +0200, Aleksejs Fomins wrote:
> I have compiled ALUGrid-1.22 with ParMetis-3.1.1, then compiled Dune
> and Hades.
> When I tried to launch a parallel job it crashed with anything more
> than 1 core.

Do you really mean cores, or do you mean processes?

> Then I recompiled ALUGrid with Metis-4.0, recompiled Dune and Hades and
> parallel computations were working for 8 nodes.
> 
> What could be the reason?

I have no idea what the reason might be, and I suspect neither do any of the
other developers.  Here are some things you can try to narrow the problem down
a bit:

 * Don't give us logs with eight processes if the failure already happens
   with two processes.  The logs with two processes should be simpler.

 * What compiler flags were used for compilation?  I suspect that
   optimization was in effect, since the assertion failed in a function which
   does not appear in the stack trace.  If so, please try without optimization
   (CXXFLAGS="-g -O0").

 * You might want to try a different MPI implementation; it might provide more
   helpful error messages.
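
To illustrate the second point, here is a minimal sketch of how an assertion
can fail in a function that is missing from the stack trace.  The names
checked_index and owner_of_cell are hypothetical and are not taken from
ALUGrid or ParMetis; the sketch only assumes the usual behaviour that an
optimizing compiler may inline small helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ALUGrid or ParMetis.
// With optimization enabled the compiler is free to inline it into its
// caller, so its frame may be missing from a backtrace even though the
// assertion message still names this function.
inline std::size_t checked_index(std::size_t i, std::size_t size)
{
  assert(i < size && "index out of range");
  return i;
}

// Hypothetical caller: looks up which process owns a given cell.
int owner_of_cell(const std::vector<int>& owners, std::size_t cell)
{
  return owners[checked_index(cell, owners.size())];
}
```

Built with CXXFLAGS="-g -O0", checked_index typically stays a real call, so a
failing assertion should show up with its own frame directly below
owner_of_cell in the debugger.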

> The error is at the end of log.txt file
> Also, when I run the job for the first time, I received this error2.txt

Is this with parmetis-3.1.1 or with metis-4.0?

Bye,
Jö.

-- 
This is the first age that's paid much attention to the future, which
is a little ironic since we may not have one.
-- Arthur C Clarke