[Dune-devel] CI failure on dune-grid master debian 11 gcc-9-20

Oliver Sander oliver.sander at tu-dresden.de
Mon Sep 28 08:55:58 CEST 2020


I just noticed the following in

 https://gitlab.dune-project.org/kilian.weishaupt/dune-grid/-/jobs/175885


[1601275211.221338] [runner-d307b235-project-936-concurrent-0:11175:0]       mm_posix.c:162  UCX  ERROR Not enough memory to write total of 4292720 bytes. Please check that /dev/shm or the directory you specified has more available memory.
[1601275211.221802] [runner-d307b235-project-936-concurrent-0:11175:0]        uct_mem.c:132  UCX  ERROR failed to allocate 4292720 bytes using md posix for mm_recv_desc: Out of memory
[1601275211.222012] [runner-d307b235-project-936-concurrent-0:11175:0]          mpool.c:191  UCX  ERROR Failed to allocate memory pool (name=mm_recv_desc) chunk: Out of memory
[1601275211.222397] [runner-d307b235-project-936-concurrent-0:11175:0]       mm_iface.c:644  UCX  ERROR failed to get the first receive descriptor
[runner-d307b235-project-936-concurrent-0:11175] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:291  Error: Failed to create UCP worker
[runner-d307b235-project-936-concurrent-0:11175] [[28993,1],1] selected pml ob1, but peer [[28993,1],0] on runner-d307b235-project-936-concurrent-0 selected pml ucx
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.
You may wish to try to narrow down the problem;
 * Check the output of ompi_info to see which BTL/MTL plugins are
   available.
 * Run your application with MPI_THREAD_SINGLE
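
The first UCX error suggests checking the free space in /dev/shm. A minimal
sketch of such a check, assuming it is run inside the CI container: the
function name shm_has_room is hypothetical, and the byte count is simply the
figure from the log above, not a known general requirement of UCX.

```python
import shutil

# Number of bytes UCX reported it could not write (taken from the log above).
REQUIRED_BYTES = 4292720


def shm_has_room(path="/dev/shm", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required` bytes free.

    `path` defaults to /dev/shm because that is the directory the UCX
    message names; pass another directory to check a different location.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free >= required
```

If shm_has_room() returns False on the runner, that would be consistent with
the "Not enough memory to write" message, but it would not by itself prove
that this is the only cause of the MPI_INIT failure.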


@CI_gurus: can you please have a look?

Thanks,
Oliver


On 27.09.20 09:10, Oliver Sander wrote:
> Dear Dune,
> 
> the CI system for the master branch seems to fail with the debian 11 gcc-9-20 image:
> 
>   https://gitlab.dune-project.org/core/dune-grid/-/pipelines/29754
> 
> It reports a run-time failure in some YaspGrid-related tests.  Any idea
> about the possible causes of this?
> 
> Best regards,
> Oliver
> 
> 
> _______________________________________________
> Dune-devel mailing list
> Dune-devel at lists.dune-project.org
> https://lists.dune-project.org/mailman/listinfo/dune-devel
> 
