[Dune] Contribution to CollectiveCommunication

Mon Jan 26 14:55:09 CET 2015

Dear Dune,

I followed Markus's advice and did some research on the matter.
Indeed, in most cases what unstructured grids need is nearest-neighbour
communication, and for that purpose MPI_Alltoallv is significant
overkill, simply because it uses arrays whose size equals the number of
processes, which is unnecessary.
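
To make the memory argument concrete, here is a minimal sketch in plain
C++ (no MPI calls) of what a dense all-to-all interface forces on a
process that has only a few neighbours. The helper name denseSendCounts
and its parameters are hypothetical and are not taken from the attached
files. MPI_Alltoallv takes four such integer arrays (send counts, send
displacements, receive counts, receive displacements), each with one
entry per process in the communicator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: expand a sparse neighbour description into the
// dense per-rank count array that an all-to-all call expects.
// The result always has commSize entries, however few neighbours exist.
std::vector<int> denseSendCounts(int commSize,
                                 const std::vector<int>& neighbourRanks,
                                 const std::vector<int>& blockSizes)
{
    assert(neighbourRanks.size() == blockSizes.size());
    std::vector<int> counts(commSize, 0);  // O(P) memory on every process
    for (std::size_t i = 0; i < neighbourRanks.size(); ++i) {
        assert(neighbourRanks[i] >= 0 && neighbourRanks[i] < commSize);
        counts[neighbourRanks[i]] = blockSizes[i];
    }
    return counts;
}
```

With 8 processes and neighbours {1, 6}, the returned array has 8 entries,
6 of which are zero. The sparse description itself needs only 2 entries,
and that number does not grow with the size of the communicator.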

However, the new MPI 3.0 standard (since 2012, I think) has a solution
for exactly this problem: MPI_Neighbor_Alltoallv, which communicates
only with nearest neighbours, using the communicator provided by
MPI_Dist_graph_create_adjacent. The latter lets each process specify its
own nearest neighbours in a distributed way.

So I have written another wrapper class, which does what I want and
scales to the multiprocessor case. The function communicate_neighbors
in the attached file takes an input buffer, an array of neighbour ranks
and an array of block sizes. It then performs the communication and
returns an output buffer, an array of the ranks it received from and an
array of blocks. I have also provided a std::vector interface. I prefer
the vector interface because the user does not need to know a priori
how much memory the communication will require.
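
The buffer layout described above can be sketched as follows. This is
only an illustration under stated assumptions, not the code in the
attached file: blockOffsets and splitBlocks are hypothetical names, and
the blocks are assumed to be stored back to back in the order of the
neighbour array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum: offset of each block inside a contiguous buffer.
std::vector<int> blockOffsets(const std::vector<int>& blockSizes)
{
    std::vector<int> offsets(blockSizes.size(), 0);
    for (std::size_t i = 1; i < blockSizes.size(); ++i)
        offsets[i] = offsets[i - 1] + blockSizes[i - 1];
    return offsets;
}

// Split a contiguous buffer into one vector per neighbour, so the caller
// never has to size the per-neighbour storage by hand.
template <class T>
std::vector<std::vector<T>> splitBlocks(const std::vector<T>& buffer,
                                        const std::vector<int>& blockSizes)
{
    const std::vector<int> offsets = blockOffsets(blockSizes);
    std::vector<std::vector<T>> blocks;
    blocks.reserve(blockSizes.size());
    for (std::size_t i = 0; i < blockSizes.size(); ++i) {
        // Each block must lie entirely inside the buffer.
        assert(offsets[i] + blockSizes[i] <= static_cast<int>(buffer.size()));
        blocks.emplace_back(buffer.begin() + offsets[i],
                            buffer.begin() + offsets[i] + blockSizes[i]);
    }
    return blocks;
}
```

For block sizes {2, 3, 5}, as in the example quoted further down,
blockOffsets gives the offsets 0, 2 and 5 into a buffer of length 10.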

I have commented the functions, so it should be easy to figure out the
exact input-output structure directly from the source file.

I hope this makes people happier.

Best,
Aleksejs

On 22/01/15 16:23, Aleksejs Fomins wrote:
> Dear Markus,
> 
> Thanks again for your reply.
> 
> I am indeed writing the communication interface for the curvilinear grid.
> 
> Up to now I was under the impression that even for sparse
> communication, MPI_Alltoallv was the better way to go since
> 1) I provide to the method all the information I know.
> 2) A supercomputer would be able to optimize the MPI implementation in
> accordance with its internal architecture.
> 
> After your reply, I have read a few more publications on this issue,
> and now I see two further points:
> 3) MPI_Alltoallv memory usage per process increases linearly with
> the number of processes, which is wasteful for large architectures,
> since the amount of memory per process stays the same.
> 4) Most communication in PDE codes is nearest neighbor, so the
> communicator memory usage need not scale with the number of processes.
> 
> I would like to point out that there are use cases for all-to-all
> communication; for example, all Boundary Integral (BI) techniques
> require each process boundary to communicate with every other. In our
> electromagnetic code we use a BI technique to truncate the
> computational domain. At the moment it is implemented internally.
> 
> It is a good point that the DataHandle interface only exhibits nearest
> neighbor communication. I of course agree that I should use a better
> communication paradigm if it is available.
> 
> What would you suggest? I could implement a sparse communication
> pattern in terms of MPI_Send and MPI_Recv. However, I have read that
> there was an effort to design such compact patterns in MPI-3 precisely
> for scalability reasons. Could you suggest any such pattern I could use?
> 
> Regards,
> Aleksejs
> 
> 
> 
> 
> On 22/01/15 13:58, Markus Blatt wrote:
>> Hi,
> 
>> On Thu, Jan 22, 2015 at 09:50:49AM +0100, Aleksejs Fomins wrote:
>>> Dear Dune,
>>>
>>> Following our discussions on POD communication, I have written a
>>> small utility, which might find its place among
>>> CollectiveCommunication methods if people want it. The motivation
>>> is that there are numerous gather and scatter methods in there,
>>> but no wrapper for MPI_Alltoallv, as far as I can tell.
>>>
> 
>> probably because nobody needed it yet.
> 
>>> [... MPI] For example, if process 1 would like to send 2 elements
>>> to process 0, 3 elements to process 1 (self) and 5 elements to
>>> process 2, then "in" would have length 10 and lengthIn={2,3,5} on
>>> process 1.
>>>
> 
>> This question might be really stupid, but up until now I was under
>> the impression that you are working on implementing the
>> communication interface for curvilinear-grid. Where would you need
>> such a functionality for this?
> 
>> Just to be sure (ignore it if I am pointing out the obvious):
>> Please note that from outside it might seem as if the communication
>> in the grid interface is sending from all to all processors. But
>> for large numbers of processors there is always a quite small number
>> of processors that a process might send to or receive from. I am
>> sure that you agree that MPI_Alltoallv is not the weapon of choice
>> here.
> 
>> Markus
> 
> 
> 
> 
>> _______________________________________________ Dune mailing list 
>> Dune at dune-project.org 
>> http://lists.dune-project.org/mailman/listinfo/dune
> 
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-allcommunicate.cc
Type: text/x-c++src
Size: 5973 bytes
Desc: not available
URL: <https://lists.dune-project.org/pipermail/dune/attachments/20150126/b2503419/attachment.cc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: allcommunication.hh
Type: text/x-c++hdr
Size: 10808 bytes
Desc: not available
URL: <https://lists.dune-project.org/pipermail/dune/attachments/20150126/b2503419/attachment.hh>