[Dune-devel] New assembly mode for BCRSMatrix
Markus Blatt
markus at dr-blatt.de
Thu Aug 1 13:02:23 CEST 2013
Hi,
On Wed, Jul 24, 2013 at 11:55:02AM +0200, Steffen Müthing wrote:
> Hi everybody,
>
> as there won't be a GSoC project for the new BCRSMatrix assembly mode, Peter hired a student in Heidelberg
> (Dominic Kempf) to implement it instead (PDELab *really* needs this to get memory usage under control).
>
it is verx nice to see that there is so much interest in this and
people already took the time to implement a prototype. Given the
reactions and willingness to donate mentoring time for this project, I
did not exspect this.
> We now have a prototype that is working extremely well - what's left IMO is ironing out the details like naming and API.
> Here is what we've got right at the moment: There is a new build mode (currently creatively called "mymode") that
> builds the matrix in a way that is similar to PETSc / Trilinos. The new mode has two parameters: The average number
> of nonzeroes per matrix row and an overflow, which is given as a fraction of the total number of nonzeroes in the matrix.
>
What are the advantages of having yet another build mode instead of
implementing it similar to the MTL with an inserter object.
What I liked about their approach was:
1. It does not clutter the actual matrix interface. Instead the
creation is clearly separated from the rest.
2. Code might be reused for other matrices to. Even using an inserter
for a dense matrix to facilitate generic programing is
possible. Further more resusing code to set up PETSc matrices seems
like a viable further extension.
> Example: You have a matrix with 1 000 rows and 25 entries on average per row (25 000 entries in total). Specifying
> an overflow of .05 gives you a total capacity for 25 000 + 1 250 = 26 250 nonzero entries. If you try to add more entries
> than that, you get an exception.
>
Actually, I think specifying an overflow should be
unnecessary. Can't the user incorporate the overflow into the average
number of nonzeros e.g. use 10.1 instead of 10? Therefore I would
prefer skipping it. As floating point numbers for the number of
nonzeros per row seems natural, we could also simply ask the user for
provinding an upper bound of the total number of nonzeros exspected.
I am unsure whether there should be an alternative to throwing an
exception. Think of a productive parallel simulatio and you experience
such an exception after a few days of runtime.
> Those parameters are currently set via setMymodeParameters(), but
> I'm not sure whether that's the best way of doing things.
One more argument for stripping the build mode off into an inserter class.
> After setting the parameters, you set the size of the matrix and can start assembling (there is also a new constructor that
> takes the additional parameters for direct matrix setup in a single step).
>
> In order to avoid slowing down operator[] (and because the CompressedBlockVectorWindows aren't functional at this stage),
> you have to access the matrix entries using entry(row,column) during construction. As long as there are less entries in the
> current row than the specified average, new entries will be placed directly in the matrix; otherwise they end up in a std::map
> for now.
>
IMHO this another strong argument for using an inserter class. Having
different access methods for different stages seems rather confusing
to me. Addtionally, it will not be possible to use existing
algorithms based on our matrix access scheme.
> When you are done with the assembly, you have to finish up the construction process by calling compress(). That call frees the
> map used as intermediate storage and optimizes data storage, but it does not reclaim any excess memory requested via the row
> average and overflow parameters. After the call, you cannot add additional entries anymore and the matrix works just as if it had
> been constructed the traditional way. compress() also returns a struct with some statistics about your matrix: The average and
> maximum number of entries per row, the number of entries that had to be cached in the map and the fraction of the allocated
> memory that is actually in use.
Again this seems clear with an inserter class. There this would simply
appen in the destructor. Still, I have to admit that the user needs to
make sure that the destructor is called by using correct scoping.
>
> During construction, you cannot copy matrices. If a matrix is copied after construction, the copy constructor only allocates as much
> memory as is actually needed by the matrix, throwing away unused overflow space.
>
> The one problem with this approach is the fact that matrix entries are created on the fly, and if they are of type FieldMatrix with a
> primitive field type, you run into FS1322.
>
> We did some benchmarking with this new scheme, and it really improves memory usage and (initial) assembly time quite a bit: In
> PDELab (wich right now uses a std::vector<std::unordered_set<std::size_t> > to build up the pattern before creating the matrix),
> memory usage goes down by a factor of at least 2 (and a lot more if you have a large pattern), and initial assembly is sped up as well
> (depending on pattern size, the improvements range from just a few percent up to a reduction by a factor of 2 or 3). Interestingly, the
> mode is quite tolerant regarding the overflow area: I've tried with large matrices, and even if you end up with a million entries in the
> map, performance only goes down by something like 10-20%. If you are really concerned about matrix size, you can use this to
> compensate for the smaller pattern at the domain boundary: Just specify a smaller row average and add enough overflow. ;-)
>
Sounds promissing.
> You can find Dominic's code on GitHub (https://github.com/dokempf/dune-istl/tree/feature/new-fast-build-mode), and if you have
> PDELab, you can play with the new build mode on the branch
> p/muething/new-bcrs-matrix. Check out
> dune/pdelab/test/testpoisson.cc
> as an example. Just note that you might need the fix from FS1322. The code in ISTL isn't really cleaned up yet, but I thought it might
> be a good time to get some initial feedback.
>
I am not sure when I will find the time to test this.
> Any comments / ideas / problems / naming suggestions?
>
Have you thought about exting a sparsity pattern of an already setup
matrix. For the AMG for non-overlapping grids Rebecca might take
advantage of this.
Cheers,
Markus
--
Do you need more support with DUNE or HPC in general?
Dr. Markus Blatt - HPC-Simulation-Software & Services http://www.dr-blatt.de
Hans-Bunte-Str. 8-10, 69123 Heidelberg, Germany
Tel.: +49 (0) 160 97590858 Fax: +49 (0)322 1108991658
More information about the Dune-devel
mailing list