[Dune-devel] New assembly mode for BCRSMatrix

Steffen Müthing steffen.muething at iwr.uni-heidelberg.de
Wed Jul 24 11:55:02 CEST 2013


Hi everybody,

as there won't be a GSoC project for the new BCRSMatrix assembly mode, Peter hired a student in Heidelberg
(Dominic Kempf) to implement it instead (PDELab *really* needs this to get memory usage under control).

We now have a prototype that is working extremely well - what's left IMO is ironing out the details like naming and API.
Here is what we've got right at the moment: There is a new build mode (currently creatively called "mymode") that
builds the matrix in a way that is similar to PETSc / Trilinos. The new mode has two parameters: The average number
of nonzeroes per matrix row and an overflow, which is given as a fraction of the total number of nonzeroes in the matrix.

Example: You have a matrix with 1 000 rows and 25 entries on average per row (25 000 entries in total). Specifying
an overflow of .05 gives you a total capacity for 25 000 + 1 250 = 26 250 nonzero entries. If you try to add more entries
than that, you get an exception.

Those parameters are currently set via setMymodeParameters(), but I'm not sure whether that's the best way of doing things.
After setting the parameters, you set the size of the matrix and can start assembling (there is also a new constructor that
takes the additional parameters for direct matrix setup in a single step).

In order to avoid slowing down operator[] (and because the CompressedBlockVectorWindows aren't functional at this stage),
you have to access the matrix entries using entry(row,column) during construction. As long as there are less entries in the
current row than the specified average, new entries will be placed directly in the matrix; otherwise they end up in a std::map
for now.

When you are done with the assembly, you have to finish up the construction process by calling compress(). That call frees the
map used as intermediate storage and optimizes data storage, but it does not reclaim any excess memory requested via the row
average and overflow parameters. After the call, you cannot add additional entries anymore and the matrix works just as if it had
been constructed the traditional way. compress() also returns a struct with some statistics about your matrix: The average and
maximum number of entries per row, the number of entries that had to be cached in the map and the fraction of the allocated
memory that is actually in use.

During construction, you cannot copy matrices. If a matrix is copied after construction, the copy constructor only allocates as much
memory as is actually needed by the matrix, throwing away unused overflow space.

The one problem with this approach is the fact that matrix entries are created on the fly, and if they are of type FieldMatrix with a
primitive field type, you run into FS1322.

We did some benchmarking with this new scheme, and it really improves memory usage and (initial) assembly time quite a bit: In
PDELab (wich right now uses a std::vector<std::unordered_set<std::size_t> > to build up the pattern before creating the matrix),
memory usage goes down by a factor of at least 2 (and a lot more if you have a large pattern), and initial assembly is sped up as well
(depending on pattern size, the improvements range from just a few percent up to a reduction by a factor of 2 or 3). Interestingly, the
mode is quite tolerant regarding the overflow area: I've tried with large matrices, and even if you end up with a million entries in the
map, performance only goes down by something like 10-20%. If you are really concerned about matrix size, you can use this to
compensate for the smaller pattern at the domain boundary: Just specify a smaller row average and add enough overflow. ;-)

You can find Dominic's code on GitHub (https://github.com/dokempf/dune-istl/tree/feature/new-fast-build-mode), and if you have
PDELab, you can play with the new build mode on the branch p/muething/new-bcrs-matrix. Check out dune/pdelab/test/testpoisson.cc
as an example. Just note that you might need the fix from FS1322. The code in ISTL isn't really cleaned up yet, but I thought it might
be a good time to get some initial feedback.

Any comments / ideas / problems / naming suggestions?

I'm mostly unsure about the best way to specify the additional parameters (having an extra method just for this mode just irks me).
Moreover, we might want to also allow people to pass an array with the exact per-row sizes, as those will often be quite easy to
calculate. That way, you directly end up with a perfectly sized matrix. Might be especially important for p-adaptivity...

Best,
Steffen


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 535 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <https://lists.dune-project.org/pipermail/dune-devel/attachments/20130724/5b7a2003/attachment.sig>


More information about the Dune-devel mailing list