[Dune-devel] [GSoC2013][Proposal] Projects 2: Inserter & 4: Performance

Sam Skalicky (RIT Student) sxs5464 at rit.edu
Sat Apr 27 23:08:44 CEST 2013


Hello DUNE Developers!

My name is Sam Skalicky, and I’m a graduate student in computer science at
Rochester Institute of Technology (RIT). I’m interested in two of the
suggested project ideas, and I think I can contribute to either one. Below
I describe my background for each project. I am looking for feedback on
which project would be more beneficial to the community, as well as
suggestions for improvement.

--------------------------------------------

Project 2: Implement better sparse matrix creation through use of an
inserter object

Motivation: I’d like to work on this project because I have previously
written a custom sparse matrix implementation in C++ for my research. In
that work I needed to operate on graphs with hundreds of millions of nodes,
so we needed a minimal solution. I also have in-depth experience writing
custom C++ STL allocators that use memory mapping (the data structures took
hundreds of GB, more than we had in RAM).

The suggested paper has an interesting solution for the specific case when
the number of non-zero entries per row/slot is known (or can be closely
estimated). I am also interested in characterizing the technique: how much
memory is over-committed when the slot size is overestimated, and how much
overall performance degrades when the slot size is underestimated.

Methodology: To complete this project I propose the following milestones:

1. Implement memory reservation when slot size is given

a. Assume that slots are never overfilled

b. Implement unit tests

2. Implement additional map container for slot overflow (spare container)

a. Implement unit tests

3. Evaluate memory usage & performance to characterize implementation

a. Implement full system test cases
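
To make milestones 1 and 2 concrete, here is a minimal sketch of the idea,
not DUNE code and not the paper's exact scheme. The class name SlotInserter
and all of its members are hypothetical. It assumes a fixed slot size per
row, scalar double entries, and a std::map as the spare container for
entries that do not fit in a row's slot.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical illustration only; this is not the dune-istl matrix API.
class SlotInserter {
public:
    // Milestone 1: reserve rows * slotSize entries up front.
    SlotInserter(std::size_t rows, std::size_t slotSize)
        : slotSize_(slotSize),
          cols_(rows * slotSize),
          vals_(rows * slotSize, 0.0),
          used_(rows, 0) {}

    // Add v to entry (r, c), creating the entry if it does not exist yet.
    void add(std::size_t r, std::size_t c, double v) {
        assert(r < used_.size());
        const std::size_t base = r * slotSize_;
        for (std::size_t k = 0; k < used_[r]; ++k) {
            if (cols_[base + k] == c) { vals_[base + k] += v; return; }
        }
        if (used_[r] < slotSize_) {
            // Free space left in the row's slot: store the new entry there.
            cols_[base + used_[r]] = c;
            vals_[base + used_[r]] = v;
            ++used_[r];
        } else {
            // Milestone 2: slot is full, fall back to the spare container.
            // operator[] value-initializes a new entry to 0.0.
            overflow_[{r, c}] += v;
        }
    }

    // Read entry (r, c); entries that were never inserted read as 0.
    double get(std::size_t r, std::size_t c) const {
        assert(r < used_.size());
        const std::size_t base = r * slotSize_;
        for (std::size_t k = 0; k < used_[r]; ++k) {
            if (cols_[base + k] == c) return vals_[base + k];
        }
        const auto it = overflow_.find({r, c});
        return it == overflow_.end() ? 0.0 : it->second;
    }

    // Simple counters for the evaluation in milestone 3.
    std::size_t overflowCount() const { return overflow_.size(); }
    std::size_t reservedEntries() const { return cols_.size(); }
    std::size_t usedEntries() const {
        std::size_t n = 0;
        for (std::size_t u : used_) n += u;
        return n;
    }

private:
    std::size_t slotSize_;
    std::vector<std::size_t> cols_;   // column index per reserved entry
    std::vector<double> vals_;        // value per reserved entry
    std::vector<std::size_t> used_;   // filled entries per row
    std::map<std::pair<std::size_t, std::size_t>, double> overflow_;
};
```

In this sketch, reservedEntries() minus usedEntries() gives the
over-committed memory in entries, and overflowCount() shows how often an
underestimated slot size pushes insertions onto the slower map path.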


--------------------------------------------

Project 4: Performance testing to evaluate memory usage and run time for
single- and multi-processor executions of DUNE (grid)

Motivation: I have recently had a paper accepted
<http://samskalicky.wordpress.com/2013/04/27/paper-accepted-processor-performance-comparision/>
comparing the performance of CPU, GPU, and FPGA architectures for linear
algebra
computations: dot product, matrix-vector multiply, matrix-matrix multiply,
matrix inverse, and matrix decomposition. In this analysis I investigated
multiple implementations for each processor architecture (in particular for
CPU: Matlab & AMD C Math Library (ACML)).

I have experience programming with MPI on 64-processor clusters and
evaluating performance aspects such as node compute time, node idle time,
control message transmit time, and data transmit time. At RIT we have a
multiple processor systems class with many projects requiring this
cluster-based implementation strategy. In addition, I use the research
computing resources at RIT for my graph research.

Methodology: To complete this project I propose the following milestones:

1. Implement a small set of performance tools to measure execution time and
memory allocation

2. Choose a few routines whose performance will be evaluated

3. Design performance output that is tailored to developers’ needs

4. Integrate these performance tools with DUNE unit tests

5. Implement tests on larger system level routines

6. Integrate with system level unit tests

7. Document procedure to allow developers to integrate performance testing
into future (and any other existing) routines
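
As a rough sketch of what the tools in milestone 1 could look like,
assuming only the C++ standard library: the names timeBest, AllocStats and
CountingAllocator are hypothetical and are not part of DUNE. The timer
reports the best wall-clock time over several runs, and the allocator
counts bytes requested by any standard container that is instantiated with
it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: run f() `repeats` times and return the shortest
// wall-clock time in seconds.
template <class F>
double timeBest(F&& f, int repeats = 5) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        const auto t0 = clock::now();
        f();
        const std::chrono::duration<double> dt = clock::now() - t0;
        if (best < 0.0 || dt.count() < best) best = dt.count();
    }
    return best;
}

// Running total of bytes requested through CountingAllocator.
// Not thread-safe; a real tool would need atomics or per-thread counters.
struct AllocStats {
    static std::size_t& bytes() {
        static std::size_t b = 0;
        return b;
    }
};

// Minimal allocator that forwards to std::allocator and counts the bytes
// requested. It does not track deallocations or peak usage.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        AllocStats::bytes() += n * sizeof(T);
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}
```

A unit test could then wrap the routine under test in timeBest and compare
AllocStats::bytes() before and after the call, which is the kind of hook
milestones 4 and 6 would integrate into the existing tests.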

--------------------------------------------