GPUs deliver high performance for the operation C = A * B when A, B, and C are dense matrices. If A and B are both sparse, however, the sparsity pattern of C is not known in advance and memory accesses become irregular, so additional techniques are required to get reasonable performance. The aim of this project is to implement a toolkit of algorithms that analyze the sparsity patterns and can then be composed to yield a fast sparse matrix-matrix multiplication.
Moreover, the implementations should be tuned for GPUs from NVIDIA and AMD as well as for Intel's MIC platform.
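To make the role of sparsity-pattern analysis concrete, the following is a minimal sequential C++ sketch of a row-wise (Gustavson-style) sparse matrix-matrix product in CSR format. It is only a CPU reference under stated assumptions, not the GPU implementation the project aims for: the names CsrMatrix, spgemm_symbolic, and spgemm are hypothetical, and the split into a symbolic pass (pattern only) and a numeric pass (values) is one common way to structure such a kernel.

```cpp
#include <cstddef>
#include <vector>

// Compressed sparse row (CSR) storage. All names here are illustrative.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// Symbolic phase: inspect only the sparsity patterns of A and B and
// return the row pointer array of C = A * B.
std::vector<std::size_t> spgemm_symbolic(const CsrMatrix& A, const CsrMatrix& B) {
    const std::size_t unseen = static_cast<std::size_t>(-1);
    std::vector<std::size_t> row_ptr(A.rows + 1, 0);
    std::vector<std::size_t> marker(B.cols, unseen);  // last row of C that touched column j
    for (std::size_t i = 0; i < A.rows; ++i) {
        std::size_t count = 0;
        for (std::size_t ka = A.row_ptr[i]; ka < A.row_ptr[i + 1]; ++ka) {
            const std::size_t k = A.col_idx[ka];
            for (std::size_t kb = B.row_ptr[k]; kb < B.row_ptr[k + 1]; ++kb) {
                const std::size_t j = B.col_idx[kb];
                if (marker[j] != i) {  // first time column j appears in row i
                    marker[j] = i;
                    ++count;
                }
            }
        }
        row_ptr[i + 1] = row_ptr[i] + count;
    }
    return row_ptr;
}

// Numeric phase: fill column indices and values using the row pointers
// computed by the symbolic phase. Column indices within a row are left
// unsorted, and entries that cancel numerically are kept as explicit zeros.
CsrMatrix spgemm(const CsrMatrix& A, const CsrMatrix& B) {
    const std::size_t unseen = static_cast<std::size_t>(-1);
    CsrMatrix C{A.rows, B.cols, spgemm_symbolic(A, B), {}, {}};
    C.col_idx.resize(C.row_ptr.back());
    C.values.resize(C.row_ptr.back());
    std::vector<std::size_t> position(B.cols, unseen);  // slot of column j in C
    for (std::size_t i = 0; i < A.rows; ++i) {
        const std::size_t row_start = C.row_ptr[i];
        std::size_t next = row_start;
        for (std::size_t ka = A.row_ptr[i]; ka < A.row_ptr[i + 1]; ++ka) {
            const std::size_t k = A.col_idx[ka];
            const double a_ik = A.values[ka];
            for (std::size_t kb = B.row_ptr[k]; kb < B.row_ptr[k + 1]; ++kb) {
                const std::size_t j = B.col_idx[kb];
                const double product = a_ik * B.values[kb];
                if (position[j] == unseen || position[j] < row_start) {
                    // Column j is new in row i: claim the next free slot.
                    position[j] = next;
                    C.col_idx[next] = j;
                    C.values[next] = product;
                    ++next;
                } else {
                    // Column j already has a slot in row i: accumulate.
                    C.values[position[j]] += product;
                }
            }
        }
    }
    return C;
}
```

Because the symbolic pass fixes the memory layout of C before any values are computed, the numeric pass never has to resize its output. That property is what makes this kind of split attractive as a starting point for a parallel version, where each row (or group of rows) could be assigned to a separate work group or thread block.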
Benefit for the Student
The student will get hands-on experience in GPU programming using both OpenCL and CUDA. In particular, the student will learn the optimization techniques required to obtain high performance.
Benefit for the Project
Sparse matrix-matrix multiplication is a key building block for algebraic multigrid solvers and preconditioners. A fast sparse matrix-matrix multiplication will therefore directly improve the efficiency of such methods.
Experience with either OpenCL or CUDA is desired.