Description

The memory on a single graphics adapter typically limits dense matrices to at most about 10,000 by 10,000 entries. However, many applications need to handle much larger matrices, which can be achieved by distributing the matrix data across multiple computing nodes. At the API level, library users wish to have the distributed data handled automatically, as if the matrix were located on a single GPU.

The student should implement such a distributed matrix type and provide basic linear algebra operations for it, such as matrix addition as well as matrix-matrix and matrix-vector multiplication. Internally, Boost.MPI should be used for the communication between nodes.

Benefit for the Student

The student will get hands-on experience in high-performance computing and will tackle the challenges of distributed computing first-hand.

Benefit for the Project

Many linear algebra libraries offering GPU acceleration are limited to shared-memory systems and require that all data fits into GPU memory. ViennaCL will be one of the first open source libraries to support GPU acceleration for large-scale problems.

Requirements

The student should be familiar with basic linear algebra, i.e. matrices and vectors. Moderate C and C++ knowledge is sufficient. Familiarity with MPI is desired. Access to a machine with a mid- to high-range graphics adapter is beneficial, but not mandatory.

Mentors

Josef Weinbub, Karl Rupp