2015-viennacl-openmp

Differences

This shows you the differences between two versions of the page.

2015-viennacl-openmp [2015/03/09 13:04] viennastar
2015-viennacl-openmp [2015/03/24 20:40] viennastar Added puzzle
Line 4:
  
 === Description ===
-ViennaCL has three computing backends: one based on CUDA, one based on OpenCL, and one based on OpenMP. While the CUDA and OpenCL backends provide high performance, this is not yet the case with the OpenMP backend. Although the OpenMP backend was initially introduced as a fallback mechanism for CPU-only systems, it is now mature enough to be tuned for high performance. The student will tune the individual linear algebra kernels (vector operations, matrix-vector products, etc.) for best performance.
+Our free open source linear algebra library [[http://viennacl.sourceforge.net|ViennaCL]] has three computing backends: one based on [[http://www.nvidia.com/object/cuda_home_new.html|CUDA]], one based on [[https://www.khronos.org/opencl/|OpenCL]], and one based on [[http://openmp.org/|OpenMP]]. While the CUDA and OpenCL backends provide high performance, this is not yet the case with the OpenMP backend. Although the OpenMP backend was initially introduced as a fallback mechanism for CPU-only systems, it is now mature enough to be tuned for high performance. The student will tune the individual linear algebra kernels (vector operations, matrix-vector products, etc.) for best performance.
  
 === Benefit for the Student ===
Line 16:
 Moderate C or C++ skills are required. Experience in using OpenMP is a plus.
  
-=== Mentors ===
-Karl Rupp, Josef Weinbub
+=== Primary Mentor ===
+Karl Rupp
  
 +=== Puzzles ===
  
 +Write a C or C++ program using OpenMP which computes the dense matrix-matrix product B = A * A for a double-precision floating point square matrix A in row-major storage (i.e. the element [i,j] is located at i*N + j in the underlying array) as fast as possible. The size of A should be a command line parameter of your program. Plot the obtained performance for matrix sizes between 10 and 2000 as a function of the number of threads and comment on the results.
 +
 +Contact rupp_AT_iue.tuwien.ac.at or stop by the institute if you have questions. Submit the code with your application.
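
For orientation, the row-major indexing and the product B = A * A from the puzzle can be sketched as below. This is a minimal, untuned baseline and not a reference solution: the function name `square_matrix`, the i-k-j loop order, and the static schedule are assumptions made for illustration, and B is assumed not to alias A.

```c
#include <stddef.h>

/* Baseline sketch (hypothetical helper, not part of ViennaCL):
   computes B = A * A for an N x N matrix in row-major storage,
   i.e. element [i,j] is located at index i*N + j.
   B must not overlap A. */
void square_matrix(const double *A, double *B, size_t N)
{
    /* Each iteration of the outer loop writes only row i of B,
       so threads never write to the same element. The signed loop
       index keeps the loop valid for older OpenMP implementations. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)N; ++i) {
        double *Bi = B + (size_t)i * N;

        for (size_t j = 0; j < N; ++j)
            Bi[j] = 0.0;

        /* i-k-j order: the innermost loop reads row k of A and
           updates row i of B, both with unit stride. */
        for (size_t k = 0; k < N; ++k) {
            const double aik = A[(size_t)i * N + k];
            const double *Ak = A + k * N;
            for (size_t j = 0; j < N; ++j)
                Bi[j] += aik * Ak[j];
        }
    }
}
```

With GCC or Clang the pragma takes effect when compiling with `-fopenmp`; without that flag the code still builds and runs serially. The thread count can then be varied through the `OMP_NUM_THREADS` environment variable. Reading N from the command line, timing, and further tuning such as cache blocking are left out of this sketch.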
  
2015-viennacl-openmp.txt · Last modified: 2015/03/24 20:40 by viennastar