fft

This project deals with computing the fast Fourier transform (FFT) and singular value decompositions (SVD) using GPU acceleration. Initial implementations based entirely on OpenCL are available; these should be revisited and extended to CUDA and OpenMP. The student will thus learn how to program GPUs with CUDA and OpenCL and gain some experience with OpenMP. Successful implementations will be made publicly available through the free open-source library ViennaCL.

Juraj Kabzan

The following milestones will help you complete the project successfully. The reimbursement listed for each milestone is indicative, as we hope that you will reach and complete the final milestone anyway.

**Familiarize with FFT:** Investigate and understand the current implementation of FFT routines in ViennaCL. Familiarize yourself with how OpenCL kernels are called and how multiple backends are supported in other routines.
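To make the idea of multiple backends concrete, the following is a minimal sketch of dispatching one operation based on where the data lives. All names (`memory_domain`, `toy_vector`, `host_backend::scale`) are hypothetical and chosen for illustration only; they are not ViennaCL identifiers, and the actual dispatch mechanism should be read from the ViennaCL sources. Only the host path is implemented here; the OpenCL and CUDA branches are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical memory domains (illustrative names, not ViennaCL's).
enum memory_domain { HOST_MEMORY, OPENCL_DEVICE_MEMORY, CUDA_DEVICE_MEMORY };

// Toy container that remembers which backend owns its data.
struct toy_vector
{
  memory_domain domain;
  std::vector<double> data;
};

namespace host_backend
{
  // Host implementation; parallelized with OpenMP when it is enabled.
  void scale(toy_vector & v, double alpha)
  {
    const long n = static_cast<long>(v.data.size());
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long i = 0; i < n; ++i)
      v.data[static_cast<std::size_t>(i)] *= alpha;
  }
}

// Front-end routine: selects the backend from the memory domain of the operand.
void scale(toy_vector & v, double alpha)
{
  switch (v.domain)
  {
    case HOST_MEMORY:
      host_backend::scale(v, alpha);
      break;
    case OPENCL_DEVICE_MEMORY:
      // An OpenCL backend would enqueue a kernel here.
      throw std::runtime_error("OpenCL backend is not part of this sketch");
    case CUDA_DEVICE_MEMORY:
      // A CUDA backend would launch a kernel here.
      throw std::runtime_error("CUDA backend is not part of this sketch");
  }
}
```

The point of this layout is that user code calls a single `scale()` regardless of backend, so adding CUDA or OpenMP support to a routine means filling in one more branch rather than changing the public interface.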

**Refactor FFT:** Extend the current implementation of FFT to the CUDA and OpenMP backends. Update the current test suite (also considering corner cases) for the nightly builds. Update the documentation accordingly. Ensure that there are no severe performance bottlenecks.
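One possible way to test a new backend, including corner cases such as sizes 0 and 1, is to compare its output against a slow but obviously correct reference DFT. The sketch below is not ViennaCL code: `reference_dft`, `fft_radix2`, and `max_abs_diff` are hypothetical helper names, and the radix-2 algorithm shown is a textbook variant that assumes a power-of-two size. The OpenMP pragma only takes effect when OpenMP is enabled at compile time.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::complex<double> cplx;

// Naive O(n^2) forward DFT, intended only as a reference for tests.
std::vector<cplx> reference_dft(const std::vector<cplx> & in)
{
  const std::size_t n = in.size();
  const double pi = std::acos(-1.0);
  std::vector<cplx> out(n);
  for (std::size_t k = 0; k < n; ++k)
  {
    cplx sum(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j)
    {
      const double angle = -2.0 * pi * double((j * k) % n) / double(n);
      sum += in[j] * cplx(std::cos(angle), std::sin(angle));
    }
    out[k] = sum;
  }
  return out;
}

// In-place iterative radix-2 forward FFT; the size must be a power of two.
void fft_radix2(std::vector<cplx> & a)
{
  const std::size_t n = a.size();
  if (n < 2) return;  // sizes 0 and 1 are already their own transform
  assert((n & (n - 1)) == 0 && "size must be a power of two");
  const double pi = std::acos(-1.0);

  // Bit-reversal permutation.
  for (std::size_t i = 1, j = 0; i < n; ++i)
  {
    std::size_t bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) std::swap(a[i], a[j]);
  }

  // Butterfly stages; blocks within one stage are independent of each other.
  for (std::size_t len = 2; len <= n; len <<= 1)
  {
    const double angle = -2.0 * pi / double(len);
    const long blocks = static_cast<long>(n / len);
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long b = 0; b < blocks; ++b)
    {
      const std::size_t start = static_cast<std::size_t>(b) * len;
      for (std::size_t k = 0; k < len / 2; ++k)
      {
        const cplx w(std::cos(angle * double(k)), std::sin(angle * double(k)));
        const cplx u = a[start + k];
        const cplx v = a[start + k + len / 2] * w;
        a[start + k] = u + v;
        a[start + k + len / 2] = u - v;
      }
    }
  }
}

// Largest element-wise deviation between two results of equal size.
double max_abs_diff(const std::vector<cplx> & x, const std::vector<cplx> & y)
{
  assert(x.size() == y.size());
  double err = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
    err = std::max(err, std::abs(x[i] - y[i]));
  return err;
}
```

A test would fill a vector, compute `reference_dft` on a copy, run the backend under test, and require `max_abs_diff` to stay below a tolerance appropriate for the precision in use.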

**Salary: EUR 700**

**Refactor NMF:** Refactor the nonnegative matrix factorization code from only OpenCL to all three compute backends (CUDA, OpenCL, OpenMP). This involves the following tasks:
- Port the kernel `mul_div_kernel` to all three compute backends.
- Modify the interface to `nmf()` such that `matrix_base` instead of `matrix` is accepted.
- Adjust the nightly test such that all three backends are tested and that the execution time is reduced by about a factor of five.
- Adjust the documentation in the manual accordingly.
- Create an example in `examples/tutorial/` explaining the usage of the nmf-routine.
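As a starting point for the OpenMP port, the following is a hedged host-side sketch of an element-wise multiply-divide step. It assumes that `mul_div_kernel` implements the usual multiplicative NMF update, i.e. each entry is multiplied by a numerator entry and divided by a denominator entry, with a guard against tiny denominators. That assumption, the function name `mul_div`, and the threshold `eps` are illustrative and must be checked against the existing OpenCL kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host analogue of an element-wise multiply-divide kernel:
//   a[i] <- a[i] * num[i] / den[i]   if den[i] > eps
//   a[i] <- 0                        otherwise (avoids division by ~0)
// The threshold eps is an assumed value, not taken from ViennaCL.
void mul_div(std::vector<double> & a,
             const std::vector<double> & num,
             const std::vector<double> & den,
             double eps = 1e-5)
{
  assert(a.size() == num.size() && a.size() == den.size());
  const long n = static_cast<long>(a.size());
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (long i = 0; i < n; ++i)
  {
    const std::size_t k = static_cast<std::size_t>(i);
    const double d = den[k];
    a[k] = (d > eps) ? a[k] * num[k] / d : 0.0;
  }
}
```

Because every entry is updated independently, the same loop body maps directly onto one work-item per entry in OpenCL or one thread per entry in CUDA.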

**Salary: EUR 300**

**TBD**

**Salary: EUR 700**

Plenty of tutorials for CUDA and OpenCL can be found on the web. A small selection follows:

https://developer.nvidia.com/cuda-education-training

http://www.nvidia.com/docs/IO/116711/sc11-cuda-c-basics.pdf

http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/introductory-tutorial-to-opencl/

http://www.cc.gatech.edu/~vetter/keeneland/tutorial-2011-04-14/06-intro_to_opencl.pdf

fft.1421937883.txt.gz · Last modified: 2015/01/22 14:44 by viennastar