Description

Statistical agencies and other organisations are restricted by privacy laws in how they may release their data to researchers and the public. In general, these laws aim to prevent the re-identification of individuals. With the free and open-source R package sdcMicro it is possible to perturb the original data while minimising the information loss. Since the package is used by subject matter specialists who are often not trained in command-line coding, a graphical user interface (GUI), written with the help of the RGtk2 package, had to be designed and developed. The student should investigate the package source and the GUI, and further develop the GUI and the corresponding methods. Most of these methods need to be written in a compiled language such as ANSI C or C++ and called from R so that calculations on large data sets are fast.
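To illustrate what "written in a compiled language and called from R" can look like, here is a minimal sketch in C++. It is not taken from sdcMicro: the function name key_frequencies and its argument layout are assumptions made for this example. For each record it counts how many records share the same combination of categorical key variables, a typical building block when assessing re-identification risk. The signature follows the convention of R's .C() interface, where every argument is passed as a pointer and results are written into a buffer allocated by the caller.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical routine (not part of sdcMicro): for each record, count how
// many records share the same combination of categorical key variables.
// All arguments are pointers, as required by R's .C() calling convention.
extern "C" void key_frequencies(const int* keys,    // n_rows x n_keys, column-major like an R matrix
                                const int* n_rows,  // number of records
                                const int* n_keys,  // number of key variables
                                int* freq)          // output buffer of length n_rows
{
    const std::size_t n = static_cast<std::size_t>(*n_rows);
    const std::size_t p = static_cast<std::size_t>(*n_keys);

    std::vector<std::vector<int>> rows(n, std::vector<int>(p));
    std::map<std::vector<int>, int> counts;

    // First pass: collect each record's key combination and tally it.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < p; ++j) {
            rows[i][j] = keys[i + j * n];
        }
        ++counts[rows[i]];
    }

    // Second pass: report, per record, the size of its key combination group.
    for (std::size_t i = 0; i < n; ++i) {
        freq[i] = counts[rows[i]];
    }
}
```

Assuming the file is compiled into a shared library (for example with R CMD SHLIB) and loaded with dyn.load(), a call from R could look like .C("key_frequencies", as.integer(m), nrow(m), ncol(m), freq = integer(nrow(m)))$freq for an integer matrix m. A record with frequency 1 is unique on its key variables and therefore a candidate for perturbation.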

Benefit for the Student

The student will learn the basics of statistical disclosure control and will gain experience with a state-of-the-art tool in that area. Moreover, the student will improve their programming skills by working with several free and open-source languages.

Benefit for the Project

Improving the computational speed of the methods and the graphical user interface that calls them puts powerful free and open-source software into the hands of organisations that have to provide data but must also respect privacy laws.

Requirements

The student should have skills in object-oriented programming, especially in C++, and in the statistical environment R. Knowledge of the multi-platform toolkit GTK is preferable but not required beforehand.

Mentors

Peter Filzmoser, Matthias Templ

Contact

Send an email to [address hidden by spam protection] (first subscribe here) using the prefix [R].

More Information

Links to related papers and software can be found in the wiki.