3.2.2 HDF

HDF (Hierarchical Data Format) is a binary file format and is therefore not intended to be human readable. It is maintained at the NCSA (National Center for Supercomputing Applications) of the University of Illinois at Urbana-Champaign. A binary format was chosen for reasons of performance and disk space: a binary file is considerably smaller than an equivalent ASCII file, and the smaller file is also faster to read. Size is not a concern for small structures (e.g. two-dimensional problems), and it is not the limiting factor for three-dimensional problems either, but it can still become a nuisance when, for example, an archive of simulation results from realistic three-dimensional Wafers must be managed (footnote 3.7).
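The size argument can be illustrated with a minimal sketch that stores the same numbers once as raw binary doubles and once as ASCII text. The sample data, the one-number-per-line layout, and the names `values`, `binary_blob` and `ascii_blob` are assumptions made for illustration only; they reflect neither the HDF nor the WSS file layout.

```python
import math
import struct

# Hypothetical sample data: 1000 double-precision values standing in for
# one quantity stored on a simulation grid.
values = [math.sin(0.01 * i) for i in range(1000)]

# Binary representation: 8 bytes per IEEE 754 double, little-endian.
binary_blob = struct.pack("<%dd" % len(values), *values)

# ASCII representation: one number per line, written with 17 significant
# digits so that every double can be read back exactly.
ascii_blob = "".join("%.17g\n" % v for v in values).encode("ascii")

print("binary:", len(binary_blob), "bytes")
print("ASCII: ", len(ascii_blob), "bytes")
```

The binary form always needs exactly 8 bytes per value, whereas the ASCII form needs one byte per printed character plus a separator. An ASCII writer that prints fewer digits produces a smaller file, but then no longer preserves the full precision of the data.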

HDF was chosen as the binary persistent Wafer representation for several reasons. First, it is widespread and well established in scientific applications. Second, it provides so-called parallel I/O, which allows an HDF file to be spread over several disk drives located in different workstations. This gives scalable I/O bandwidth and makes HDF well suited to storing and retrieving very large amounts of data. The NCSA provides two completely different and mutually incompatible versions of HDF (versions four and five). Although version four has the advantage that many scientific applications and visualization programs support it to date, it has several shortcomings compared to version five:

Because of these drawbacks, HDF version five was chosen over version four. As with WSS, both Reader and Writer interfaces are implemented [35] using the HDF library. When this module was developed, the C++ API to HDF was not yet available, so the C API was used.

Table 3.1 compares WSS and HDF for different data compression levels. The timings were measured on an Intel PII-266 under Linux. The difference in reading times is due not only to the larger size of the ASCII file format, but also to the fact that the WSS Reader was optimized for memory demand rather than for speed. Depending on the ordering of the grid sections in a WSS file, some grid sections are parsed several times in order to avoid temporary storage of grids in memory.
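The effect of a compression level can be sketched with Python's standard `zlib` module. This assumes that the compression levels in the comparison correspond to deflate (zlib) levels, which the text does not state explicitly. The repeating sample data and the names `raw`, `stored` and `packed` are hypothetical, and the sketch does not call the HDF library.

```python
import struct
import zlib

# Hypothetical grid data: 10000 doubles with a short repeating pattern,
# chosen only because it compresses well.
raw = struct.pack("<10000d", *[float(i % 16) for i in range(10000)])

# Level 0 stores the data without compressing it (framing overhead only).
stored = zlib.compress(raw, 0)

# Level 5 is a medium setting on zlib's scale from 1 (fast) to 9 (small).
packed = zlib.compress(raw, 5)

print("raw:    ", len(raw), "bytes")
print("level 0:", len(stored), "bytes")
print("level 5:", len(packed), "bytes")

# Decompression restores the original bytes exactly.
restored = zlib.decompress(packed)
```

How much a given level saves depends strongly on the data, so the sizes printed here say nothing about the compression achieved for real simulation results.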

Table 3.1: Comparison of WSS and HDF. The comparable writing time of the WSS file format indicates that the writing overhead of both file formats is of the same order of magnitude.

                    HDF (compression level 5)   HDF (no compression)   WSS
file size [kB]      227.2                       684.9                  1015.8
reading time [s]    1.31                        1.22                   61.32
writing time [s]    1.53                        1.25                   1.35
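The relative differences implied by Table 3.1 can be worked out with a few lines of arithmetic. The numbers below are copied from the table; the dictionary keys and the helper `ratio` are names introduced here for illustration.

```python
# Values copied from Table 3.1.
file_size_kb = {"hdf_level5": 227.2, "hdf_none": 684.9, "wss": 1015.8}
reading_time_s = {"hdf_level5": 1.31, "hdf_none": 1.22, "wss": 61.32}
writing_time_s = {"hdf_level5": 1.53, "hdf_none": 1.25, "wss": 1.35}


def ratio(row, numerator, denominator):
    """Return row[numerator] / row[denominator] for one table row."""
    return row[numerator] / row[denominator]


size_wss_vs_hdf = ratio(file_size_kb, "wss", "hdf_none")
size_wss_vs_hdf5 = ratio(file_size_kb, "wss", "hdf_level5")
read_wss_vs_hdf = ratio(reading_time_s, "wss", "hdf_none")
write_hdf5_vs_hdf = ratio(writing_time_s, "hdf_level5", "hdf_none")

print("size, WSS / HDF (no compression):       %.2f" % size_wss_vs_hdf)
print("size, WSS / HDF (level 5):              %.2f" % size_wss_vs_hdf5)
print("reading time, WSS / HDF (no compr.):    %.1f" % read_wss_vs_hdf)
print("writing time, HDF level 5 / HDF (none): %.2f" % write_hdf5_vs_hdf)
```

The WSS file is roughly 1.5 times the size of the uncompressed HDF file and roughly 4.5 times the size of the compressed one. Reading it takes about 50 times as long, which is far more than the size difference alone would explain and is consistent with the repeated parsing of grid sections by the WSS Reader.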


2003-03-27