An additional speed-up can be realized by integrating the initial value problems of the stabilized march algorithm in parallel and, to a minor extent, by evaluating the required matrix operations in parallel (Table 6.4). Since these operations dominate the overall run-time, the run-time would decrease approximately in inverse proportion to the number of parallel processors. The PVM library [250,251] provides a powerful environment for the parallelization of computer programs. Due to the modular structure of the algorithm, no major modifications of the code are required.
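The independence of the initial value problems can be illustrated with a minimal sketch. It is not the implementation discussed here: instead of PVM it uses a thread pool from the Python standard library, the integrator is a plain fourth-order Runge-Kutta scheme, and the names `rk4_integrate`, `integrate_all` and `oscillator` are hypothetical. In pure Python, threads mainly demonstrate the structure; a real speed-up would require separate processes or machines.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rk4_integrate(rhs, y0, t0, t1, steps):
    """Integrate y' = rhs(t, y) from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = rhs(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def integrate_all(rhs, initial_vectors, t0, t1, steps, max_workers=4):
    """Solve the same ODE system for several initial vectors concurrently.

    Each initial value problem is independent of the others, so every
    one is submitted as a separate task (assumption: no shared state).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(rk4_integrate, rhs, y0, t0, t1, steps)
                   for y0 in initial_vectors]
        # Collect results in the order of the initial vectors.
        return [f.result() for f in futures]


def oscillator(t, y):
    """Toy system y'' = -y written as a first-order system (illustration only)."""
    return [y[1], -y[0]]


# Two linearly independent initial vectors, integrated over a quarter period.
results = integrate_all(oscillator, [[1.0, 0.0], [0.0, 1.0]],
                        0.0, math.pi / 2, 200)
```

Because the tasks share only the read-only right-hand side, replacing the thread pool by a process pool or a message-passing layer would leave `rk4_integrate` untouched, which mirrors the remark that the modular structure requires no major code changes.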
The use of expansion techniques other than the Fourier expansion is, for the present, mainly of theoretical interest, but if an appropriate set of basis functions is found, the performance can be enhanced significantly. Three candidates seem promising. The first two are polynomial expansions based on either Chebyshev or Legendre polynomials. In contrast to the Fourier expansion, which minimizes the mean square error, the polynomial expansions obey strict error bounds, i.e., the absolute discrepancy between the exact solution and the numerical solution can be prescribed. The third candidate is a wavelet expansion. Since the Fourier basis functions are not localized in the spatial domain, the number of coefficients required to resolve abrupt changes in material properties is rather high. Wavelets are well localized in both space and spatial frequency, and thus sharp geometry steps can be represented with fewer coefficients. Hence an appropriate choice of the wavelet set can significantly reduce the number of coefficients, so that the printing of more than one feature could be simulated rigorously with the differential method.
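The localization argument can be made concrete with a small, purely illustrative sketch. It compares the number of non-vanishing coefficients of a discrete Fourier expansion and of a Haar wavelet expansion for a sampled step profile. The Haar wavelet is chosen only because it is the simplest wavelet; it is an assumption of this example and not a recommendation for the differential method. The names `haar_transform`, `dft` and `count_nonzero` as well as the profile itself are hypothetical.

```python
import cmath
import math


def haar_transform(signal):
    """Orthonormal Haar wavelet transform; length must be a power of two."""
    out = list(signal)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        # Approximation goes to the front and is refined further;
        # the detail coefficients of this level are final.
        out[:length] = approx + detail
        length = half
    return out


def dft(signal):
    """Direct O(N^2) discrete Fourier transform (adequate for small N)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * j / n)
                for j, x in enumerate(signal))
            for k in range(n)]


def count_nonzero(coefficients, tol=1e-9):
    """Number of coefficients whose magnitude exceeds the tolerance."""
    return sum(1 for c in coefficients if abs(c) > tol)


# Sampled profile with two sharp edges (a single rectangular feature).
N = 64
profile = [1.0 if 10 <= i < 31 else 0.0 for i in range(N)]

fourier_count = count_nonzero(dft(profile))
haar_count = count_nonzero(haar_transform(profile))
```

For this profile each edge contributes at most one non-vanishing Haar detail coefficient per resolution level, so `haar_count` cannot exceed 2 log2(N) + 1 = 13, whereas every one of the 64 Fourier coefficients is non-zero. Counting non-zero coefficients is a crude measure; it indicates the trend but says nothing about the accuracy of a truncated expansion in an actual simulation.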
Another point is the treatment of quasi-periodic incident light. In the presented implementation of the differential method, all source point contributions have to be periodic so that they can be treated simultaneously, i.e., so that the same ordinary differential equation system can be solved for all excitation vectors. A necessary condition for this is that the source point locations are restricted to an ortho-product-tensor-grid as shown in Figure 4.8. This requirement is sometimes too stringent, and a finer source discretization would be advantageous. Examples are illumination apertures with a small partial coherence factor. The grid then has to be subdivided, which results in quasi-periodic waves incident on the wafer. The full benefits of the proposed implementation are then lost since the system matrix of the ordinary differential equation system is no longer the same for all excitation vectors. However, similar to the periodic modes, quasi-periodic modes whose offsets differ by integer multiples of the fundamental frequencies can be grouped together. Hence, even for a finer source discretization, the computational costs grow not in proportion to the number of source points but in proportion to the number of quasi-periodic groups. This is still a considerable performance advantage of the differential method over other techniques. The required code modifications do not affect the core of the stabilized march algorithm since only the entries of the system matrix change.
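The grouping of source points can be sketched as follows. The sketch assumes that the refined source grid subdivides the fundamental-frequency grid by an integer factor in each direction, so that every source point is described by a pair of integer indices (m, n) in units of the refined spacing. Two source points then belong to the same group if their indices agree modulo the subdivision factor. The function name `group_by_offset` and the circular aperture are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def group_by_offset(source_indices, subdivision):
    """Group source points that share one system matrix.

    source_indices: iterable of integer pairs (m, n) on the refined grid.
    subdivision:    integer factor by which the fundamental-frequency
                    grid was subdivided (assumed equal in both directions).
    Points whose indices differ by multiples of `subdivision` differ by
    integer multiples of the fundamental frequency and share an offset.
    """
    groups = defaultdict(list)
    for m, n in source_indices:
        offset = (m % subdivision, n % subdivision)
        groups[offset].append((m, n))
    return dict(groups)


# Hypothetical circular aperture sampled on a grid refined by a factor of two.
subdivision = 2
radius = 4
source_indices = [(m, n)
                  for m in range(-radius, radius + 1)
                  for n in range(-radius, radius + 1)
                  if m * m + n * n <= radius * radius]

groups = group_by_offset(source_indices, subdivision)
```

In this example the number of groups is bounded by the square of the subdivision factor, independent of how many source points fall inside the aperture. One ordinary differential equation system would be set up per key of `groups`, with all source points of that group supplied as excitation vectors.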