Disseration: Efficient Flux Calculations for Topography Simulation — Review of Methods for Particle Transport and Surface Evolution

2 Review of Methods for Particle Transport and Surface Evolution

This chapter starts with an overview of some currently available three-dimensional topography/process simulators (Section 2.1). The underlying numerical methods used in these simulators are described, although the level of detail of the available information is limited for closed-source simulators. Nevertheless, recurrent components can be identified as a numerical method to solve the particle transport inside the domain and a numerical scheme to advect the surface. In Section 2.2, three numerical methods to solve the particle transport with the aim to obtain the surface rates (which enter the surface velocity model) are discussed. Section 2.3 introduces the level-set method which is predominantly used for the surface advection. Finally, Section 2.4 concludes this chapter with a brief summary.

2.1 Software and Methods

Three-dimensional topography simulation for etching and deposition processes is available via commercial, open-source, and in-house simulation frameworks. For a selection of publicly known frameworks, a brief description of the relevant methods and numerical approaches is given in the following. For closed-source simulators the brevity of the descriptions stems from the fact that little or no details of their numerical approaches are available.

Victory Process [84]: is a commercial general purpose process simulator distributed by Silvaco. It features a Process Simulator Mode to simulate three-dimensional etching and deposition processes based on physical models. A set of etching and deposition models is provided and user-defined models, which have access to a set of simulated local surface conditions, are supported. The particle transport simulation engine is parallelized via a shared-memory approach. To evolve the surface it is discretized using a Cartesian hierarchical level-set-based representation. In [11] the authors use Victory Process to model the Bosch process in a microelectromechanical systems (MEMS) application.
Sentaurus Topography 3D [83]: is a commercial profile simulator distributed by Synopsis. It features three-dimensional topography simulations and is focused on etching and deposition processes. It has a concept of user-defined models similar to Victory Process and also provides a set of predefined models for etching and deposition. The particle transport is modeled with a Monte Carlo approach. The surface representation used for the evolution is level-set-based. Parallelization on shared-memory systems is supported. In [6] the authors use Sentaurus Topography 3D to investigate the influence of the photoresist profile on the etching process.
SEmulator3D [86]: is a commercial three-dimensional process modeling platform distributed by Coventor. It uses an unspecified proprietary evolution engine to evolve the material surfaces. Similar to the above mentioned simulators, it offers a set of predefined models for etching and deposition. In [9] the authors use SEmulator3D to assess different approaches for patterning metal lines and vias.
ViennaTS [87][8]: is an open-source topography simulator developed and maintained by the Institute for Microelectronics, TU Wien. Two- and three-dimensional simulations are supported. The surface is represented as a single resolution level-set discretized on a Cartesian grid. The implementation of the level-set is based on the sparse field level-set method using a hierarchical run-length encoded data structure. The particle transport is implemented using a Monte Carlo approach and an explicit representation of the surface (partially overlapping disks). A set of advanced physical etching and deposition models are implemented; user-defined models require a recompilation. The surface evolution and the particle transport is parallelized for shared-memory systems using OpenMP. In [10] the authors simulate the Bosch process to demonstrate the capabilities of ViennaTS. Prior to ViennaTS, Topo3D[7], also a three-dimensional level-set-based topography simulator, was developed at the Institute for Microelectronics, TU Wien.

A selection of recent publications which apply three-dimensional process simulations using closed, in-house frameworks are mentioned below; depending on the publication some details are provided about the numerical approaches taken: In [12] a wet etching process is simulated based on the sparse field level-set method implementation of the ITK[88] library extended for arbitrary surface speed functions. The authors of [13] use a narrow band level-set method for the surface evolution and a Monte Carlo method to model the particle transport in a three-dimensional simulation of a deep reactive ion etching (DRIE) process; the surface is represented with small overlapping spheres during the computation of the Monte Carlo particle transport. In [21] a cell-based method seems to be used for the surface evolution (although not explicitly stated) and a Monte Carlo method is used to model the particle transport.

Most simulation frameworks described above provide a set of common parameterized models for the surface velocity but also allow for user-defined model implementations. The underlying reason for the simulation stipulates which models are appropriate, e.g., etching simulations of high aspect ratio structures rely on accurate models for re-emission and re-deposition of particles. Equation (2.1) is an example for a general formulation for the surface velocity which relates the normal surface velocity $V_n$ to

(a) the volumetric properties of the bulk material $M$
(e.g., to model the influence of stress types in the bulk material),
(b) the normal direction $\bm {n}$
(e.g., to model the influence of a crystallographic direction),
(c) the curvature $\kappa$
(e.g., to model a smoothing of the surface),
(d) and the surface rate distributions of ions $R_{ion}$ and neutral particles $R_{neu}$
(e.g., to model the etch rate in an ion-enhanced plasma etching process).

$(2.1) \{begin}{align} \label {eq:surfacevelocity} V_n(\mathbf {x}) & = f(M(\mathbf {x}), \bm {n}(\mathbf {x}), \kappa (\mathbf {x}), R_{ion}(\mathbf {x}), R_{neu}(\mathbf {x})) \{end}{align}$

The surface rate of neutral particles $R_{neu}$ is typically a scalar flux value describing how many particles are locally adsorbed per unit time. For accelerated ions the surface rate $R_{ion}$ typically does not represent a particle flux directly but the effective local sputter rate, e.g., how much of the local material is sputtered away. Models for the sputter rate typically depend on the energy and incoming angle (relative to the surface normal) of the ions.

Often only a subset of dependencies is important to be modeled. If an important dependence for the simulation is identified a subsequent question is how the relevant parameters are obtained. Typically a compromise between simulation runtime and accuracy of the models has to be made. In any of the frameworks described above the particle transport is an integral part of the advanced models for dry etching or deposition processes. The next section concentrates on the numerical methods to approximate the surface rate distributions based on the particle transport.

2.2 Feature-Scale Particle Transport

This section reviews approaches to model the particle transport towards the surface in a feature-scale simulation, eventually resulting in the surface rate distributions which enter the surface velocity model as a parameter.

Any advanced surface velocity model for dry etching or deposition processes depends on how the relevant particle species are distributed. In general, the rate of the particles on the surface (number of particles per time and per area) and their kinetic energy is important. If accelerated ions are modeled, the effect on the surface can depend significantly on the kinetic energy of the ions. For neutral particles (which are not accelerated by an electric field) the energy is less important as the energy levels at usual process temperatures are too small to have a noticeable effect on the surface. The influence of the neutral particles on the surface is mainly due to the induced chemical reactions which, in turn, are modeled using the rate of the particles.

A general formulation relating the arrival particle flux distribution $\Gamma _{in}$ to the particle flux distribution which is emitted $\Gamma _{out}$ using a re-emission function $F_{re}$ is given in (2.2). Additionally, a source term $\Gamma _{src}$ independent of the arriving particles is considered. $N_Q$ is the number of particle types and $\Gamma _{re}$ is the re-emitted particle flux distribution. The equation is defined for a point $\bm {x}$ on the surface at time $t$ , although not explicitly shown for the sake of brevity.

$(2.2) \begin{equation} \begin {split} \label {eq:rendering} \Gamma _{out}(\bm {\omega }_0,Q_0,E_0) &= \Gamma _{src}(\bm {\omega }_0,Q_0,E_0) + \\ + \sum _{q=0}^{N_Q} & \underbrace { \left [ \int \displaylimits _{\Omega } \int \displaylimits _{E} \left [ F_{re}((\bm {\omega }_0,Q_0,E_0),(\bm {\omega }_{d\Omega },Q_q,E)) \cdot \Gamma _{in}(\bm {\omega }_{d\Omega },Q_q,E) \right ] dE d\Omega \right ] }_{\textstyle \Gamma _{re}(\bm {\omega }_0,Q_0,E_0)} \end {split} \end{equation}$

Equation (2.2) states that the flux of particles of type $Q_0$ and energy $E_0$ leaving into direction $\bm {\omega }_0$ is equal to all arriving particles (integrated over all incoming directions $\Omega$ and all energy levels $E$ ) weighted with a re-emission function $F_{re}$ . Note that $F_{re}$ potentially also re-emitts particles different in energy and type from the incoming particle ( $Q_q,E$ ), e.g., if accelerated ions bombard the surface and sputter other types of particles situated on the surface.

In the field of computer graphics the rendering equation (2.3) [56] takes a form similar to (2.2) for a point $\bm {x}$ on the surface at time $t$ .

$(2.3) \begin{equation} \begin {split} \label {eq:renderingCG} L_{out}(\bm {\omega }_0,\lambda ) &= L_{src}(\bm {\omega }_0,\lambda ) + \\ &+ \int \displaylimits _{\Omega } F_{BRDF}(\bm {\omega }_0, \bm {\omega }_{d\Omega },\lambda ) \cdot L_{in}(\bm {\omega }_{d\Omega },\lambda ) (\bm {\omega }_{d\Omega } \cdot \bm {n}) d\Omega \end {split} \end{equation}$

Equation (2.3) states (for wavelength $\lambda$ ) that the radiance $L_{out}$ emitted into direction $\bm {\omega }_0$ is equal to the integral of the received radiance $L_{in}$ normalized by the projected area $(\bm {\omega }_{d\Omega } \cdot \bm {n})$ and scaled with a bidirectional reflectance distribution function $F_{BRDF}$ . The radiance is the radiant energy per unit time per unit solid angle per unit projected area.

The following assumptions are implicitly made if (2.2) is used as a general starting point to model the particle transport:

(a) The re-emission of particles is not influenced by the arriving particles, e.g., $F_{re}$ is independent of $\Gamma _{in}$ .
(b) The re-emission occurs instantaneous and at the location where the particles hit the surface, e.g., no transport on/below the surface or time delay is considered.

When taking the perspective of a point on the surface $\mathcal {S}$ , Figure 2.1 illustrates the directions for which other parts of the surface and the source plane $\mathcal {P}$ contribute to the arriving flux distribution $\Gamma _{in}$ , if

(c) particle-particle collisions (which would change the momentum of the involved particles) are neglected due to the low pressure and small dimension of the simulation domain, and
(d) charged particles (ions) are not influenced by the electromagnetic field in the domain which allows to model the trajectory of particles as straight lines.

Figure 2.1: The feature-scale simulation domain is delimited by the source plane (green) in upward direction. The lateral boundaries (red) are typically either reflective (left side) or periodic (right side). The surface of the geometry $\mathcal {S}$ is shown in blue. The different shades of gray below the surface indicate different material regions. The arriving flux $\Gamma _{in}$ is illustrated for point $\bm {x}$ on $\mathcal {S}$ . The arriving flux $\Gamma _{in}$ is divided into a direct contribution from the source (green segment) and a contribution from re-emitted particles (blue segment). An isotropic (diffuse) emission distribution $\Gamma _{src}$ is illustrated for a point $\bm {x}_{d\mathcal {P}}$ on $\mathcal {P}$ and an isotropic re-emission distribution $\Gamma _{re}$ is illustrated for a point $\bm {x}_{d\mathcal {S}}$ on $\mathcal {S}$ . The surface normal directions $\bm {n}$ and the direction vectors $\bm {\omega }$ between the points are shown.

$ \mathcal {S} $ — Figure 2.1: The feature-scale simulation domain is delimited by the source plane (green) in upward direction. The lateral boundaries (red) are typically either reflective (left side) or periodic (right side). The surface of the geometry $\mathcal {S}$ is shown in blue. The different shades of gray below the surface indicate different material regions. The arriving flux $\Gamma _{in}$ is illustrated for point $\bm {x}$ on $\mathcal {S}$ . The arriving flux $\Gamma _{in}$ is divided into a direct contribution from the source (green segment) and a contribution from re-emitted particles (blue segment). An isotropic (diffuse) emission distribution $\Gamma _{src}$ is illustrated for a point $\bm {x}_{d\mathcal {P}}$ on $\mathcal {P}$ and an isotropic re-emission distribution $\Gamma _{re}$ is illustrated for a point $\bm {x}_{d\mathcal {S}}$ on $\mathcal {S}$ . The surface normal directions $\bm {n}$ and the direction vectors $\bm {\omega }$ between the points are shown.

Surface Rate Calculation

The general formulation of the re-emission function $F_{re}$ in (2.2) allows to model numerous effects, e.g., specular-like reflections and sputtering of particles from the surface which themselves are transported in the domain. In [22] a broad overview of common multi-particle models for the surface velocity is provided. In the following only the case of

(a) a single source of a mono-energetic particle species,
(b) a directionally independent probability for adsorption $s(\bm {x})$ ,
(c) an isotropic re-emission with probability $1-s(\bm {x})$ ,
(d) and a single scalar surface rate $R$ representing the total arriving particle flux

is considered. Hence, (2.2) becomes

$(2.4) \begin{equation} \begin {split} \label {eq:renderingDIFFUSE} \Gamma _{out}(\bm {x},\bm {\omega }_0) &= \Gamma _{src}(\bm {x},\bm {\omega }_0) + \int \displaylimits _{\Omega } \left [ (1-s) \frac {1}{\pi }(\bm {n}_{\bm {x}}\cdot \bm {\omega }_0) \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) \right ] d\Omega \\ &= \Gamma _{src}(\bm {x},\bm {\omega }_0) + \underbrace { (1-s) \frac {1}{\pi }(\bm {n}_{\bm {x}}\cdot \bm {\omega }_0) \int \displaylimits _{\Omega } \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega }_{\textstyle \Gamma _{re}(\bm {x},\bm {\omega }_0)} \end {split} \end{equation}$

or reformulated when using an integral over the visible source and surface areas

$(2.5) \begin{equation} \begin {split} \label {eq:renderingDIFFUSE_2} \int \displaylimits _{\Omega } \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega &= \int \displaylimits _{\mathcal {P}_{vis}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {P}}\parallel ^2} \left [ \Gamma _{src}(\bm {x_{d\mathcal {P}}},-\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}}) \right ] d\mathcal {P} + \\ &+ \int \displaylimits _{\mathcal {S}_{vis}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {S}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {S}}\parallel ^2} \left [ \Gamma _{re}(\bm {x_{d\mathcal {S}}},-\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {S}}} \right ] d\mathcal {S} \ . \end {split} \end{equation}$

The objective is to solve (2.4) and to obtain the surface rate distributions. The surface rates $R_i$ are calculated by an integration of the arriving flux distribution $\Gamma _{in}$ weighted by a corresponding weight function $r_i$ .

$(2.6) \begin{equation} \begin {split} \label {eq:extractDIFFUSE_I} R_i(\bm {x}) = \int \displaylimits _{\Omega } r_i(\bm {\omega }_{d\Omega }) \cdot \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega \end {split} \end{equation}$

In the simplest case (which is the one considered here) only a single rate $R$ is used and $r(\bm {\omega }_{d\Omega })=1$ , which reduces (2.6) to (2.7) and corresponds to the total arriving particle flux.

$(2.7) \begin{equation} \begin {split} \label {eq:extractDIFFUSE} R(\bm {x}) = \int \displaylimits _{\Omega } \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega \end {split} \end{equation}$

A simple corresponding surface velocity model is

$(2.8) \{begin}{align} \label {eq:surfacevelocityDIFFUSE} V_n(\mathbf {x}) & = f(M(\mathbf {x}), R(\mathbf {x})) = \alpha \cdot R(\bm {x}) \cdot s(\bm {x}) \ , \{end}{align}$

where $\alpha$ is a scalar value. In the following three commonly used but conceptually different methods are described which aim to obtain the surface rates $R$ by solving (2.4) or to approximate the solution.

Straightforward Approach

The “straightforward approach” solves the particle transport equation beforehand and subsequently extracts the surface rates from the arriving particle flux distributions. The surface $\mathcal {S}$ and the source plane $\mathcal {P}$ are discretized into $N_\mathcal {S}$ and $N_\mathcal {P}$ elements respectively, e.g., triangles (three dimensions) or lines (two dimensions). The flux distributions $\Gamma$ are approximated by the superposition of a set of basis functions with corresponding coefficients. In the general case $\Gamma$ can then be approximated as

$(2.9) \begin{equation} \begin {split} \label {eq:basisfunctDISSFUSE_I} \Gamma (\bm {x},\bm {\omega }) \approx \sum \displaylimits _{i} C_i(\bm {x}) \cdot c_i(\bm {\omega }) \ , \end {split} \end{equation}$

where $C_i$ are the scalar coefficients and $c_i$ are the basis functions. If the basis functions are orthonormal the coefficients $C_i$ are defined as

$(2.10) \begin{equation} \begin {split} \label {eq:basisfunctDISSFUSE} C_i(\bm {x}) = \int \displaylimits _\Omega c_i(\bm {\omega }_{d\Omega }) \Gamma (\bm {x},\bm {\omega }_{d\Omega }) d\Omega \ . \end {split} \end{equation}$

For the simplest case, a single constant basis function is used and (2.10) reduces to

$(2.11) \begin{equation} \begin {split} \label {eq:coeffDISSFUSE} C(\bm {x}) = \int \displaylimits _\Omega \Gamma (\bm {x},\bm {\omega }_{d\Omega }) d\Omega = R(\bm {x}). \end {split} \end{equation}$

Inserting (2.11) into (2.5) and expressing $\Gamma _{re}$ according to (2.4) yields

$(2.12) \begin{equation} \begin {split} \label {eq:renderingDIFFUSE_DirectInt} C_{in}(\bm {x}) &= \int \displaylimits _{\mathcal {P}_{vis}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {P}}\parallel ^2} \left [ \Gamma _{src}(\bm {x_{d\mathcal {P}}},- \bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}}) \right ] d\mathcal {P} + \\ &+ \int \displaylimits _{\mathcal {S}_{vis}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {S}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {S}}\parallel ^2} \left [ (1-s) \frac {1}{\pi }(\bm {n}_{\bm {x}_{d\mathcal {S}}}\cdot -\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {S}}}) \cdot C_{in}(\bm {x_{d\mathcal {S}}}) \right ] d\mathcal {S} \ . \end {split} \end{equation}$

The discretization of $\mathcal {S}$ and $\mathcal {P}$ into elements with area $A$ results in

$(2.13) \begin{equation} \begin {split} \label {eq:renderingDIFFUSE_DirectInt_discrete} C_{in}(\bm {x}) &= \sum \displaylimits _{j} \int \displaylimits _{A_j} \frac {\bm {\omega }_{\bm {x}\bm {x}_{dA_j}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{dA_j}\parallel ^2} \left [ \Gamma _{src}(\bm {x}_{dA_j},- \bm {\omega }_{\bm {x}\bm {x}_{dA_j}}) \right ] dA_j + \\ &+ \sum \displaylimits _{j} \int \displaylimits _{A_j} \frac {\bm {\omega }_{\bm {x}\bm {x}_{dA_j}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{dA_j}\parallel ^2} \left [ (1-s) \frac {1}{\pi }(\bm {n}_{\bm {x}_{dA_j}}\cdot -\bm {\omega }_{\bm {x}\bm {x}_{dA_j}}) \cdot C_{in}(\bm {x_{dA_j}}) \right ] dA_j \ , \end {split} \end{equation}$

and an integration of the arriving total flux $C_{in}$ for a surface element leads to

$(2.14) \begin{equation} \begin {split} \label {eq:renderingDIFFUSE_DirectInt_discrete_radiosity} \int \displaylimits _{A_i} C_{in}(\bm {x}_{dA_i}) & dA_i = \sum \displaylimits _{j}^{N_\mathcal {P}}\int \displaylimits _{A_i} \int \displaylimits _{A_j} \frac {\bm {\omega }_{\bm {x}_{dA_i}\bm {x}_{dA_j}} \cdot \bm {n}_{\bm {x}_{dA_i}}}{\parallel \bm {x}_{dA_i} - \bm {x}_{dA_j}\parallel ^2} \left [ \Gamma _{src}(\bm {x}_{dA_j},- \bm {\omega }_{\bm {x}_{dA_j}\bm {x}_{dA_j}}) \right ] dA_j dA_i + \\ + \sum \displaylimits _{j}^{N_\mathcal {S}}\int \displaylimits _{A_i} \int \displaylimits _{A_j} & \frac {\bm {\omega }_{\bm {x}_{dA_i}\bm {x}_{dA_j}} \cdot \bm {n}_{\bm {x}_{dA_i}}}{\parallel \bm {x}_{dA_i} - \bm {x}_{dA_j}\parallel ^2} \left [ (1-s) \frac {1}{\pi }(\bm {n}_{\bm {x}_{dA_j}}\cdot -\bm {\omega }_{\bm {x}_{dA_j}\bm {x}_{dA_j}}) \cdot C_{in}(\bm {x}_{dA_j}) \right ] dA_j dA_i \ . \end {split} \end{equation}$

In this case only isotropic (diffuse) re-emission is considered. If additionally

• an isotropic source emission, e.g., $\Gamma _{src}(\bm {x},\bm {\omega }) = E_{src} \frac {1}{\pi } \bm {n_{\bm {x}}} \cdot \bm {w}$ ,
• planar surface elements, e.g., triangles, and
• a constant re-emission factor $(1-s)$ for each surface element

is assumed, (2.14) can be converted into the discrete radiosity equation (2.15).

$(2.15) \begin{equation} \begin {split} \label {eq:receiving_radiosity} A_i \cdot C_{in_i} & = \sum \displaylimits _{j}^{N_\mathcal {P}} A_j \cdot E_{src_j}\underbrace {\frac {1}{\pi A_j} \int \displaylimits _{A_i} \int \displaylimits _{A_j} \frac {cos(\theta _{idA_j}) }{r_{ij}^2} cos(\theta _{jdA_i}) dA_j dA_i}_{\textstyle F_{ji}} + \\ &+ \sum \displaylimits _{j}^{N_\mathcal {S}} (1-s) A_j \cdot C_{in_j} \underbrace {\frac {1}{\pi A_j}\int \displaylimits _{A_i} \int \displaylimits _{A_j} \frac {cos(\theta _{idA_j}) }{r_{ij}^2} cos(\theta _{jdA_i}) dA_j dA_i}_{\textstyle F_{ji}} \end {split} \end{equation}$

$A_i \cdot C_{in_i}$ is the total arriving particle flux for element $i$ , $r_{ij}$ is the distance between two points on element $i$ and element $j$ , and $\theta _{jdA_i}$ denotes the angle between the surface normal of element $i$ and the direction towards a point on element $j$ . The view factors $F_{ji}$ are identified; in the radiosity framework (which originates from the field of radiative heat transfer) a view factor is the portion of the total radiation of a surface patch $j$ which is received by another surface patch $i$ .

The radiosity formulation is the special (most simple) case of the “straightforward approach”, when only diffuse emission and re-emissions are considered. When modeling non-diffuse effects the choice of the basis functions $c_i$ must be appropriate to capture the properties of the arriving particle flux $\Gamma _{in}$ , which are of importance during the extraction of the surface rates.

A linear system of equations is constructed from (2.15) in case of a radiosity setting and from (2.14) in the general case. If $N_c$ basis functions are used, the size of the system matrix is $N_\mathcal {S}N_c \times N_\mathcal {S}N_c$ . The number of non-zero entries in the matrix is $\mathcal {O}((N_\mathcal {S} N_c )^2)$ as potentially all surface elements are mutually visible. The implementation of boundary conditions (cf. Figure 2.1) leads to an extension of the system matrix with additional entries.

For highly resolved surfaces in a three-dimensional simulation, the direct integration method requires significant memory for storing the system matrix. From an implementation point of view, the selection of appropriate basis functions and the evaluation of resulting matrix elements is problematic [22]. During the evaluation of the matrix elements visibility tests between pairs of points on the surface are used to determine mutual direct visibility which is a condition for the exchange of particles. Some iterative methods offer a natural interpretation, when used to approximate a solution to the linear system, e.g., using $\bm {0}$ as initial guess and performing $n$ iterations of the Jacobi method can be interpreted as only considering $n$ re-emissions and omitting the influence of particles after the $n$ -th re-emission; performing only one iteration corresponds to solely considering the direct particle flux from the source plane to the surface.

Bottom-Up Iterative Approach

Different from the method described above, the bottom-up iterative scheme does not generate a system matrix upfront. In the first iteration only the direct flux from the source plane $\mathcal {P}$ towards the surface $\mathcal {S}$ is considered using only the first summand of (2.4) leading to

$(2.16) \begin{equation} \begin {split} \label {eq:BU_direct} C_1(\bm {x}) = \int \displaylimits _{\Omega } \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega = \int \displaylimits _{\Omega _{vis\mathcal {P}}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {P}}\parallel ^2} \left [ \Gamma _{src}(\bm {x_{d\mathcal {P}}},-\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}}) \right ] d\Omega \ . \end {split} \end{equation}$

For a set of $N_{\mathcal {S}}$ points on the surface $\mathcal {S}$ the integral is evaluated numerically by means of a directional sampling of the sphere. Various schemes can be used to sample the spherical directions. Figure 2.2 illustrates the use of a latitude-longitude grid or geodesic grids in two dimensions.

Figure 2.2: Schematic illustration of the *bottom-up* iterative method: Two different sampling schemes for the spherical directions are shown. For point $\bm {x}_1$ the visible solid angle of the source plane is shown and together with a latitude-longitude-like scheme for the sampling directions approximated for two-dimensions. For point $\bm {x}_2$ the visible solid angle of the surface is shown together with a geodesic-like scheme for the sampling directions approximated for two dimensions.

$ \bm {x}_1 $ — Figure 2.2: Schematic illustration of the *bottom-up* iterative method: Two different sampling schemes for the spherical directions are shown. For point $\bm {x}_1$ the visible solid angle of the source plane is shown and together with a latitude-longitude-like scheme for the sampling directions approximated for two-dimensions. For point $\bm {x}_2$ the visible solid angle of the surface is shown together with a geodesic-like scheme for the sampling directions approximated for two dimensions.

Visibility tests are necessary for each sampling direction to detect if the source plane is directly visible. A discretization of $\mathcal {P}$ is not necessary if $\Gamma _{src}$ does not depend on the position on $\mathcal {P}$ . To approximate non-diffuse re-emission in subsequent iterations certain averaged properties of the arriving particle flux can be captured and stored during the numerical integration, e.g., a mean incoming direction.

In all of the subsequent iterations the numerical integration is performed not over the visible directions towards the source plane $\Omega _{vis\mathcal {P}}$ , but over the visible directions towards the surface.

$(2.17) \begin{equation} \begin {split} \label {eq:BU_second} C_2(\bm {x}) = \int \displaylimits _{\Omega } \Gamma _{in}(\bm {x},\bm {\omega }_{d\Omega }) d\Omega = \int \displaylimits _{\Omega _{vis\mathcal {S}}} \frac {\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {P}}} \cdot \bm {n_x}}{\parallel \bm {x} - \bm {x}_{d\mathcal {S}}\parallel ^2} \left [ \Gamma _{re}(\bm {x_{d\mathcal {S}}},-\bm {\omega }_{\bm {x}\bm {x}_{d\mathcal {S}}}) \right ] d\Omega \end {split} \end{equation}$

An intersection test is required to obtain the closest intersection with the surface $\mathcal {S}$ along the sampling direction.

The final surface rates are calculated using

$(2.18) \begin{equation} \begin {split} \label {eq:BU_finalsum} R(\bm {x}) = C_1(\bm {x}) + C_2(\bm {x}) + ... + C_n(\bm {x}) \ . \end {split} \end{equation}$

The computational demands in terms of storage depend on the complexity of the captured properties of the arriving flux distribution for each integration point and are $\mathcal {O}({N_{\mathcal {S}}\cdot n})$ . Depending on the number of iterations $n$ and the chosen sampling scheme the necessary visibility/intersection tests are the predominant computational task. The integration schemes are free to change in each iteration, e.g., the schemes can be adopted to known properties of $\Gamma _{src}$ in the first iteration by, e.g., using a grid refined towards a pole as shown in Figure 2.2. A boundary causes a reflection or translation of the direction of the visibility/intersection test for reflective and periodic conditions, respectively.

Top-Down Monte Carlo Approach

The top-down Monte Carlo sampling scheme launches a large number of particles $N_p$ on the source plane $\mathcal {P}$ and traces their trajectories inside the domain. The position and direction of the particles starting on $\mathcal {P}$ are generated randomly but weighted so that they collectively resemble the properties of $\Gamma _{src}$ . When a particle $p$ first hits the surface $\mathcal {S}$ at location $\bm {x}$ , it contributes with its properties (e.g., incoming direction, energy, and particle type) directly to the local surface rates. Re-emission is modeled by launching one or more new particles from the same location $\bm {x}$ , where the original particle hit the surface. Again, the re-emitted particles are generated randomly but collectively resemble the re-emission distribution $\Gamma _{re}(\bm {x})$ . Figure 2.3 illustrates the trajectories of 4 exemplary particles launched from the source plane through the domain.

Figure 2.3: Schematic illustration of the *top-down* Monte Carlo method: The trajectories of 4 exemplary particles $p_1$ to $p_4$ are shown. The effect of reflective (left) and periodic (right) boundary condi- tions is shown.

$ p_1 $ — Figure 2.3: Schematic illustration of the *top-down* Monte Carlo method: The trajectories of 4 exemplary particles $p_1$ to $p_4$ are shown. The effect of reflective (left) and periodic (right) boundary condi- tions is shown.

Similar to the methods discussed above, a maximum number of re-emissions can be defined at which the generation of new particles is stopped. Alternatively each particle has a “weight” and a threshold value determines when a particle trajectory is terminated; the weight is updated after each reflection event of the particle. Due to the stochastic nature of the approach, noise is present in the resulting surface rates. The storage requirement for this method is low as for each particle hit the contribution to the rate is directly evaluated. Complex re-emission properties, e.g., specular-like reflections, can be considered straightforwardly [22].

To obtain the closest intersection with the surface for a given particle position and direction an intersection test computationally very similar to the visibility test is required.

Efficient Visibility/Intersection Tests

All methods described above rely on visibility/intersection tests. How dominant these tests are in relation to the overall computational task depends on the method:

• In the “straightforward approach” the visibility tests are used during the construction of the entries of the system matrix. The number of tests is $\mathcal {O}(N_{\mathcal {S}} \cdot N_{\mathcal {S}})$ . For a surface discretized with 1 million elements $\approx 0.5$ trillion ( $10^6 \cdot (10^6-1) / 2$ ) visibility tests are necessary to evaluate the mutual visibility between each pair of elements. The surface normals can be used to reduce this number by first evaluating the sign of their scalar product, where only for a value $<0$ the visibility test is performed.
• In the bottom-up iterative scheme visibility tests are used in the initial iteration and intersection tests are used in all subsequent iterations. The number of tests for $n$ iterations is $\mathcal {O}(N_{\mathcal {S}} \cdot n)$ , if a fixed number of sampling directions is used. For a surface discretized with 1 million elements and considering $1000$ sampling directions the necessary number of tests is 1 billion ( $10^6 \cdot 1000$ ) per iteration.
• In the top-down Monte Carlo scheme the number of tests equals the number of particles launched from the source plane $N_p$ multiplied with the average number of re-emissions per particle $n_{re}$ . The number of tests is $\mathcal {O}(N_p \cdot n_{re})$ . For a surface discretized with 1 million elements and aiming at an average number of 1000 particles (for direct flux) reaching each element, 1 billion ( $10^6 \cdot 1000$ ) of intersection tests are necessary for the direct flux.

The considerations above show clearly that the first method is not right away suited for high-resolution three-dimensional simulations. The comparison between the other two methods reveals a similar number of necessary intersection tests per re-emission. Regardless of the approach, a computationally efficient implementation of a visibility/intersection test is desirable as a vast number of tests are used in all of the methods described above, especially if re-emissions are considered.

Ray-Surface Intersection

A visibility/intersection test is a ray-surface intersection test for a representation of the surface consisting of $N$ primitives, a starting point of the ray (origin) and a search direction. A straightforward implementation is to test each primitive for an intersection along the search direction and to track the minimum intersection distance. The result is the closest intersection point on one of the primitives. Due to its complexity $\mathcal {O}(N_{primitives} \cdot N_{tests})$ such a straightforward implementation is prohibitive considering the required number of tests mentioned above.

In the field of computer graphics, ray-surface intersection tests are used intensively for ray tracing [54] as a rendering technique. Specialized frameworks are used to perform these intersection tests (ray casts) efficiently. All have in common that initially a tree-like acceleration structure is constructed from the scene, e.g., a bounding volume hierarchy (BVH). This allows to traverse a ray more efficiently through the domain without performing intersection tests with all primitives but mostly with bounding volumes of the BVH. Typically, these frameworks are optimized to perform tasks occurring in rendering situations, e.g., to traverse a bundle of spatially coherent rays (i.e., rays with similar direction) together through the scene. The applied performance metric is million ray casts per second (Mrays/s). Non-coherent ray casting, as required for the visibility/intersection tests during the particle transport, is typically also implemented very efficiently, although the maximum Mray/s scores are reached for ray bundles.

The following selection of frameworks perform accelerated ray casting using explicit surfaces, e.g., triangle meshes:

Embree: [55][89] is an open-source collection of C++ ray tracing kernels developed and maintained by Intel. The kernels are optimized by making use of SIMD (single instruction, multiple data) instruction sets available on modern central processing units (CPUs). It includes optimized algorithms for coherent and incoherent ray workloads. All kernels are solely implemented using a single-precision floating point representations.
NanoRT: [92] is an open-source C++ header only CPU ray tracing kernel. It supports ray tracing using single- and double-precision floating point representation.
Optix Prime: [90] is a closed-source C++ ray tracing library distributed by Nvidia. It is optimized for ray-triangle intersections on graphics processing units (GPUs) and solely uses single-precision floating point representation. It includes fallback implementations for CPUs, if no supported GPU is found on the system. It depends on CUDA [91].
pbrt-v3: [93] is an open-source C++ ray tracing renderer, accompanying the book Physically Based Rendering: From Theory to Implementation [56]. It supports ray tracing using single- and double-precision floating point representation.
RadeonRays: [94] is an open-source C++ ray intersection library maintained by AMD. It is targeted at CPUs and GPUs. The ray tracing is implemented using the OpenCL 1.2 standard.

All of the above mentioned frameworks support triangular meshes. The construction of an acceleration structure (i.e., an optimized construction of a BVH and a data structure to store and access the BVH) is provided by all frameworks.

In volume rendering application, rays are casted into a scene consisting of implicit surfaces, e.g., signed-distance functions. Similar to ray casting for explicit surfaces, an acceleration structure, e.g., a BVH, is used for efficient traversal of the ray. The actual intersection with the surface is obtained by marching the ray back and forth, until an intersection (i.e., change of sign) is detected. Sphere tracing [57] is a ray marching algorithm using the signed-distance information to determine the next step length along the search direction. Concerning data structures in this context, there is only one framework which stands out:

OpenVDB: [95] is an open-source C++ project developed and maintained by DreamWorks Animation. It provides a data structure for sparse volume data bundled with a large toolset to operate on the sparse volume data. The data structure is a tree-like structure of 4 levels with non-equal branching factors. It implements ray marching using the sparse volume data structure directly without constructing a BVH [58].

2.3 Surface Evolution Using the Level-Set Method

The normal surface velocity $V_n$ is prescribed by the surface velocity model (2.1) and is available for a set of points on the surface. The objective is to evolve the material surfaces according to $V_n$ . As the evolution is unconstrained the surfaces may undergo topological changes, e.g.,

• a material surface vanishes completely, or
• a portion of a surface is pinched off, or
• two surfaces merge into one larger surface.

The majority of the simulators presented in Section 2.1 relies on a level-set-based method to evolve the surface as it offers robust treatment of the topological changes mentioned above. Before more details on the level-set method are presented, the three main groups of techniques used to model surface evolution are briefly explained and advantages and disadvantages are discussed.

Explicit Representation.: The surface is discretized as an explicit set of nodes with connectivity information, e.g., a triangle mesh. To evolve the surface each node is advanced according to the normal velocity. Initially equidistantly spaced nodes can accumulate or rarefy depending on the curvature and surface velocity. To ensure a maximal resolution (maximal distance between nodes) and to avoid numerical instabilities in accumulated regions, nodes must be redistributed; if the surface area is increasing, new nodes must be inserted and connected. An unconstrained advancement of the nodes leads to a self-intersection of the elements. These self-intersections must be healed/resolved from time to time to ensure a valid surface representation. In the case of a topological change, a change of the connectivity between nodes is required, which is not easily implemented robustly. In general, there is no restriction for the time step size but larger time steps (i.e., larger deformations) make it more difficult to find robust solutions for the redistribution, intersection-healing, and connectivity adjustments mentioned above. However, sharp features of the surface are preserved well. In [23] a method for tracking fluid surfaces using a triangle mesh is presented. In [27] and [28] algorithms for adaptive restructuring of meshes on evolving surfaces are presented.
Cellular (Voxel) Representation.: The surface is represented by a Cartesian grid of cubic cells. Numerical values (typically a scalar value in $[0,1]$ ) are assigned to each cell identifying the cell as interior, boundary, or exterior cell. The surface is implicitly represented through the boundary cells. The numerical values are updated using local update rules incorporating the values of neighboring cells and the surface velocity. The maximum time step is enforced by the update rules. Topological changes are handled robustly. Features below the resolution of the cubic cells cannot be represented. In [30] a cellular material representation is used to evolve the surface in a plasma etching simulation. The determination of the surface normal requires averaging across several cells [30]. In [29] a computationally efficient three-dimensional cellular automata model is used to model the surface evolution in an etching simulation. SEmulator3D [86] uses a voxel modeling approach for geometric processing steps.
Level-Set-Based Representation.: The surface is represented implicitly by the zero-level-set of a scalar field. The zero-level-set is the zero-isoline of a two-dimensional scalar field or the zero-isosurface for a three-dimensional scalar field. An evolution of the implicit surface is therefore equivalent to the advection of the scalar field. The equation for this advection is the level-set equation. The numerical solution is typically performed using a Cartesian grid combined with appropriate finite difference schemes for the spatial and temporal derivatives. The time step is limited by the temporal discretization scheme. Topological changes are inherently supported. In [31] the level-set method is mathematically derived, related methods are covered and numerous application examples are shown. The method is continuously being improved, e.g., to improve mass conservation and unintended smoothing of the surface [32][33][24][34]. The surface evolution in Victory Process [84], ViennaTS [87], and Sentuarus3D [83] is based on the level-set method.

An overview of surface tracking approaches (including the ones mentioned above) in the field of fluid dynamics is provided in [26].

In the following, details on the level-set method including velocity extension, normalization, discretization, surface extraction, and data structures are provided.

The Level-Set Method

The level-set method takes the approach that the surface/interface to be tracked is not explicitly represented. In the following a simple example to illustrate the method is considered: A spherical surface shrinking with unit speed.

The surface of a sphere of radius $r=1$ and center at $c=[0,0,0]$ is represented with a three-dimensional scalar level-set function $\phi$ , where the values are initialized as $\phi (\bm {x}) = \parallel \bm {x}\parallel -1$ . Using this initialization $\phi =0$ on the surface, $\phi <0$ inside, and $\phi >0$ outside the sphere. If a vector velocity field is defined as $\bm {V}(\bm {x}) = \frac {-\bm {x}}{\parallel \bm {x}\parallel }$ , i.e., an inward pointing unit speed, and the level-set function is advected using

$(2.19) \{begin}{align} \label {eq:levelset} \frac {\partial \phi (\mathbf {x})}{\partial t} & = - \underbrace {\bm {V}(\bm {x}) \cdot \frac { \nabla \phi (\mathbf {x}) }{ \parallel \nabla \phi (\bm {x})\parallel }}_{\textstyle V_{n_{\nabla \phi }}(\nabla \phi ,\bm {x})} \parallel \nabla \phi (\mathbf {x})\parallel \ , \{end}{align}$

the implicit surface (represented by the zero-isosurface of $\phi$ ) is a sphere shrinking with unit speed. Equation (2.19) relates the change of $\phi$ over time to the norm of the gradient $\parallel \nabla \phi \parallel$ weighted with a speed function $\bm {V}$ projected into the direction of the gradient. For time $t>1.0$ the zero-isosurface of $\phi$ vanishes (and so does the tracked interface) as all values of $\phi$ are now larger than zero. Renaming the velocity into the direction of the gradient $V_{n_{\nabla \phi }}(\nabla \phi ,\bm {x})$ to $F(\nabla \phi ,\bm {x})$ (2.19) can be rewritten as

$(2.20) \{begin}{align} \label {eq:levelsetDEFAULT} \frac {\partial \phi (\mathbf {x})}{\partial t} & + F(\nabla \phi ,\bm {x}) \parallel \nabla \phi (\mathbf {x})\parallel = 0 \ , \{end}{align}$

which is the standard form of the level-set equation. The dependencies of $F$ determine if (2.20) is linear or non-linear:

• (2.20) is linear if $F$ is independent of $\phi$ and its derivatives and only depends on the location $\bm {x}$ , e.g., the linear transport equation $\phi _t + F \phi _x = 0$ (for one dimension).
• (2.20) becomes non-linear if $F$ is dependent on $\phi$ or $\nabla \phi$ , e.g., $F=\phi$ leading to $\phi _t + \phi \phi _x = 0$ (i.e., the inviscid Burgers’ equation for one dimension [82]).

In the case of a process simulation, a meaningful velocity is only available for the points on the surface, but a velocity field $F$ is required for the advection of the level-set function $\phi$ . The next section describes how a suitable velocity field is obtained from the surface velocities using extrapolation.

Velocity Extension

In a process simulation, the velocity field has no physical meaning as velocities are only meaningful on the surface itself. There are generally no requirements for the velocity field $F$ besides it must agree with the surface velocity at the zero-level-set which corresponds to the interface. The most straightforward approach is to extrapolate from the closest point on the surface $\mathcal {S}$ [35]: For a point $\bm {x}$ away from the surface the velocity from the closest point on the surface $\bm {x}_{cp} \ \in \ \mathcal {S}$ is used.

Depending on the data structure, the identification of the closest point on the surface can be a demanding task. In [31] an alternative approach for obtaining the velocity field is described, where the velocity field is “marched away” from the surface with the fast marching method [36] to solve the static boundary value problem

$(2.21) \{begin}{align} \label {eq:eikonal} \parallel \nabla \phi (\bm {x}) \parallel F(\bm {x}) & = 1,\ \ \phi (\bm {x}) = 0 \ \text {on} \ \Gamma \ , \{end}{align}$

where $\Gamma$ is the interface and $F$ is a strictly positive speed function only depending on the location $\bm {x}$ .

A consequence of how the velocity is extended is how well the level-set function retains a signed-distance function over time. While this property is generally not maintained using the approach in [35], the alternative approach [31] analytically maintains the signed-distance property but discretization errors in $\phi$ and its derivatives will gradually destroy this property over time. The next section describes how the signed-distance property of the level-set function can be sustained.

Sustaining a Signed-Distance Property

In the example of the spherical surface in Section 2.3 the level-set function $\phi$ has been initialized to a signed-distance function. That is, the magnitude $|\phi |$ encodes the distance to the zero-level-set and the sign encodes the inside ( $-$ ) and outside ( $+$ ) regions. The signed-distance property is not maintained as $\phi$ evolves in time due to the construction of the extended velocity field itself or due to discretization errors. It is desirable to maintain a signed-distance property of $\phi$ :

• If $\phi$ is very “flat” or very “steep” the numerical determination of the zero-level-set locations is less accurate.
• If $|\nabla \phi |$ varies the accuracy of numerical derivatives suffers [19]; this includes the accuracy of the curvature and the surface normal.
• The closest point on the surface can be approximated by a scaled step along the gradient: $\bm {x}_{cp} = \bm {x} - \phi (\bm {x})\nabla \phi (\bm {x})$ .
• The level-set function qualifies to robustly apply constructive solid geometry (CSG) operations and geometrical “offset” operations [60].

For the reasons given above most level-set applications perform a process of “reshaping” the level-set function in regular intervals with the aim to restore the signed-distance property. This process if often called reinitialization, redistancing, or normalization. Related methods can be categorized into two distinct groups:

Boundary Value Problem: -based methods solve (2.21). Examples are the fast marching method [36], the fast sweeping method [37], and the fast iterative method [60]. Solutions to the boundary value problem can be generated with near optimal complexity. The solution near the interface is used as initial condition and is not varied, which is a desirable property. On the other hand, deviations from the signed-distance property near the interface are not corrected.
Flow: -based methods solve an equation of the form (2.22). The flow defined by (2.22) has normalizing properties, e.g., it vanishes when the signed-distance property is obtained. The solutions near the interface are also influenced by these methods with the benefit that deviations from the signed-distance function near the interface are corrected. A consequence is that the interface is also slightly advected by the flow.

$(2.22) \begin{equation} \begin {split} \label {eq:flownormal} \frac {\partial \phi (\mathbf {x})}{\partial t} & + \sigma (\bm {x}) \parallel \nabla \phi (\mathbf {x}) - 1\parallel = 0 \\ \sigma (\bm {x}) & = \text {sgn}(\phi (\bm {x})) = \frac {\phi (\bm {x})}{\sqrt {\phi (\bm {x})^2 + \epsilon }} \end {split} \end{equation}$

A discussion of advantages and disadvantages for boundary value problem-based or flow-based methods for normalization can be found in [19].

Discretization

The level-set equation (2.20) belongs to the class of Hamilton-Jacobi equations of the form

$(2.23) \begin{equation} \begin {split} \label {eq:HJequation} \frac {\partial \phi (\mathbf {x})}{\partial t} + & H(\bm {x},t,\phi ,\nabla \phi ) = 0, \quad \text {with the Hamiltonian} \\ &H(\bm {x},t,\phi ,\nabla \phi ) = F(\nabla \phi , \bm {x}) \parallel \nabla \phi (\mathbf {x})\parallel . \end {split} \end{equation}$

On uniform Cartesian grids a set of discretization schemes has established itself as a widely used default choice [14].

Spatial Derivatives

The weighted ENO (essentially non-oscillatory) scheme [40][18], in short WENO, is used for the approximation of the first derivatives of $\phi$ . The WENO scheme is based on the ENO scheme [39], which adaptively selects a finite difference stencil (out of a fixed set of stencils) for the approximation: By calculating the smoothness of $\phi$ inside each stencil’s interval, the selected stencil is the one the interval of which does least overlap with a discontinuity of $\phi$ . The WENO scheme does not solely use one stencil but instead uses a weighted convex combination of all stencils, where the weights depend on a smoothness measure of $\phi$ in the stencil’s interval.

Hamiltonian Discretization

The level-set equation (2.20) can be related to a scalar conservation law. Starting with the one-dimensional form of (2.20)

$(2.24) \begin{equation} \begin {split} \label {eq:levelset1D} \phi _t + H(\phi _x) = 0 \ , \end {split} \end{equation}$

and $\phi _x = p$ , and differentiating results in

$(2.25) \begin{equation} \begin {split} \label {eq:scalarconserv} \left [\left [\int p dx\right ]_{t}\right ]_x + \left [H(p)\right ]_x = 0 \end {split} \end{equation}$

which can be simplified to the conservation law

$(2.26) \begin{equation} \begin {split} \label {eq:scalarconserv2} p_t + [H(p)]_x = 0. \end {split} \end{equation}$

If $H(p) = \alpha p$ , this gives the linear transport equation $p_t + \alpha p_x = 0$ and upwind finite difference schemes, which take the derivatives in the direction of information propagation, can be used, as the upwind direction is known beforehand. For a nonlinear conservation law, the transport depends on the solution itself and the information propagation direction cannot be determined a priori. Furthermore, finite difference schemes are not well suited for scalar conservation laws as solutions are not necessarily smooth. The integral form of the conservation law

$(2.27) \begin{equation} \begin {split} \label {eq:scalarconserv3} \left [\int _a^b p dx \right ]_t = - \int _a^b [H(p)]_x dx = - \left [H(p(b)) - H(p(a))\right ] \end {split} \end{equation}$

is therefore the starting point for the numerical schemes for scalar conservation laws. Using a finite volume approach, the cell averages at the next time step are obtained by integrating (2.27) over the domain $[x_{i-1/2}, x_{i+1/2}] \times [t^n, t^{n+1}]$ leading to

$(2.28) \begin{equation} \begin {split} \label {eq:scalarconserv4} (p_i^{n+1} - p_i^{n} ) \Delta x = - \Delta t ( \bar {G}_{i+1/2}^n - \bar {G}_{i-1/2}^n) \ , \end {split} \end{equation}$

where $\bar {G}$ are the fluxes at the cell boundaries

$(2.29) \begin{equation} \begin {split} \label {eq:scalarconserv5} \bar {G}_{i+1/2}^n = \frac {1}{\Delta t} \int _{t^n}^{t^{n+1}} H(p(x_{i+1/2},t)) \ dt \ , \end {split} \end{equation}$

which are not known a priori. The Godunov method [14][17][41] provides an a priori explicit approximation to $\bar {G} = G$ by identifying a Riemann problem for each cell interface.

Different schemes exist [46] to approximate a solution to the Riemann problems, e.g., the Lax-Friedrich scheme, the Engquist-Osher scheme, and the Godunov scheme (not to be confuse with the Godunov method above). Each of the schemes produces an approximation $G(p_{i+1/2}) \approx g(p_{i},p_{i+1})$ which results in

$(2.30) \begin{equation} \begin {split} \label {eq:scalarconserv6} \frac {p_i^{n+1} - p_i^{n}}{\Delta t} = - \frac { g(p_{i}^n,p_{i+1^n}) - g(p_{i-1}^n,p_{i}^n)}{\Delta x} \ . \end {split} \end{equation}$

The approximation schemes for $H$ in (2.24) rely on the numerical fluxes $g(p,p)$ . Using $\phi _x = p$ equation (2.24) can be rewritten as

$(2.31) \begin{equation} \begin {split} \label {eq:levelset1D2} \phi _t = - H(\phi _x) = - H(p) \ , \end {split} \end{equation}$

and an approximation of $H(p)$ is given by the numerical flux function $g(p,p)$ . This leads to

$(2.32) \begin{equation} \begin {split} \label {eq:hamiltonflux} H(p_i^n) \approx g(p_{i-1/2},p_{i+1/2}) \ , \end {split} \end{equation}$

and finally (for an explicit Euler step) to

$(2.33) \begin{equation} \begin {split} \label {eq:levelset1D3} \frac {\phi ^{t+1}_i - \phi ^{t}_i}{\Delta t} = - g(p_{i-1/2},p_{i+1/2}). \end {split} \end{equation}$

Temporal Discretization

Total Variation Diminishing Runge-Kutta (TVD-RK) schemes [43][42] are typically used to discretize (2.20) in time [14]. The first-order accurate scheme (TVD-RK1) corresponds to a single explicit Euler step, i.e., for one dimension

$(2.34) \begin{equation} \begin {split} \label {eq:tvdrk1} \frac {\phi ^{n+1}_{i} - \phi ^{n}_{i}}{\Delta t} + g^{GODUNOV}(\phi _{x_{i-1/2}}^{WENO-}, \phi _{x_{i+1/2}}^{WENO+}) = 0 \ , \end {split} \end{equation}$

where the Godunov scheme is used for the numerical flux and the WENO scheme for the derivatives of $\phi$ at $i-1/2$ and $i+1/2$ . Second (TVD-RK2) and third order (TVD-RK3) schemes combine multiple temporary Euler steps to obtain higher order accuracy.

The size for the discrete time step is limited by the assumptions of the spatial discretization scheme and the explicit time integration scheme. To obtain a stable scheme, the maximum time step is

$(2.35) \{begin}{align} \label {eq:cflsimple} \Delta t < & \frac {\Delta h }{|F_{max}|} \cdot \alpha _{CFL} \ , \{end}{align}$

where $\Delta h$ is the spatial grid spacing, $F_{max}$ is the maximum value in the velocity field, and $\alpha _{CFL} \in \ ]0,1]$ is a positive constant depending on the discretization schemes. This condition is known as the Courant-Friedrichs-Lewy (CFL) condition [47].

Data Structures

The straightforward implementation of the level-set method using a multi-dimensional dense array of values leads to storage complexity $\mathcal {O}(n^d)$ where $n$ is the average spatial resolution and $d$ is the number of spatial dimensions. For high-resolution three-dimensional simulations a dense array is not practical, e.g., a domain of $1000^3$ requires more than $7$ GB of memory, if double-precision floating values are stored. The memory footprint can be reduced by using single-precision or even less accurate floating point representations.

To overcome the large memory footprint an attractive approach is to only store the level-set function values near the surface. This is feasible as it is sufficient to only solve the level-set equation inside this narrow band around the surface which solely defines the position of the implicit surface [48][52]. Data structures can be distinguished into two main groups:

• Tree-based approaches use a tree of nested cubes, where the leaf nodes store the level-set function values [49][51]. The non-leaf nodes contain either only the sign or level-set values at a reduced resolution.
• Run-length encoded approaches compress regions away from the narrow band by just storing a sign denoting inside or outside [22][50][53].

General requirements for the data structures are an efficient mechanism to adopt the narrow band after the surface has evolved, and fast access methods for sequential and random access.

Surface Extraction

An explicit representation of the surface is obtained by performing an isosurface extraction. The extraction algorithms can be categorized into the following groups [44]:

Primal methods produce a polygonal mesh by connecting the intersection points of the isosurface with a structured grid. The marching cubes algorithm is robust, straightforward to implement, and is the standard algorithm for surface extraction from a signed-distance field. Variations of the original algorithm exist, e.g., to improve mesh quality and unwanted smoothing of sharp features [44][25].

Dual methods identify a representative point for each cell of the structured grid and connect these points to form a polygonal mesh. The dual marching cubes [45] is an example for a dual extraction method. In [61], an algorithm for dual contouring is introduced and compared to primal methods.

2.4 Summary

Most process simulators rely on the level-set method to evolve the surface due to its inherent support for arbitrary surface evolutions. An implicit representation of the surface (level-set function) is maintained throughout the simulation. For efficiency reasons, the level-set function is stored using sparse volume data structures which only store the values inside the narrow band around the zero-level-set. The fact that it is sufficient to solve the level-set equation only in the vicinity of the zero-level-set makes this memory efficient approach possible.

The particle transport is an integral part of any advanced simulation of etching or deposition processes, as the surface rates (entering the surface velocity model) depend on the particle distributions inside the domain. A vast number of visibility/intersection tests are a major computational task in all presented methods to model the particle transport. The particle transport is calculated for every time step of a simulation.

The visibility/intersection tests can either be performed using an implicit representation (level-set function) or using an explicit representation of the surface. The explicit representation is created from the level-set function using an extraction algorithm.

Visibility/intersection tests are ray-surface intersection tests (ray casting). Ray tracing is a rendering technique used in the field of computer graphics, which relies on casting (light) rays into the scene. Numerous optimized libraries implementing efficient ray casting are available, originally targeted for rendering applications. Most of those libraries perform all computations using single-precision floating point operations.

Starting out with the approaches for the particle transport described above, the aim of this work is to reduce the computational workload contributed by the particle transport calculation during a process simulation. The next chapter introduces the simulation framework which is used to implement and validate the methods presented in the remainder of this work.