An *oct-tree* is a data structure to perform three-dimensional *point location**s* and range
searches. In a finite *oct-tree* the geometrical data are presorted in a way that
allows efficient element searches. The space containing a grid is recursively
decomposed into sub-spaces. A sub-space can either be a node or a terminal
leaf. A node is an intermediate sub-space that does not contain any elements
itself but serves as a container for other sub-nodes or sub-leafs. A terminal
leaf represents a situation where a further split would not reduce the number of
elements in at least one of the resulting sub-leafs (Fig. 3.12).

The effort to search for an element in an *oct-tree* is of order
which gives the best possible performance among the two *point location* methods
presented in this thesis. This nearly optimal search performance is due to the
drastic recursive decomposition of space in sub-spaces respectively. For
homogeneous grid density, every search step reduces the remaining elements to
search by a factor of .

The major disadvantages of a finite *oct-tree* are time consuming
pre-processing and memory overheads. The pre-processing time is related to the
algorithm that inserts elements into the *oct-tree*. For every grid element it must
be decided to which (already existing) sub-node it belongs. This results in a
large number of geometrical predicates that have to be computed until the final
sub-space, the so-called terminal leaf, is found. Furthermore, if a grid element
collides with a terminal leaf, this leaf must be replaced by a node and all
elements of that leaf have to be re-distributed over the newly created
sub-spaces. Depending on how the elements are shaped and how they fit into
already existing leafs the complexity of such a test can vary significantly. In
case one of the points is contained within a leaf the test is reduced to a
simple point compare operation. In the worst case the orientation of several
tetrahedrons has to be computed. Every inserted element traverses all nodes
until it reaches the terminal leaf. On every node level all sub-spaces that
overlap the element must be computed. For each overlapping sub-space the
algorithm recurses into the node or leaf associated to the sub-space. For a
homogeneous grid the majority of grid elements will only overlap one sub-node in
any given level. Thus the amount of overlap decisions per element can be
estimated to where is the depth of the *oct-tree*.

Obviously the memory overhead results from the nodes and terminal
leafs that hold the data. For an average depth of the *oct-tree* the number of
nodes and leafs results to:

To give an example lets compute the memory consumption including the overhead to
store a number of
homogeneously distributed tetrahedrons on a
bit machine. The average depth is given as
. According to (3.1) the number of nodes and leafs computes to
. This is of the same magnitude as the number of elements
that are stored. A tetrahedron as it is stored in the *oct-tree* consists of
point references (handles, c. f. Chapter 3.7.1), thus the per
tetrahedron memory consumption is bytes. If we estimate an average of
tetrahedrons that share a point we have a total of
points
in our example. A point coordinate is stored as an IEEE754 double precision
floating point number that takes bytes. The memory needed to store the data
is then
megabytes for the tetrahedrons and
megabytes for the points, which gives a total of
megabytes.

A node consists of references to sub-nodes or leafs and two
references to points, which amounts to bytes of memory. The last of the
*layers* in the *oct-tree* structure always consists of leafs, thus the total
number of nodes in our example is
which gives a memory
consumption of
megabytes. For the remaining leafs (
) the memory consumption depends on the type of the leaf. Every leaf stores
two point references. Additionally, the smallest leaf stores a reference to one
element, the largest holds an arbitrary number of references. To make a guess
we assume that the medium number of stored element references is three. The
average memory consumption per leaf is then
bytes and the total
amount of memory for the leafs is
megabytes. The total of the
introduced memory overhead in this example amounts to megabytes. If we
relate the memory for the data to the introduced overhead it gives
. The number of overlap tests is also quite impressive, it can be
estimated to be in the order of
.

A finite *oct-tree* that implements the decomposition algorithm was
developed at the Institute for Microelectronics [28]. The methods used to determine whether an
element overlaps a leaf are encapsulated in an interface. This allows arbitrary
elements to be governed by the *oct-tree* as long as they implement this
interface. Fig. 3.13 and Fig. 3.14 depict a small mesh and the
resulting leaf structure in the finite *oct-tree*.

There is also a parallelized version of this algorithm [47,48] available that distributes sub-trees over various networked computers to utilize several CPUs. For communication between the computers the high-level protocol CORBA [49] was used. CORBA is a standard that allows to call methods of remote objects. The location of the object is thereby transparent to the client. The connection between client and remote (server) object is established via an object reference. The references of objects are managed by the CORBA nameserver. The client contacts the nameserver upon startup and requests a reference to a certain object. Such a high-level protocol imposes a considerable overhead compared to a local function call. The overhead is observed as a time delay or as a maximum number of method invocations per second. The delay is closely related to speed and workload of the underlying network infrastructure.

The efficiency of the parallelization is determined by the time
delay and the remote CPU time that is consumed by the remote method. If the
relation between delay and CPU time is low, or -- in other words -- if a
complex remote computation takes place, then the overhead can be neglected. This
is the case for the fairly time consuming *oct-tree* insert method. During the
implementation it turned out, however, that the serialization and dynamic
instantiation of objects that is necessary to transfer an object (like, e.g. a
point) over the network is not well supported in the target language (C++)
and imposes an extra programming overhead for each element type. This is in the
contrary to the design of the single-threaded *oct-tree* which is capable of storing
arbitrary elements without the need to recompile the *oct-tree* source code.

Future developments will focus on a multi-threaded version of the *oct-tree* where
only CPUs on one machine can be utilized.

The decision whether an application uses the *oct-tree* or rather
the *jump-and-walk* algorithm presented in the next section strongly depends on the
amount of *point location**s* that will be performed during the
simulation. Fig. 3.13 and Fig. 3.14 show a grid of a cuboidal
region and the resulting sub-division into nodes and leafs.

2003-03-27