6.2 Data Types

While it may be possible to derive a certain amount of pleasure from the mere storage of items in various memory locations, it falls far short from the intents and purposes of digital computers, which rather boast the capability of performing operations on the data items at a speed far surpassing the capabilities of average humans. Algebraic operations as outlined in Section 4.2 are among the most important operations to be carried out by a machine. In its most crude form digital computers represent data items as a binary pattern within their memory. This pattern is in essence subject to interpretation to determine not only the value but also the operations available. Both of these two notions are intimately related and are therefore formally combined in the concept of data types. Usually several simple data types are inherently defined as part of programming languages, such as for integers and rational numbers. Higher level programming languages, especially those considered to support the object-oriented paradigm, allow the definition of custom data types and the specification of available operations for these newly defined types. Several programming languages furthermore allow to overload operators for these custom types, thus allowing the expression of semantics in a fashion comparable to the built in types.

However, the limited, finite nature of the used machines results in restrictions to the data types which can be implemented. It especially restricts the modelling of mathematical structures, such as the real numbers ℝ . Thus, while implemented data types may approximate these algebraic structures, they cannot mimic these structures in their entirety. As a consequence it is of considerable importance to be aware of the limitations of the used data types and the errors incurred due to the approximation.

An example of the separation of the algebraic properties from basic numerical data types can be given using the notion of polynomials (Definition 19). The definitions specifies that the coefficients should be drawn from a ring (Definition 13) but does not further specify the nature of the variable x . Similarly the implementation can focus on separating the type of the variable x from the type of the coefficients, since the the operators for addition (subtraction) and multiplication can be implemented completely irrespective of x ; it is only necessary to derive formal powers. Any operations regarding the polynomials need to deal with an important decision with respect to the representation and storage of the coefficients on a digital computer. A very simple manner to store coefficients of polynomials is to store them in an array, or a similarly indexable container. Such that a polynomial of the form

         5     2     1       0
p(x ) = x + 7x  + 2x  +  1.3x                          (6.1)

can be associated with an expression of the form


: Simple storage of a polynomial’s coefficients.
  std::vector<double> poly; 
  poly[0] = 1.3; 
  poly[1] = 2; 
  poly[2] = 7; 
  poly[3] = 0; 
  poly[4] = 0; 
  poly[5] = 1;

This easy method of storage, which is intuitively similar to the ubiquitous positional notation of numbers, unfortunately requires the storage of all the coefficients between the minimal and maximal power within the polynomial, since the power the coefficient is associated to is stored implicitly by the position within the container. Such a method of storage shall therefore be referred to as implicit. The efficiency of implicit storage depends on the intended field of application, which dictates the number of 0  coefficients, which still need to be stored. On the other hand memory can be allocated in as a contiguous block, wherefore it is also known as dense.

The limitations due to inefficiency of implicit representation increases as the number of zero coefficients increases. While, in a mathematical setting it is easy to deal with an infinite number of coefficients, especially, if only a finite number is non-zero. The polynomial of Equation 6.1 can also be represented in the following, completely equivalent, manner.

   c = 0, ∀i ∈ ℕ, i ⁄= 0,1,2,5                       (6.2a)
    i
c0 = 1.3,c1 = 2,c2 = 7,c5 = 1                       (6.2b)

The above notation, however, provides a model for a different approach to the storage of coefficients, in the following referred to as explicit or sparse storage. The name is due to the fact that the power to which a coefficient belongs to needs to be specified explicitly. Using an STL container this can be accomplished using


: Sparse/Explicit storage of coefficients.
std::map<longdouble> poly; 
poly[0] = 1.3; 
poly[1] = 2; 
poly[2] = 7; 
poly[5] = 1;

where in contrast the dense case all unspecified values simply do not exist, which makes it easier to deal with polynomials where many of the coefficients are zero, without incurring memory overhead, at the cost of requiring more intricate memory access mechanisms.

By using the GSSE, the different STL data structure definitions are unified into a common data specification compile time program. The next code snippet presents the equivalent dense and sparse container definitions.


: Example topology data structures.
typedef gsse::map< 
        gsse::pair<tag_dimension, mpl::int_<0> > 
      , gsse::pair<tag_storage,   double > 
    > container_dense_t; 
 
typedef gsse::map< 
        gsse::pair<tag_dimension, mpl::int_<0> > 
      , gsse::pair<tag_storage,   double > 
      , gsse::pair<tag_index,     long> 
    > container_sparse_t;

The examples given so far have made use of data types from the STL and GSSE. While these containers are quite capable of storing the coefficients, they are not equipped with operators to deal with the operations of addition and multiplication of polynomials associated with the stored values. This can be remedied by directly implementing operators for the used containers, e.g., the std::vector or the std::map. While this is definitely a viable option to model the algebraic structure of polynomials, it is not only completely lacking elegance, but also incurs serious problems. Such an approach has the profound side effect that now every container of the chosen type could be treated as a polynomial with respect to addition and multiplication, even if this is by no means intended. The situation then escalates, when different algebraic entities also utilize the same type of container, but with different algebraic structures which should, however, be represented using syntactically and formally identical operators, such as multiplication for vectors (Definition 16).

It is therefore prudent to implement custom data types, for which the specific operations are defined, to avoid these grave repercussions. The custom data type may contain one of the previously described containers for dense or sparse storage of coefficients, however, as an implementation detail, which is encapsulated and thus can be easily exchanged and adapted as the situation requires without adversely affecting the interface. Proceeding in this manner is consistent with the object-oriented paradigm as described in Section 3.1.2.

A generic library for the handling of polynomials using run time as well as compile time mechanisms [104] is available, which defines not only the basic algebraic structure of polynomials, but also provides facilities for integration and differentiation polynomials in a flexible and efficient manner. As a compile time example the derivative of a second-degree polynomial is calculated:

     2
d(10x--+-4.5x-+-3)-= 20x + 4.5
        dx
The type list represents the type of each coefficient starting from the zero to the second degree coefficient.
typedef gsse::vector_ct <double
                         double
                         int> coeffs; 
 
poly   p(x, coeffs(3.0, 4.5, 10)); 
diffed d = diff(p, x);

The combination of providing custom data types along with the ability to overload operators enable the realization of specialized domain specific embedded languages (DSELs), which can greatly increase the level of abstraction as well as ease of notation, but does not impede the run time performance. By evaluating the resulting assembler code of this compile time program, the binary only contains the final coefficients 4.5, 20.