8.5 Schemata and the Schema Theorem

The notion of schemata and the Schema theorem provide the theoretical background why genetic algorithms work so well in practice. The notion of schemata is fundamental to the following analysis. After introducing it the Schema theorem will be deduced.

The theoretical background of genetic algorithms is seen most clearly using the binary string representation and the classic algorithm (cf. Section 8.4.2). First the wildcard shall match any single component of a vector. A schema is built of the alphabet of genes (the fixed components) augmented by the wildcard symbol [49,77], e.g., matches and if the alphabet is . Hence a schema represents all strings of the search space which match the schema at all components except .

The following analysis is performed for the classic genetic algorithm, whose inner loop is the following.

- .
- Select the new population from the old one by single string selection based on a positive fitness function.
- Recombine by one point crossover and mutation.
- Evaluate .
- Repeat.

In the single string selection step of the classic genetic algorithm considered, an individual has the probability of being selected. After the selection step it is expected that

Introducing the average fitness of the population , this can be written as

The interpretation of this formula is that the number of individuals matching a schema grows in each time step like the ratio of the fitness of and the total fitness. Thus an above average schema, i.e., a schema with , will proliferate successfully in the next generation, but a below average schema will not. In the long term the formula implies that above average schemata will proliferate exponentially.

The next step after selection is recombination, i.e., crossover and mutation, where new individuals are introduced into the population. The effects of crossover are considered first. A crossover site is selected uniformly among possible sites, where is the length of the vector representing an individual. Hence is the probability that the schema is destroyed if a certain vector (or chromosome) undergoes crossover, and thus

Thus taking into account selection and crossover, the number of individuals matching a schema can be estimated by

The final operator to be considered is mutation. Let be the probability that a single component is changed. Since mutations are independent from one another, the probability that a schema survives a mutation is

Hence the final estimation of is

The Schema theorem can thus be stated as follows [40,77,13].

Clemens Heitzinger 2003-05-08