6.5.1 Performance

Operation	Step	Count	Storage	CPU
QR-factorization of an N_ODE × N_ODE/2 matrix	1 & 2.b	N_p	N_ODE²/2	N_ODE³/2
IVP integration to obtain N_ODE/2 linearly independent solutions	2.a	N_ODE/2	N_ODE	N_ODE²/2 × N_z
LU-factorization of an N_ODE/2 × N_ODE/2 matrix	3	1	N_ODE²/4	N_ODE³/8
IVP integration to obtain N_s BVP solutions	4	N_s	N_ODE × N_z	N_ODE²/2 × N_z

The following notation is used: N_ODE stands for the order of the ordinary differential equation system, N_s signifies the number of boundary conditions, N_p denotes the number of shooting intervals, and N_z is the number of discretization points in the whole simulation interval [0, h].

The computational costs follow from Table 6.5 by multiplying the cost of each operation by its count. Since the order N_ODE of the ODE system is large compared with the number of vertical discretization points N_z, the number of source points N_s, and the number of shooting intervals N_p, i.e., N_ODE $\gg$ N_z, N_s, N_p, the total costs are proportional to $\mathcal{O}$(N_z × N_ODE³/4), as can be seen from the second row of Table 6.5 (N_ODE/2 integrations at a cost of N_ODE²/2 × N_z each). Hence, most of the computational effort in the stabilized march algorithm goes into the IVP integrations. The application of reduced superposition is thus of crucial importance, since it halves the number of required IVP integrations. The second largest expense is the N_p QR-factorizations, which require $\mathcal{O}$(N_p × N_ODE³/2) operations. The additional costs $\mathcal{O}$(N_ODE³/8) of the LU-factorization, which arise from the multiple BCs of the distributed source, are negligible; this is one of the major advantages of the proposed implementation of the differential method.
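The tally described above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in Table 6.5: the function name `march_costs`, the dictionary keys, and the sample problem sizes are illustrative assumptions and do not stem from the implementation itself.

```python
def march_costs(n_ode, n_z, n_s, n_p):
    """Total CPU cost per operation: count times cost per call (Table 6.5)."""
    return {
        "qr": n_p * n_ode**3 / 2,                               # steps 1 & 2.b
        "ivp_independent": (n_ode / 2) * (n_ode**2 / 2) * n_z,  # step 2.a
        "lu": 1 * n_ode**3 / 8,                                 # step 3
        "ivp_bvp": n_s * (n_ode**2 / 2) * n_z,                  # step 4
    }


# Hypothetical sizes, chosen only so that N_ODE >> N_z, N_s, N_p holds.
costs = march_costs(n_ode=2000, n_z=100, n_s=10, n_p=5)

# Operation with the largest total cost.
dominant = max(costs, key=costs.get)

# Operations sorted from most to least expensive.
ranking = sorted(costs, key=costs.get, reverse=True)

for name in ranking:
    print(f"{name:16s} {costs[name]:.3e}")
```

For these assumed sizes, the IVP integrations of step 2.a come out as the largest term, equal to N_z × N_ODE³/4, followed by the QR-factorizations, with the LU-factorization contributing the least.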

The memory demands of the algorithm are dominated by the storage of the matrix of linearly independent solutions, as can be seen from Table 6.5, and thus grow with $\mathcal{O}$(N_ODE²/2). The counts do not have to be considered here, since the marching technique propagates the solution recursively through the simulation interval. Hence, the total memory demand of the stabilized march algorithm is also proportional to $\mathcal{O}$(N_ODE²/2).
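The storage estimate can be checked in the same way. The following sketch takes the per-operation storage entries of Table 6.5 and picks the largest one; because the counts are not multiplied in, the peak is a single N_ODE²/2 term. The function name `march_storage`, the sample sizes, and the 16 bytes per entry (a double-precision complex number) are assumptions made for illustration only.

```python
def march_storage(n_ode, n_z):
    """Storage per operation in matrix entries (Table 6.5); counts are not applied."""
    return {
        "qr": n_ode**2 / 2,        # N_ODE x N_ODE/2 matrix
        "ivp_independent": n_ode,  # one solution vector at a time
        "lu": n_ode**2 / 4,        # N_ODE/2 x N_ODE/2 matrix
        "ivp_bvp": n_ode * n_z,    # one BVP solution on all N_z points
    }


# Hypothetical sizes with N_ODE >> N_z.
storage = march_storage(n_ode=2000, n_z=100)

# The peak requirement is the largest single entry.
peak_entries = max(storage.values())

# Assumption: each entry is a 16-byte complex double.
BYTES_PER_ENTRY = 16
peak_megabytes = peak_entries * BYTES_PER_ENTRY / 1e6

print(f"peak storage: {peak_entries:.0f} entries, about {peak_megabytes:.0f} MB")
```

Under these assumptions the peak is N_ODE²/2 = 2 × 10⁶ entries, independent of N_p and N_s.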