## 15.5. 15.5 Performance in theory

In the previous section we considered the performance measures used in the practice.

In the theoretical investigations the algorithms are tested using abstract computers called computation models.

The required quantity of resources can be characterised using absolute and relative measures.

Let , resp. denote the time necessary in worst case to solve the problem of size by the sequential algorithm A, resp. parallel algorithm P (using processors).

In a similar way let , resp. the time necessary for algorithm A, resp. P in best case to solve the problem of size (algorithm P can use processors).

Let , resp. the time needed by any sequential, resp. parallel algorithm to solve problem of size (algorithm P can use processors). These times represent a lower bound of the corresponding running time.

Let suppose the distribution function of the problem of size is given. Then let , resp. the expected value of the time necessary for algorithm A, resp. P to solve problem of size (algorithm P uses processors).

In the analysis it is often supposed that the input data of equal size have equal probability. For such cases we use the notation , resp. and termin average running time.

The value of the performance measures and depend on the used computation model too. For the simplicity of notations we suppose that the algorithms determine the computation model.

Usually the context shows in a unique way the investigated problem. If so, then the parameter is omitted.

Among these performance measures hold the following inequalities:

In a similar way for the characteristic data of the parallel algorithms the following inequalities are true:

For the expected running time we have

and

These notations can be used not only for the running time, but also for any other resource, as memory requirement, number of messages, etc.

Now we define some relative performance measures.

Speedup shows, how many times is smaller the running time of a parallel algorithm, than the running time of the parallel algorithm solving the same problem.

The speedup (or relative number of steps or relative speed) of a given parallel algorithm P, comparing it with a given sequential algorithm A, is defined as

If for a sequential algorithm A and a parallel algorithm P holds

then the speedup of P comparing with A is linear, if

then the speedup of P comparing with A is sublinear, and if

then the speedup of P comparing with A is superlinear.

In the case of parallel algorithms it is a very important performance measure the work , defined by the product of the running time and the number of the used processors:

This definition is used even then if some processors work only in a small fraction of the running time. Therefore the real work can be much smaller, then given by the formula (15.15).

The efficiency is a measure of the fraction of time for which the processors are usefully employed; it is defined as the ratio of the work of the sequential algorithm to the work of the parallel algorithm P:

One can observe, that the ratio of the speedup and the number of the used parallel processors results the same value. If the parallel work is not less than the sequential one, then efficiency is between zero and one, and the relatively large values are beneficial.

In connection with the analysis of the parallel algorithms the work-efficiency is a central concept. If for a parallel algorithm P and sequential algorithm A holds

then algorithm P work-optimal comparing with A.

This definition is equivalent with the equality

According to this definition a parallel algorithm is work-optimal only if the order of its total work is not greater, than the order of the total work of the considered sequential algorithm.

A weaker requirement is the following. If there exists a finite positive integer such that

then algorithm P is work-efficient comparing with A.

If a sequential algorithm A, resp. a parallel algorithm P uses only , resp. units of a given resource, then A, resp. P is called—for the given resource and the considered model of computation—asymptotically optimal.

If an A sequential or a P parallel algorithm uses only the necessary amount of some resource for all possible size of the input, that is , resp. units, and so we have

for A and

for P, then we say, that the given algorithm is absolute optimal for the given resource and the given computation model. In this case we say, that is the accurate complexity of the given problem.

Comparing two algorithms and having

we say, that the speeds of the growths of algorithms and asymptotically have the same order.

Comparing the running times of two algorithms A and B (e.g. in worst case) sometime the estimation depends on : for some values of algorithm A, while for other values of algorithm B is the better. A possible formal definition is as follows. If the functions and are defined for all positive integer , and for some positive integer hold

1. ;

2. ,

then the number is called crossover point of the functions and .

For example multiplying two matrices according to the definition and algorithm of Strassen we get one crossover point, whose value is about 20.

Exercises

15.5-1 Suppose that the parallel algorithms P and Q solve the selection problem. Algorithm P uses processors and its running time is . Algorithm Q uses processors and its running time is . Determine the work, speedup and efficiency for both algorithms. Are these algorithms work-optimal or at least work-efficient?

15.5-2 Analyse the following two assertions.

a) Running time of algorithm P is at least .

b) Since the running time of algorithm P is , and the running time of algorithm B is , therefore algorithm B is more efficient.

15.5-3 Extend the definition of the crossover point to noninteger values and parallel algorithms.