When the first page replacement algorithms were tested in the IBM Watson Research Institute at the beginning of the 1960's, it caused a great surprise that in certain cases increasing the size of the memory leads to an increase in running time of the programs. In computer systems the phenomenon, when using more recourses leads to worse results is called anomaly. Let us give three concrete examples. The first one is in connection with the FIFO page replacement algorithm, the second one with the
algorithm used for processor scheduling, and the third one with parallel program execution in computers with interleaved memories.
Note that in two examples out of the three ones a very rare phenomenon can be observed, namely that the degree of the anomaly can be any large.
Let , , and be positive integers , a non-negative integer, a finite alphabet. is the set of the words over with length , and the words over with finite length. Let be the number of page frames in the main memory of a small, and a big computer. The FIFO algorithm has already been defined in the previous section. Since in this subsection only the FIFO page replacement algorithm is discussed, the sign of the page replacement algorithm can be omitted from the notations.
Let us denote the number of the page faults by . The event, when and is called anomaly. In this case the quotient is the degree of the anomaly. The efficiency of algorithm is measured by paging speed which is defined as
for a finite reference string , while for an infinite reference string by
Let and let be an infinite, circular reference sequence. In this case .
If we process the reference string , then we will get 9 page faults in the case of , and 10 ones in the case of , therefore, . Bélády, Nelson and Shedler has given the following necessary and sufficient condition for the existing of the anomaly.
The following has been proved as far as the degree of the anomaly is concerned.
Bélády, Nelson and Shedler had the following conjecture.
This conjecture can be refuted e. g. by the following example. Let , and , where and . If execution sequence U is executed using a physical memory with page frames, then there will be 29 page faults, and the processing results in controlling status (7,3,6,2,5). After that every execution of reference sequence causes 7 new page faults and results in the same controlling status.
If the reference string is executed using a main memory with page frames, then we get control state and 14 page faults. After that every execution of reference sequence causes 21 new page faults and results in the same control state.
Choosing the degree of the anomaly will be . As we increment the value of , the degree of the anomaly will go to three. Even more than that is true: according to the following theorem by Péter Fornai and Antal Iványi the degree of the anomaly can be any arbitrarily large.
Suppose that we would like to execute tasks on processors. By the execution the priority order of the programs has to be taken into consideration. The processors operate according to First Fit, and the execution is carried out according to a given list . E. G. Coffman jr. wrote in 1976 that decreasing the number of processors, decreasing execution time of the tasks, reducing the precedence restrictions, and altering the list can each cause an anomaly. Let the vector of the execution times of the tasks denoted by t, the precedence relation by ≤, the list by , and execution time of all the tasks with a common list on equivalent processors by .
The degree of the anomaly is measured by the ratio of the execution time at the new parameters and execution time at the original parameters. First let us show four examples for the different types of the anomaly.
Example 17.4 Consider the following task system and its scheduling received using list on equivalent processors. In this case (see Figure 17.1), which can be easily proved to be the optimal value.
Example 17.5 Schedule the previous task system for equivalent processors with list . In this case for the resulting scheduling we get (see Figure 17.2).
Example 17.6 Schedule the task system with list for processors. It results in (see Figure 17.3).
Example 17.7 Decrement the executing times by one in . Schedule the resulting task system with list for processors. The result is: (see Figure 17.4).
Example 17.8 Reduce the precedence restrictions: omit edges and from the graph. The result of scheduling of the resulting task system can be seen in Figure 17.5: .
The following example shows that the increase of maximal finishing time in the worst case can be caused not only by a wrong choice of the list.
Example 17.9 Let task system and its optimal scheduling be as showed by Figure 17.6. In this case .
We can easily prove that if the executing times are decremented by one, then in the resulting task system we cannot reach a better result than with any lists (see Figure 17.7).
After these examples we give a relative limit reflecting the effects of the scheduling parameters. Suppose that for given task systems and we have , , . Task system is scheduled with the help of list , and with — the former on , while the latter on equivalent processors. For the resulting schedulings and let and .
Proof. Consider scheduling diagram for the parameters with apostrophes (for ). Let the definition of two subsets— and —of the interval be the following: , . Note that both sets are unions of disjoint, half-open (closed from the left and open from the right) intervals.
Let be a task the execution of which ends in instant of time according to D1 (i. e., ). In this case there are two possibilities: Starting point of task is either an inner point of , or not.
If is an inner point of , then according to the definition of there is a processor for which with it holds that it does not work in the interval . This can only occur if there is a task for which and (case a).
If is not an inner point of , then either (case b), or . If has got a smaller element than (case c), then let , else let (case d). If , then it follows from the construction of and that there is a processor for which a task can be found the execution of which is still in progress in this time interval, and for which .
Summarising the two cases we can say that either there is a task for which in the case of holds (case a or c), or for every number or holds (case a or d).
Repeating this procedure we get a task chain for which it holds that in the case of either or . This proves that there are tasks for which
and in every instant of time there is a processor which is working, and is executing one of the elements of the chain. It yields
where denotes the empty periods, so the sum on the left hand side refers to all the empty periods in .
Based on (11.9) and , therefore,
The following examples show us not only that the limit in the theory is the best possible, but also that we can get the given limit (at least asymptotically) by altering any of the parameters.
If this task system is scheduled for processors with list , then we get the optimal scheduling which can be seen in Figure 17.8.
If we use the list instead of list , then we get scheduling which can be seen in Figure 17.9.
In this case , , therefore ; which means that altering the list results in the theorem holding with equality, i.e., the expression on the right hand side of the sign cannot be decreased.
Example 17.11 In this example we decrease the execution times. We use list . in both cases. Here as well as in the remaining part of the chapter denotes an arbitrarily small positive number. Original execution times are stored in vector , where
The new execution times are
The precedence graph of task system , and its modification are shown in Figure 17.10, while optimal scheduling and scheduling can be seen in Figure 17.11. Here and C = Cmax, therefore, increasing goes to value (). This means that altering the execution times we can approach the limit in the theorem arbitrarily closely.
Example 17.12 In this example we reduce the precedence restrictions. The precedence graph of task system is shown in Figure 17.12.
The execution times of the tasks are: , , if , and . The optimal scheduling of belonging to list can be seen in Figure 17.13.
Omitting all the precedence restrictions from we get the task system . Scheduling is shown in Figure 17.14.
Example 17.13 This time the number of the processors will be increased from to . The graph of task system is shown by Figure 17.15, and the running times are
Comparing the , and maximal finishing times we get the ratio and so again the required asymptotic value:
With the help of these examples we proved the following statement.
We describe the parallel algorithm modelling the operating of computers with interleaved memory in a popular way. The sequence of dumplings is modelling the reference string, the giants the processors and the bites the commands executed simultaneously. Dwarfs cook dumplings of different types. Every dwarf creates an infinite sequence of dumplings.
These sequences are usually given as random variables with possible values . For the following analysis of the extreme cases deterministic sequences are used.
The dumplings eating giants eat the dumplings. The units of the eating are the bits.
The appetite of the different giants is characterised by the parameter . Giant is able to eat up most dumplings of the same sort at one bite.
Giant eats the following way. He chooses for his first bite from the beginning from the beginning of the dumpling sequence of dwarf so many dumplings, as possible (at most of the same sort), and he adds to these dumplings so many ones from the beginning of the sequences of the dwarfs as possible.
After assembling the first bite the giant eats it, then he assembles and eats the second, third, bites.
or in a shorter form
where the star (*) denotes a subsequence repeated infinitely many times.
For his first bite chooses from the first sequence the first four dumlings 1212 (because the fifth dumpling is the third one of the sort 1) and no dumpling from the second sequence (because the beginning element is 2, and two dumplings of this sort is chosen already). The second bite contains the subsequence 1233 from the first sequence, and the dumplings 244 from the second one. The other bites are identical: 321321 from the first sequence and 44 from the second one. In a short form the bites are as follows:
(bites are separated by double lines).
For given dumpling sequences and a given giant let denote the number of dumplings in the -th bite. According to the eating-rules holds for every .
Considering the elements of the dumpling sequences as random variables with possible values and given distribution we define the dumpling-eating speed (concerning the given sequences) of as the average number of dumplings in one bite for a long time, more precisely
where denotes the expected value of the random variable .
One can see that the defined limit always exists.
Let us consider the case, when we have at least one dumpling sequence, at least one type of dumplings, and two different giants, that is let . Let the sequences be deterministic.
Since for every bite-size of holds , the same bounds are right for every average value and for every expected value , too. From this it follows, that the limits and defined in (17.16) also must lie between these bounds, that is
Choosing the maximal value of and the minimal value of and vice versa we get the following trivial upper and lower bounds for the speed ratio :
Now we show that in many cases these trivial bounds cannot be improved (and so the dumpling eating speed of a small giant can be any times bigger than that of a big giant).
Proof. To see the sharpness of the lower limit in the inequality (17.18) giving the natural limits let consider the following sequences:
Giant eats these sequences in the following manner:
Here , , , (for .
For the given sequences we have
eats these sequences as follows:
In this case we get (in a similar way, as we have got )
In order to derive the exact upper bound, we consider the following sequence:
eats these sequences as follows:
From here we get
'c eating is characterised by
Since for , therefore , and so .
We usually try to avoid anomalies.
For example at page replacing the sufficient condition of avoiding it is that the replacing algorithm should have the stack property: if the same reference string is run on computers with memory sizes of and , then after every reference it holds that the bigger memory contains all the pages that the smaller does. At the examined scheduling problem it is enough not to require the scheduling algorithm's using a list.
17.3-1 Give parameters and so that the FIFO algorithm would cause at least three more page faults with a main memory of size than with that of size .
17.3-2 Give such parameters that using scheduling with list when increasing the number of processors the maximal stopping time increases at least to half as much again.
17.3-3 Give parameters with which the dumpling eating speed of a small giant is twice as big as that of a big giant.