The theory of computer networks has become one of the most significant areas of computer science. In the planning of computer networks many optimisation problems arise and most of these problems are actually on-line, since neither the traffic nor the changes in the topology of a computer network cannot be precisely predicted. Recently some researchers working at the area of on-line algorithms have defined some on-line mathematical models for problems related to computer networks. In this section we consider this area; we present three problems and show the basic results. First the data acknowledgement problem is considered, then we present the web caching problem, and the section is closed by the on-line routing problem.
In the communication of a computer network the information is sent by packets. If the communication channel is not completely safe, then the arrival of the packets are acknowledged. The data acknowledgement problem is to determine the time of sending acknowledgements. An acknowledgement can acknowledge many packets but waiting for long time can cause the resending of the packets and that results in the congestion of the network. On the other hand, sending an acknowledgement about the arrival of each packet immediately would cause again the congestion of the network. The first optimisation model for determining the sending times of the acknowledgements was developed by Dooly, Goldman and Scott in 1998. We present the developed model and some of the basic results.
In the mathematical model of the data acknowledgement problem the input is the list of the arrival times of the packets. The decision maker has to determine when to send acknowledgements; these times are denoted by . In the optimisation model the cost function is:
where is the number of the sent acknowledgements and is the total latency collected by the -th acknowledgement. We consider the on-line problem which means that at time the decision maker only knows the arrival times of the packets already arrived and has no information about the further packets. We denote the set of the unacknowledged packets at the arrival time by .
For the solution of the problem the class of the alarming algorithms has been developed. An alarming algorithm works as follows. At the arrival time an alarm is set for time . If no packet arrives before time , then an acknowledgement is sent at time which acknowledges all of the unacknowledged packets. Otherwise at the arrival of the next packet at time the alarm is reset for time . Below we analyse an algorithm from this class in details. This algorithm sets the alarm to collect total latency by the acknowledgement. The algorithm is called
. We obtain the above defined rule from the general definition using the solution of the following equation as value :
Example 9.4 Consider the following example. The first packet arrives at time (), so
sets an alarm with value for time . Suppose that the next arrival time is . This arrival is before the alarm time, thus the first packet has not been acknowledged yet and we reset the alarm with value for time . Suppose that the next arrival time is . This arrival is before the alarm time, thus the first two packets have not been acknowledged yet and we reset the alarm with value for time . Suppose that the next arrival time is . No packet arrived before the alarm time , thus at that time the first three packets were acknowledged and the alarm is set for the new packet with value for time .
Proof. Suppose that algorithm
sends acknowledgements. These acknowledgements divide the time into time intervals. The cost of the algorithm is , since is the cost of the acknowledgements, and the alarm is set to have total latency for each acknowledgement.
Suppose that the optimal off-line algorithm sends acknowledgements. If , then is obviously valid, thus we obtain that the algorithm is -competitive. If , then at least time intervals among the ones defined by the acknowledgements of algorithm
do not contain any of the off-line acknowledgements. This yields that the off-line total latency is at most , thus we obtain that which inequality proves that
As the following theorem shows, algorithm
has the smallest possible competitive ratio.
Proof. Consider an arbitrary on-line algorithm and denote it by
. Analyse the following input. Consider a long sequence of packets where the packets always arrive immediately after the time when
sends an acknowledgement. The on-line cost of a sequence containing packets is , since the cost resulted from the acknowledgements is , and the latency of the -th acknowledgement is , where the value is used.
Consider the following two on-line algorithms.
sends the acknowledgements after the odd numbered packets and
sends the acknowledgements after the even numbered packets.
The costs achieved by these algorithms are
Therefore . On the other hand, none of the costs achieved by
is greater than the optimal off-line cost, thus , which yields that . From this inequality it follows that the competitive ratio of
is not smaller than 2, because using a sufficiently long sequence of packets the value can be arbitrarily large.
The file caching problem is a generalisation of the caching problem presented in the chapter on memory management. World-wide-web browsers use caches to store some files. This makes it possible to use the stored files if a user wants to see some web-page many times during a short time interval. If the cache becomes full, then some files must be eliminated to make space for the new file. The file caching problem models this scenario; the goal is to find good strategies for determining which files should be eliminated. It differs from the standard paging problem in the fact that the files have size and retrieval cost (the problem is reduced to the paging if each size and each retrieval cost are ). So the following mathematical model describes the problem.
There is a given cache of size and the input is a sequence of pages. Each page has a size denoted by and a retrieval cost denoted by . The pages arrive from a list one by one and after the arrival of a page the algorithm has to place it into the cache. If the page is not contained in the cache and there is not enough space to put it into the cache, then the algorithm has to delete some pages from the cache to make enough space for the requested page. If the required page is in the cache, then the cost of serving the request is , otherwise the cost is . The aim is to minimise the total cost. The problem is on-line which means that for the decisions (which pages should be deleted from the cache) only the earlier pages and decisions can be used, the algorithm has no information about the further pages. We assume that the size of the cache and also the sizes of the pages are positive integers.
For the solution of the problem and for its special cases many algorithms have been developed. Here we present algorithm
which was developed by Young.
The algorithm stores a credit value for each page which is contained in the current cache. In the rest of the section the set of the pages in the current cache of
is denoted by . If
has to retrieve a page then the following steps are performed.
IFis not contained in 2
WHILEthere is not enough space for 3
DO4 for each let 5 evict some pages with 6 place into cache and let 7
ELSEreset to any value between and
Example 9.5 Suppose that and contains the following three pages: with , with and with . Suppose that the next requested page is , with parameters and . Therefore, there is not enough space for it in the cache, so some pages must be evicted.
determines the value and changes the credits as follows: and , thus is evicted from cache . There is still not enough space for in the cache. The new value is and the new credits are: , thus is evicted from the cache. Then there is enough space for , thus it is placed into cache with the credit value .
is weakly -competitive, but a stronger statement is also true. For the web caching problem an on-line algorithm
is called -competitive, if there exists such a constant , that is valid for each input, where is the cost of
using a cache of size and is the optimal off-line cost using a cache of size . The following statement holds for algorithm
Proof. Consider an arbitrary input sequence of pages and denote the input by . We use the potential function technique. During the analysis of the procedure we suppose that an off-line optimal algorithm with cache size and
with cache size are running parallel on the input. We also suppose that each page is placed first into the off-line cache by the off-line algorithm and then it is placed into by the on-line algorithm. We denote the set of the pages contained in the actual cache of the optimal off-line algorithm by
. Consider the following potential function
The changes of the potential function during the different steps are as follows
places into its cache.
In this case
has cost . In the potential function only the second part may change. On the other hand, , thus the increase of the potential function is at most .
decreases the credit value for each .
In this case for each the decrease of is , thus the decrease of is
where and denote the total size of the pages contained in sets and , respectively. At the time when this step is performed,
have already placed page into its cache , but the page is not contained in cache . Therefore . On the other hand, this step is performed if there is not enough space for the page in thus , which yields , because the sizes are positive integers. Therefore we obtain that the decrease of is at least
Since and , this decrease is at least .
evicts a page from cache .
only evicts pages having credit , during this step remains unchanged.
places page into cache and sets the value .
The cost of
is . On the other hand, was not contained in cache before the performance of this step, thus was valid. Furthermore, first
places the page into its cache, thus is also valid. Therefore the decrease of is .
resets the credit of a page to a value between and .
In this case is valid, since
places page into its cache first. Value is not decreased and , thus can not increase during this step.
Hence the potential function has the following properties.
places a page into its cache, then the increase of the potential function is at most times more than the cost of
places a page into its cache, then the decrease of is times more than the cost of
During the other steps does not increase.
By the above properties we obtain that , where and are the starting and final values of the potential function. The potential function is nonnegative, thus we obtain that , which proves that
In computer networks the congestion of the communication channels decreases the speed of the communication and may cause loss of information. Thus congestion control is one of the most important problems in the area of computer networks. A related important problem is the routing of the communication, where we have to determine the path of the messages in the network. Since we have no information about the further traffic of the network, thus routing is an on-line problem. Here we present two on-line optimisation models for the routing problem.
The mathematical model
The network is given by a graph, each edge has a maximal available bandwidth denoted by and the number of edges is denoted by . The input is a sequence of requests, where the -th request is given by a vector which means that to satisfy the request bandwidth must be reserved on a path from to for time duration and the benefit of serving a request is . Hereafter, we assume that , and we omit the value of from the requests. The problem is on-line which means that after the arrival of a request the algorithm has to make the decisions without any information about the further requests. We consider the following two models.
Load balancing model: In this model all requests must be satisfied. Our aim is to minimise the maximum of the overload of the edges. The overload is the ratio of the total bandwidth assigned to the edge and the available bandwidth. Since each request is served, thus the benefit is not significant in this model.
Throughput model: In this model the decision maker is allowed to reject some requests. The sum of the bandwidths reserved on an edge can not be more than the available bandwidth. The goal is to maximise the sum of the benefits of the accepted requests. We investigate this model in details. It is important to note that this is a maximisation problem thus the notion of competitiveness is used in the form defined for maximisation problems.
Below we define the exponential algorithm. We need the following notations to define and analyse the algorithm. Let denote the path which is assigned to the accepted request . Let denote the set of requests accepted by the on-line algorithm. In this case is the ratio of the total reserved bandwidth and the available bandwidth on before the arrival of request .
The basic idea of the exponential algorithm is the following. The algorithm assigns a cost to each , which is exponential in and chooses the path which has the minimal cost. Below we define and analyse the exponential algorithm for the throughput model. Let be a constant which depends on the parameters of the problem; its value will be given later. Let , for each request and edge . The exponential algorithm performs the following steps after the arrival of a request .
1 let be the set of the paths 2 3
THENreserve bandwidth on path 5
ELSEreject the request
Note. If we modify this algorithm to accept each request, then we obtain an exponential algorithm for the load balancing model.
Example 9.6 Consider the network which contains vertices and edges , where the available bandwidths of the edges are , , , . Suppose that and that the reserved bandwidths are: on path , on path , on path , on path . The next request is to reserve bandwidth on some path between and . Therefore values are: , , , . There are two paths between and and the costs are
The minimal cost belongs to path . Therefore, if , then the request is accepted and the bandwidth is reserved on path . Otherwise the request is rejected.
To analyse the algorithm consider an arbitrary input sequence . Let denote the set of the requests accepted by
, and the set of the requests which are accepted by
and rejected by
. Furthermore let denote the path reserved by
for each request accepted by
. Define the value for each , which value gives the ratio of the reserved bandwidth and the available bandwidth for at the end of the on-line algorithm. Furthermore, let for each .
Let , where is an upper bound on the benefits and for each request and each edge the inequality
is valid. In this case the following statements hold.
The inequality is valid for and (it is valid for all edges and requests). Furthermore, after the acceptance of request the sum of the bandwidths is greater than the available bandwidth on edge , thus we obtain that . On the other hand, this yields that the inequality
holds for value used in algorithm
. Using the assumption on we obtain that , and , thus from the above inequality we obtain that
On the other hand, this inequality is a contradiction, since
would reject the request. Therefore we obtained a contradiction thus we proved the statement of the lemma.
On the other hand, holds for each , thus we obtain that
The sum of the bandwidths reserved by
is at most the available bandwidth for each , thus .
which inequality is the one which we wanted to prove.
Since , if , and because of the assumptions , we obtain that
Summarising the bounds given above we obtain that
accepts the requests with the property , the above inequality proves the required statement.
With the help of the above lemmas we can prove the following theorem.
Proof. From Lemma 9.8 it follows that the algorithm results in a feasible solution where the available bandwidths are not violated. Using the notations defined above we obtain that the benefit of algorithm
on the input is , and the benefit of
is at most . Therefore by Lemma 9.9 and Lemma 9.10 it follows that
which inequality proves the theorem.
9.3-1 Consider the modified version of the data acknowledgement problem with the objective function , where is the number of acknowledgements and is the maximal latency of the -th acknowledgement. Prove that algorithm
is also 2-competitive in this modified model.
9.3-2 Represent the special case of the web caching problem, where for each page as a special case of the -server problem. Define the metric space which can be used.
9.3-3 In the web caching problem cache of size contains three pages with the following sizes and credits: . We want to retrieve a page of size and retrieval cost . The optimal off-line algorithm
with cache of size already placed the page into its cache, so its cache contains the pages and . Which pages are evicted by
to place ? In what way does the potential function defined in the proof of Theorem 9.7 change?
9.3-4 Prove that if in the throughput model no bounds are given for the ratios , then there is no constant-competitive on-line algorithm.