## 9.3. 9.3 Models related to computer networks

The theory of computer networks has become one of the most significant areas of computer science. In the planning of computer networks many optimisation problems arise and most of these problems are actually on-line, since neither the traffic nor the changes in the topology of a computer network cannot be precisely predicted. Recently some researchers working at the area of on-line algorithms have defined some on-line mathematical models for problems related to computer networks. In this section we consider this area; we present three problems and show the basic results. First the data acknowledgement problem is considered, then we present the web caching problem, and the section is closed by the on-line routing problem.

### 9.3.1. 9.3.1 The data acknowledgement problem

In the communication of a computer network the information is sent by packets. If the communication channel is not completely safe, then the arrival of the packets are acknowledged. The data acknowledgement problem is to determine the time of sending acknowledgements. An acknowledgement can acknowledge many packets but waiting for long time can cause the resending of the packets and that results in the congestion of the network. On the other hand, sending an acknowledgement about the arrival of each packet immediately would cause again the congestion of the network. The first optimisation model for determining the sending times of the acknowledgements was developed by Dooly, Goldman and Scott in 1998. We present the developed model and some of the basic results.

In the mathematical model of the data acknowledgement problem the input is the list of the arrival times of the packets. The decision maker has to determine when to send acknowledgements; these times are denoted by . In the optimisation model the cost function is:

where is the number of the sent acknowledgements and is the total latency collected by the -th acknowledgement. We consider the on-line problem which means that at time the decision maker only knows the arrival times of the packets already arrived and has no information about the further packets. We denote the set of the unacknowledged packets at the arrival time by .

For the solution of the problem the class of the alarming algorithms has been developed. An alarming algorithm works as follows. At the arrival time an alarm is set for time . If no packet arrives before time , then an acknowledgement is sent at time which acknowledges all of the unacknowledged packets. Otherwise at the arrival of the next packet at time the alarm is reset for time . Below we analyse an algorithm from this class in details. This algorithm sets the alarm to collect total latency by the acknowledgement. The algorithm is called `Alarm` . We obtain the above defined rule from the general definition using the solution of the following equation as value :

Example 9.4 Consider the following example. The first packet arrives at time (), so `Alarm` sets an alarm with value for time . Suppose that the next arrival time is . This arrival is before the alarm time, thus the first packet has not been acknowledged yet and we reset the alarm with value for time . Suppose that the next arrival time is . This arrival is before the alarm time, thus the first two packets have not been acknowledged yet and we reset the alarm with value for time . Suppose that the next arrival time is . No packet arrived before the alarm time , thus at that time the first three packets were acknowledged and the alarm is set for the new packet with value for time .

Theorem 9.5 Algorithm `Alarm` is 2-competitive.

Proof. Suppose that algorithm `Alarm` sends acknowledgements. These acknowledgements divide the time into time intervals. The cost of the algorithm is , since is the cost of the acknowledgements, and the alarm is set to have total latency for each acknowledgement.

Suppose that the optimal off-line algorithm sends acknowledgements. If , then is obviously valid, thus we obtain that the algorithm is -competitive. If , then at least time intervals among the ones defined by the acknowledgements of algorithm `Alarm` do not contain any of the off-line acknowledgements. This yields that the off-line total latency is at most , thus we obtain that which inequality proves that `Alarm` is -competitive.

As the following theorem shows, algorithm `Alarm` has the smallest possible competitive ratio.

Theorem 9.6 There is no on-line algorithm for the data acknowledgement problem which has smaller competitive ratio than .

Proof. Consider an arbitrary on-line algorithm and denote it by `ONL` . Analyse the following input. Consider a long sequence of packets where the packets always arrive immediately after the time when `ONL` sends an acknowledgement. The on-line cost of a sequence containing packets is , since the cost resulted from the acknowledgements is , and the latency of the -th acknowledgement is , where the value is used.

Consider the following two on-line algorithms. `ODD` sends the acknowledgements after the odd numbered packets and `EVEN` sends the acknowledgements after the even numbered packets.

The costs achieved by these algorithms are

and

Therefore . On the other hand, none of the costs achieved by `ODD` and `EVEN` is greater than the optimal off-line cost, thus , which yields that . From this inequality it follows that the competitive ratio of `ONL` is not smaller than 2, because using a sufficiently long sequence of packets the value can be arbitrarily large.

### 9.3.2. 9.3.2 The file caching problem

The file caching problem is a generalisation of the caching problem presented in the chapter on memory management. World-wide-web browsers use caches to store some files. This makes it possible to use the stored files if a user wants to see some web-page many times during a short time interval. If the cache becomes full, then some files must be eliminated to make space for the new file. The file caching problem models this scenario; the goal is to find good strategies for determining which files should be eliminated. It differs from the standard paging problem in the fact that the files have size and retrieval cost (the problem is reduced to the paging if each size and each retrieval cost are ). So the following mathematical model describes the problem.

There is a given cache of size and the input is a sequence of pages. Each page has a size denoted by and a retrieval cost denoted by . The pages arrive from a list one by one and after the arrival of a page the algorithm has to place it into the cache. If the page is not contained in the cache and there is not enough space to put it into the cache, then the algorithm has to delete some pages from the cache to make enough space for the requested page. If the required page is in the cache, then the cost of serving the request is , otherwise the cost is . The aim is to minimise the total cost. The problem is on-line which means that for the decisions (which pages should be deleted from the cache) only the earlier pages and decisions can be used, the algorithm has no information about the further pages. We assume that the size of the cache and also the sizes of the pages are positive integers.

For the solution of the problem and for its special cases many algorithms have been developed. Here we present algorithm `Landlord` which was developed by Young.

The algorithm stores a credit value for each page which is contained in the current cache. In the rest of the section the set of the pages in the current cache of `Landlord` is denoted by . If `Landlord` has to retrieve a page then the following steps are performed.

`Landlord(` `)`

```  1
`IF`

is not contained in    2
`THEN`

`WHILE`
there is not enough space for    3
`DO`

4          for each  let    5          evict some pages with    6          place  into cache  and let    7
`ELSE`
reset  to any value between  and
```

Example 9.5 Suppose that and contains the following three pages: with , with and with . Suppose that the next requested page is , with parameters and . Therefore, there is not enough space for it in the cache, so some pages must be evicted. `Landlord` determines the value and changes the credits as follows: and , thus is evicted from cache . There is still not enough space for in the cache. The new value is and the new credits are: , thus is evicted from the cache. Then there is enough space for , thus it is placed into cache with the credit value .

`Landlord` is weakly -competitive, but a stronger statement is also true. For the web caching problem an on-line algorithm `ALG` is called -competitive, if there exists such a constant , that is valid for each input, where is the cost of `ALG` using a cache of size and is the optimal off-line cost using a cache of size . The following statement holds for algorithm `Landlord` .

Theorem 9.7 If , then algorithm `Landlord` is -competitive.

Proof. Consider an arbitrary input sequence of pages and denote the input by . We use the potential function technique. During the analysis of the procedure we suppose that an off-line optimal algorithm with cache size and `Landlord` with cache size are running parallel on the input. We also suppose that each page is placed first into the off-line cache by the off-line algorithm and then it is placed into by the on-line algorithm. We denote the set of the pages contained in the actual cache of the optimal off-line algorithm by `OPT` . Consider the following potential function

The changes of the potential function during the different steps are as follows

`OPT` places into its cache.

In this case `OPT` has cost . In the potential function only the second part may change. On the other hand, , thus the increase of the potential function is at most .

`Landlord` decreases the credit value for each .

In this case for each the decrease of is , thus the decrease of is

where and denote the total size of the pages contained in sets and , respectively. At the time when this step is performed, `OPT` have already placed page into its cache , but the page is not contained in cache . Therefore . On the other hand, this step is performed if there is not enough space for the page in thus , which yields , because the sizes are positive integers. Therefore we obtain that the decrease of is at least

Since and , this decrease is at least .

`Landlord` evicts a page from cache .

Since `Landlord` only evicts pages having credit , during this step remains unchanged.

`Landlord` places page into cache and sets the value .

The cost of `Landlord` is . On the other hand, was not contained in cache before the performance of this step, thus was valid. Furthermore, first `OPT` places the page into its cache, thus is also valid. Therefore the decrease of is .

`Landlord` resets the credit of a page to a value between and .

In this case is valid, since `OPT` places page into its cache first. Value is not decreased and , thus can not increase during this step.

Hence the potential function has the following properties.

If `OPT` places a page into its cache, then the increase of the potential function is at most times more than the cost of `OPT` .

If `Landlord` places a page into its cache, then the decrease of is times more than the cost of `Landlord` .

During the other steps does not increase.

By the above properties we obtain that , where and are the starting and final values of the potential function. The potential function is nonnegative, thus we obtain that , which proves that `Landlord` is -competitive.

### 9.3.3. 9.3.3 On-line routing

In computer networks the congestion of the communication channels decreases the speed of the communication and may cause loss of information. Thus congestion control is one of the most important problems in the area of computer networks. A related important problem is the routing of the communication, where we have to determine the path of the messages in the network. Since we have no information about the further traffic of the network, thus routing is an on-line problem. Here we present two on-line optimisation models for the routing problem.

The mathematical model

The network is given by a graph, each edge has a maximal available bandwidth denoted by and the number of edges is denoted by . The input is a sequence of requests, where the -th request is given by a vector which means that to satisfy the request bandwidth must be reserved on a path from to for time duration and the benefit of serving a request is . Hereafter, we assume that , and we omit the value of from the requests. The problem is on-line which means that after the arrival of a request the algorithm has to make the decisions without any information about the further requests. We consider the following two models.

Load balancing model: In this model all requests must be satisfied. Our aim is to minimise the maximum of the overload of the edges. The overload is the ratio of the total bandwidth assigned to the edge and the available bandwidth. Since each request is served, thus the benefit is not significant in this model.

Throughput model: In this model the decision maker is allowed to reject some requests. The sum of the bandwidths reserved on an edge can not be more than the available bandwidth. The goal is to maximise the sum of the benefits of the accepted requests. We investigate this model in details. It is important to note that this is a maximisation problem thus the notion of competitiveness is used in the form defined for maximisation problems.

Below we define the exponential algorithm. We need the following notations to define and analyse the algorithm. Let denote the path which is assigned to the accepted request . Let denote the set of requests accepted by the on-line algorithm. In this case is the ratio of the total reserved bandwidth and the available bandwidth on before the arrival of request .

The basic idea of the exponential algorithm is the following. The algorithm assigns a cost to each , which is exponential in and chooses the path which has the minimal cost. Below we define and analyse the exponential algorithm for the throughput model. Let be a constant which depends on the parameters of the problem; its value will be given later. Let , for each request and edge . The exponential algorithm performs the following steps after the arrival of a request .

`EXP(` `)`

```  1  let  be the set of the paths    2     3
`IF`

4
`THEN`
reserve bandwidth  on path    5
`ELSE`
reject the request ```

Note. If we modify this algorithm to accept each request, then we obtain an exponential algorithm for the load balancing model.

Example 9.6 Consider the network which contains vertices and edges , where the available bandwidths of the edges are , , , . Suppose that and that the reserved bandwidths are: on path , on path , on path , on path . The next request is to reserve bandwidth on some path between and . Therefore values are: , , , . There are two paths between and and the costs are

The minimal cost belongs to path . Therefore, if , then the request is accepted and the bandwidth is reserved on path . Otherwise the request is rejected.

To analyse the algorithm consider an arbitrary input sequence . Let denote the set of the requests accepted by `EXP` , and the set of the requests which are accepted by `OPT` and rejected by `EXP` . Furthermore let denote the path reserved by `OPT` for each request accepted by `OPT` . Define the value for each , which value gives the ratio of the reserved bandwidth and the available bandwidth for at the end of the on-line algorithm. Furthermore, let for each .

Let , where is an upper bound on the benefits and for each request and each edge the inequality

is valid. In this case the following statements hold.

Lemma 9.8 The solution given by algorithm `EXP` is feasible, i.e. the sum of the reserved bandwidths is not more than the available bandwidth for each edge.

Proof. We prove the statement by contradiction. Suppose that there is an edge where the available bandwidth is violated. Let be the first accepted request which violates the available bandwidth on .

The inequality is valid for and (it is valid for all edges and requests). Furthermore, after the acceptance of request the sum of the bandwidths is greater than the available bandwidth on edge , thus we obtain that . On the other hand, this yields that the inequality

holds for value used in algorithm `EXP` . Using the assumption on we obtain that , and , thus from the above inequality we obtain that

On the other hand, this inequality is a contradiction, since `EXP` would reject the request. Therefore we obtained a contradiction thus we proved the statement of the lemma.

Lemma 9.9 For the solution given by `OPT` the following inequality holds:

Proof. Since `EXP` rejected each , thus for each , and this inequality is valid for all paths between and . Therefore

On the other hand, holds for each , thus we obtain that

The sum of the bandwidths reserved by `OPT` is at most the available bandwidth for each , thus .

Consequently

which inequality is the one which we wanted to prove.

Lemma 9.10 For the solution given by algorithm `EXP` the following inequality holds:

Proof. It is enough to show that the inequality is valid for each request . On the other hand,

Since , if , and because of the assumptions , we obtain that

Summarising the bounds given above we obtain that

Since `EXP` accepts the requests with the property , the above inequality proves the required statement.

With the help of the above lemmas we can prove the following theorem.

Theorem 9.11 Algorithm `EXP` is -competitive, if , where is an upper bound on the benefits, and for all edges and requests

Proof. From Lemma 9.8 it follows that the algorithm results in a feasible solution where the available bandwidths are not violated. Using the notations defined above we obtain that the benefit of algorithm `EXP` on the input is , and the benefit of `OPT` is at most . Therefore by Lemma 9.9 and Lemma 9.10 it follows that

which inequality proves the theorem.

Exercises

9.3-1 Consider the modified version of the data acknowledgement problem with the objective function , where is the number of acknowledgements and is the maximal latency of the -th acknowledgement. Prove that algorithm `Alarm` is also 2-competitive in this modified model.

9.3-2 Represent the special case of the web caching problem, where for each page as a special case of the -server problem. Define the metric space which can be used.

9.3-3 In the web caching problem cache of size contains three pages with the following sizes and credits: . We want to retrieve a page of size and retrieval cost . The optimal off-line algorithm `OPT` with cache of size already placed the page into its cache, so its cache contains the pages and . Which pages are evicted by `Landlord` to place ? In what way does the potential function defined in the proof of Theorem 9.7 change?

9.3-4 Prove that if in the throughput model no bounds are given for the ratios , then there is no constant-competitive on-line algorithm.