24.2. 24.2 The general frame of the B&B method

The aim of this section is to give a general description of the B&B method. Particular realizations of the general frame are discussed in later sections.

B&B is based on the notion of relaxation. It has not been defined yet. As there are several types of relaxations the first subsection is devoted to this notion. The general frame is discussed in the second subsection.

24.2.1. 24.2.1 Relaxation

Relaxation is discussed in two steps. There are several techniques to define relaxation to a particular problem. There is no rule for choosing among them. It depends on the design of the algorithm which type serves the algorithm well. The different types are discussed in the first part titled “Relaxations of a particular problem”. In the course of the solution of Problem (24.6) subproblems were generated which were still knapsack problems. They had their own relaxations which were not totally independent from the relaxations of each other and the main problem. The expected common properties and structure is analyzed in the second step under the title “Relaxation of a problem class”.

24.2.1.1.  Relaxations of a particular problem

The description of Problem (24.6) consists of three parts: (1) the objective function, (2) the algebraic constraints, and (3) the requirement that the variables must be binary. This structure is typical for optimization problems. In a general formulation an optimization problem can be given as

24.2.1.2.  Relaxing the non-algebraic constraints

The underlying logic of generating relaxation (24.7) is that constraint (24.16) has been substituted by a looser one. In the particular case it was allowed that the variables can take any value between 0 and 1. In general (24.16) is replaced by a requirement that the variables must belong to a set, say , which is larger than , i.e. the relation must hold. More formally the relaxation of Problem (24.14)-(24.16) is the problem

(24.14)                                                                                                   

(24.15)                                                                                                   

This type of relaxation can be applied if a large amount of difficulty can be eliminated by changing the nature of the variables.

24.2.1.3.  Relaxing the algebraic constraints

There is a similar technique such that (24.16) the inequalities (24.15) are relaxed instead of the constraints. A natural way of this type of relaxation is the following. Assume that there are inequalities in (24.15). Let be fixed numbers. Then any satisfying (24.15) also satisfies the inequality

Then the relaxation is the optimization of the (24.14) objective function under the conditions (24.18) and (24.16). The name of the inequality (24.18) is surrogate constraint.

The problem

is a general zero-one optimization problem. If then the relaxation obtained in this way is Problem (24.6). Both problems belong to NP-complete classes. However the knapsack problem is significantly easier from practical point of view than the general problem, thus the relaxation may have sense. Notice that in this particular problem the optimal solution of the knapsack problem, i.e. (1,0,1,1,0), satisfies the constraints of (24.19), thus it is also the optimal solution of the latter problem.

Surrogate constraint is not the only option in relaxing the algebraic constraints. A region defined by nonlinear boundary surfaces can be approximated by tangent planes. For example if the feasible region is the unit circuit which is described by the inequality

can be approximated by the square

If the optimal solution on the enlarged region is e.g. the point (1,1) which is not in the original feasible region then a cut must be found which cuts it from the relaxed region but it does not cut any part of the original feasible region. It is done e.g. by the inequality

A new relaxed problem is defined by the introduction of the cut. The method is similar to one of the method relaxing of the objective function discussed below.

24.2.1.4.  Relaxing the objective function

In other cases the difficulty of the problem is caused by the objective function. If it is possible to use an easier objective function, say , but to obtain an upper bound the condition

must hold. Then the relaxation is

(24.15)                                                                                                   

(24.16)                                                                                                   

This type of relaxation is typical if B&B is applied in (continuous) nonlinear optimization. An important subclass of the nonlinear optimization problems is the so-called convex programming problem. It is again a relatively easy subclass. Therefore it is reasonable to generate a relaxation of this type if it is possible. A Problem (24.14)-(24.16) is a convex programming problem, if is a convex set, the functions are convex and the objective function is concave. Thus the relaxation can be a convex programming problem if only the last condition is violated. Then it is enough to find a concave function such that (24.20) is satisfied.

For example the single variable function is not concave in the interval .

Footnote. A continuous function is concave if its second derivative is negative. which is positive in the open interval .

Thus if it is the objective function in an optimization problem it might be necessary that it is substituted by a concave function such that . It is easy to see that satisfies the requirements.

Let be the optimal solution of the relaxed problem (24.21), (24.15), and (24.16). It solves the original problem if the optimal solution has the same objective function value in the original and relaxed problems, i.e. .

Another reason why this type of relaxation is applied that in certain cases the objective function is not known in a closed form, however it can be determined in any given point. It might happen even in the case if the objective function is concave. Assume that the value of is known in the points . If concave then it is smooth, i.e. its gradient exists. The gradient determines a tangent plane which is above the function. The equation of the tangent plane in point is

Footnote. The gradient is considered being a row vector.

Hence in all points of the domain of the function we have that

Obviously the function is an approximation of function .

The idea if the method is illustrated on the following numerical example. Assume that an “unknown” concave function is to be maximized on the [0,5] closed interval. The method can start from any point of the interval which is in the feasible region. Let 0 be the starting point. According to the assumptions although the closed formula of the function is not known, it is possible to determine the values of function and its derivative. Now the values and are obtained. The general formula of the tangent line in the point is

Hence the equation of the first tangent line is giving the first optimization problem as

As is a monotone increasing function, the optimal solution is . Then the values and are provided by the method calculating the function. The equation of the second tangent line is . Thus the second optimization problem is

As the second tangent line is a monotone decreasing function, the optimal solution is in the intersection point of the two tangent lines giving . Then the values and are calculated and the equation of the tangent line is . The next optimization problem is

The optimal solution is . It is the intersection point of the first and third tangent lines. Now both new intersection points are in the interval [0,5]. In general some intersection points can be infeasible. The method goes in the same way further on. The approximated “unknow” function is .

24.2.1.5.  The Lagrange Relaxation

Another relaxation called Lagrange relaxation. In that method both the objective function and the constraints are modified. The underlying idea is the following. The variables must satisfy two different types of constraints, i.e. they must satisfy both (24.15) and (24.16). The reason that the constraints are written in two parts is that the nature of the two sets of constraints is different. The difficulty of the problem caused by the requirement of both constraints. It is significantly easier to satisfy only one type of constraints. So what about to eliminate one of them?

Assume again that the number of inequalities in (24.15) is . Let be fixed numbers. The Lagrange relaxation of the problem (24.14)–(24.16) is

(24.16)                                                                                                   

Notice that the objective function (24.22) penalizes the violation of the constraints, e.g. trying to use too much resources, and rewards the saving of resources. The first set of constraints disappeared from the problem. In most of the cases the Lagrange relaxation is a much easier one than the original problem. In what follows Problem (24.14)–(24.16) is also denoted by and the Lagrange relaxation is referred as . The notation reflects the fact that the Lagrange relaxation problem depends on the choice of 's. The numbers 's are called Lagrange multipliers.

It is not obvious that is really a relaxation of . This relation is established by

Theorem 24.2 Assume that both and have optimal solutions. Then for any nonnegative the inequality

holds.

Proof. The statement is that the optimal value of is an upper bound of the optimal value of . Let be the optimal solution of . It is obviously feasible in both problems. Hence for all the inequalities , hold. Thus which implies that

Here the right-hand side is the objective function value of a feasible solution of , i.e.

There is another connection between and which is also important from the point of view of the notion of relaxation.

Theorem 24.3 Let be the optimal solution of the Lagrange relaxation. If

and

then is an optimal solution of .

Proof. (24.23) means that is a feasible solution of . For any feasible solution of it follows from the optimality of that

i.e. is at least as good as .

The importance of the conditions (24.23) and (24.24) is that they give an optimality criterion, i.e. if a point generated by the Lagrange multipliers satisfies them then it is optimal in the original problem. The meaning of (24.23) is that the optimal solution of the Lagrange problem is feasible in the original one and the meaning of (24.24) is that the objective function values of are equal in the two problems, just as in the case of the previous relaxation. It also indicates that the optimal solutions of the two problems are coincident in certain cases.

There is a practical necessary condition for being a useful relaxation which is that the relaxed problem is easier to solve than the original problem. The Lagrange relaxation has this property. It can be shown on Problem (24.19). Let , . Then the objective function (24.22) is the following

The only constraint is that all variables are binary. It implies that if a coefficient is positive in the objective function then the variable must be 1 in the optimal solution of the Lagrange problem, and if the coefficient is negative then the variable must be 0. As the coefficient of is zero, there are two optimal solutions: (1,0,1,1,0) and (1,1,1,1,0). The first one satisfies the optimality condition thus it is an optimal solution. The second one is infeasible.

24.2.1.6.  What is common in all relaxation?

They have three common properties.

  1. All feasible solutions are also feasible in the relaxed problem.

  2. The optimal value of the relaxed problem is an upper bound of the optimal value of the original problem.

  3. There are cases when the optimal solution of the relaxed problem is also optimal in the original one.

The last property cannot be claimed for all particular case as then the relaxed problem is only an equivalent form of the original one and needs very likely approximately the same computational effort, i.e. it does not help too much. Hence the first two properties are claimed in the definition of the relaxation of a particular problem.

Definition 24.4 Let be two functions mapping from the -dimensional Euclidean space into the real numbers. Further on let be two subsets of the -dimensional Euclidean space. The problem

is a relaxation of the problem

if

(i) and

(ii) it is known priori, i.e. without solving the problems that (24.25) (24.26).

24.2.1.7.  Relaxation of a problem class

No exact definition of the notion of problem class will be given. There are many problem classes in optimization. A few examples are the knapsack problem, the more general zero-one optimization, the traveling salesperson problem, linear programming, convex programming, etc. In what follows problem class means only an infinite set of problems.

One key step in the solution of (24.6) was that the problem was divided into subproblems and even the subproblems were divided into further subproblems, and so on.

The division must be carried out in a way such that the subproblems belong to the same problem class. By fixing the value of a variable the knapsack problem just becomes another knapsack problem of lesser dimension. The same is true for almost all optimization problems, i.e. a restriction on the value of a single variable (introducing either a lower bound, or upper bound, or an exact value) creates a new problem in the same class. But restricting a single variable is not the only possible way to divide a problem into subproblems. Sometimes special constraints on a set of variables may have sense. For example it is easy to see from the first constraint of (24.19) that at most two out of the variables , , and can be 1. Thus it is possible to divide it into two subproblems by introducing the new constraint which is either , or . The resulted problems are still in the class of binary optimization. The same does not work in the case of the knapsack problem as it must have only one constraint, i.e. if a second inequality is added to the problem then the new problem is out of the class of the knapsack problems.

The division of the problem into subproblems means that the set of feasible solutions is divided into subsets not excluding the case that one or more of the subsets turn out to be empty set. and gave such an example.

Another important feature is summarized in formula (24.12). It says that the upper bound of the optimal value obtained from the undivided problem is at most as accurate as the upper bound obtained from the divided problems.

Finally, the further investigation of the subset could be abandoned as was not giving a higher upper bound as the objective function value of the optimal solution on which lies at the same time in , too, i.e. the subproblem defined on the set was solved.

The definition of the relaxation of a problem class reflects the fact that relaxation and defining subproblems (branching) are not completely independent. In the definition it is assumed that the branching method is a priori given.

Definition 24.5 Let and be two problem classes. Class is a relaxation of class if there is a map with the following properties.

  1. maps the problems of into the problems of .

  2. If a problem (P) is mapped into (Q) then (Q) is a relaxation of (P) in the sense of Definition 24.4.

  3. If (P) is divided into (), ,() and these problems are mapped into (), ,(), then the inequality

    holds.

  4. There are infinite many pairs (P), (Q) such that an optimal solution of (Q) is also optimal in (P).

24.2.2. 24.2.2 The general frame of the B&B method

As the Reader has already certainly observed B&B divides the problem into subproblems and tries to fathom each subproblem by the help of a relaxation. A subproblem is fathomed in one of the following cases:

  1. The optimal solution of the relaxed subproblem satisfies the constraints of the unrelaxed subproblem and its relaxed and non-relaxed objective function values are equal.

  2. The infeasibility of the relaxed subproblem implies that the unrelaxed subproblem is infeasible as well.

  3. The upper bound provided by the relaxed subproblem is less (in the case if alternative optimal solution are sought) or less or equal (if no alternative optimal solution is requested) than the objective function value of the best known feasible solution.

The algorithm can stop if all subsets (branches) are fathomed. If nonlinear programming problems are solved by B&B then the finiteness of the algorithm cannot be always guaranteed.

In a typical iteration the algorithm executes the following steps.

  • It selects a leaf of the branching tree, i.e. a subproblem not divided yet into further subproblems.

  • The subproblem is divided into further subproblems (branches) and their relaxations are defined.

  • Each new relaxed subproblem is solved and checked if it belongs to one of the above-mentioned cases. If so then it is fathomed and no further investigation is needed. If not then it must be stored for further branching.

  • If a new feasible solution is found which is better than the so far best one, then even stored branches having an upper bound less than the value of the new best feasible solution can be deleted without further investigation.

In what follows it is supposed that the relaxation satisfies definition 24.5.

The original problem to be solved is

(24.14)                                                                                                   

(24.15)                                                                                                   

(24.16)                                                                                                   

Thus the set of the feasible solutions is

The relaxed problem satisfying the requirements of definition 24.5 is

where and for all points of the domain of the objective functions and for all points of the domain of the constraint functions . Thus the set of the feasible solutions of the relaxation is

Let be a previously defined subset. Suppose that it is divided into the subsets , i.e.

Let and be the feasible sets of the relaxed subproblems. To satisfy the requirement (24.27) of definition 24.5 it is assumed that

The subproblems are identified by their sets of feasible solutions. The unfathomed subproblems are stored in a list. The algorithm selects a subproblem from the list for further branching. In the formal description of the general frame of B&B the following notations are used.

Note that . The frame of the algorithms can be found below. It simply describes the basic ideas of the method and does not contain any tool of acceleration.

Branch-and-Bound

  1     2     3     4  
                     WHILE
                   
                     5    
                     DO
                   determination of    6          7       determination of    8       determination of branching    9       
                     FOR
                   
                   
                  
                     TO
                   
                   
                  
                     DO
                    10            11          calculation of   12          
                     IF
                   
                    13             
                     THEN
                   
                  
                     IF
                   
                    14                
                     THEN
                   
                    15             
                     ELSE
                   
                    16      17    
                     FOR
                   
                   
                  
                     TO
                   
                   
                  
                     DO
                    18       
                     IF
                   
                    19          
                     THEN
                   
                    20  
                     RETURN
                   
               

The operations in rows 5, 7, 8, and 11 depend on the particular problem class and on the skills of the designer of the algorithm. The relaxed subproblem is solved in row 11. A detailed example is discussed in the next section. The handling of the list needs also careful consideration. Section 24.4 is devoted to this topic.

The loop in rows 17 and 18 can be executed in an implicit way. If the selected subproblem in row 5 has a low upper bound, i.e. then the subproblem is fathomed and a new subproblem is selected.

However the most important issue is the number of required operations including the finiteness of the algorithm. The method is not necessarily finite. Especially nonlinear programming has infinite versions of it. Infinite loop may occur even in the case if the number of the feasible solutions is finite. The problem can be caused by an incautious branching procedure. A branch can belong to an empty set. Assume that that the branching procedure generates subsets from such that one of the subsets is equal to and the other ones are empty sets. Thus there is an index i such that

If the same situation is repeated at the branching of then an infinite loop is possible.

Assume that a zero-one optimization problem of variables is solved by B&B and the branching is made always according to the two values of a free variable. Generally it is not known that how large is the number of the feasible solutions. There are at most feasible solutions as it is the number of the zero-one vectors. After the first branching there are at most feasible solutions in the two first level leaves, each. This number is halved with each branching, i.e. in a branch on level there are at most feasible solutions. It implies that on level there is at most feasible solution. As a matter of fact on that level there is exactly 1 zero-one vector and it is possible to decide whether or not it is feasible. Hence after generating all branches on level the problem can be solved. This idea is generalized in the following finiteness theorem. While formulating the statement the previous notations are used.

Theorem 24.6 Assume that

(i) The set is finite.

(ii) There is a finite set such that the following conditions are satisfied. If a subset is generated in the course of the branch and bound method then there is a subset of such that . Furthermore if the branching procedure creates the cover then has a partitioning such that

and moreover

(iii) If a set belonging to set has only a single element then the relaxed subproblem solves the unrelaxed subproblem as well.

Then the Branch-and-Bound procedure stops after finite many steps. If then there is no feasible solution. Otherwise is equal to the optimal objective function value.

Proof. Assume that the procedure Branch-and-Bound executes infinite many steps. As the set is finite it follows that there is at least one subset of say such that it defines infinite many branches implying that the situation described in (24.29) occurs infinite many times. Hence there is an infinite sequence of indices, say , such that is created at the branching of and . On the other hand the parallel sequence of the sets must satisfy the inequalities

It is impossible because the s are finite sets.

The finiteness of implies that optimal solution exists if and only if is nonempty, i.e. the problem cannot be unbounded and if feasible solution exist then the supremum of the objective function is its maximum. The initial value of is . It can be changed only in row 14 of the algorithm and if it is changed then it equals to the objective function value of a feasible solution. Thus if there is no feasible solution then it remains . Hence if the second half of the statement is not true, then at the end of the algorithm equal the objective function value of a non-optimal feasible solution or it remains .

Let be the maximal index such that still contains the optimal solution. Then

Hence it is not possible that the branch containing the optimal solution has been deleted from the list in the loop of rows 17 and 18, as . It is also sure that the subproblem

has not been solved, otherwise the equation should hold. Then only one option remained that was selected for branching once in the course of the algorithm. The optimal solution must be contained in one of its subsets, say which contradicts the assumption that has the highest index among the branches containing the optimal solution.

Remark. Notice that the binary problems mentioned above with 's of type

where is the set of fixed variables and is a fixed value, satisfy the conditions of the theorem.

If an optimization problem contains only bounded integer variables then the sets s are the sets the integer vectors in certain boxes. In the case of some scheduling problems where the optimal order of tasks is to be determined even the relaxations have combinatorial nature because they consist of permutations. Then is also possible. In both of the cases Condition (iii) of the theorem is fulfilled in a natural way.

Exercises

24.2-1 Decide if the Knapsack Problem can be a relaxation of the Linear Binary Optimization Problem in the sense of Definition 24.5. Explain your solution regardless that your answer is YES or NO.