In many situations in engineering and economy there are cases when the conflicting interests of several decision makers have to be taken into account simultaneously, and the outcome of the situation depends on the actions of these decision makers. One of the most popular methodology and modeling is based on game theory.
Let denote the number of decision makers (who will be called players), and for each let be the set of all feasible actions of player . The elements are called strategies of player , is the strategy set of this player. In any realization of the game each player selects a strategy, then the vector is called a simultaneous strategy vector of the players. For each each player has an outcome which is assumed to be a real value. This value can be imagined as the utility function value of the particular outcome, in which this function represents how player evaluates the outcomes of the game. If denotes this value, then is called the payoff function of player . The value is called the payoff of player and is called the payoff vector. The number of players, the sets of strategies and the payoff functions completely determine and define the -person game. We will also use the notation for this game.
The solution of game is the Nash-equilibrium, which is a simultaneous strategy vector such that for all ,
Condition means that the -th component of the equilibrium is a feasible strategy of player , and condition shows that none of the players can increase its payoff by unilaterally changing its strategy. In other words, it is the interest of all players to keep the equilibrium since if any player departs from the equilibrium, its payoff does not increase.
Game is called finite if the number of players is finite and all strategy sets contain finitely many strategies. The most famous two-person finite game is the prisoner's dilemma, which is the following.
Example 10.1 The players are two prisoners who committed a serious crime, but the prosecutor has only insufficient evidence to prosecute them. The prisoners are held in separate cells and cannot communicate, and the prosecutor wants them to cooperate with the authorities in order to get the needed additional information. So , and the strategy sets for both players have two elements: cooperating (), or not cooperating (). It is told to both prisoners privately that if he is the only one to confess, then he will get only a light sentence of year, while the other will go to prison for a period of 10 years. If both confess, then their reward will be a 5 year prison sentence each, and if none of them confesses, then they will be convicted to a less severe crime with sentence of 2 years each. The objective of both players are to minimize the time spent in prison, or equivalently to maximize its negative. Figure 10.1 shows the payoff values, where the rows correspond to the strategies of player , the columns show the strategies of player , and for each strategy pair the first number is the payoff of player , and the second number is the payoff of player . Comparing the payoff values, it is clear that only can be equilibrium, since
The strategy pair is really an equilibrium, since
In this case we have a unique equilibrium.
The existence of an equilibrium is not guaranteed in general, and if equilibrium exists, it might not be unique.
If all payoff values are identical, then we have multiple equilibria: any strategy pair is an equilibrium.
Let denote the number of players, and for the sake of notational convenience let denote the feasible strategies of player . That is, . A strategy vector is an equilibrium if and only if for all and ,
In applying the enumeration method, inequality (10.2) is checked for all possible strategy -tuples to see if (10.2) holds for all and . If it does, then is an equilibrium, otherwise not. If during the process of checking for a particular we find a and such that (10.2) is violated, then is not an equilibrium and we can omit checking further values of and . This algorithm is very simple, it consists of imbedded loops with variables and .
The maximum number of comparisons needed equals
however in practical cases it might be much lower, since if (10.2) is violated with some , then the comparison must stop for the same strategy vector.
The algorithm can formally be given as follows:
IF(10.2) fails 9
THENand go to 10 10
THEN() is equilibrium 12
Consider next the two-person case, , and introduce the real matrixes and with elements and respectively. Matrixes and are called the payoff matrixes of the two players. A strategy vector is an equilibrium if and only if the element in matrix is the largest in its column, and in matrix it is the largest in its row. In the case when , the game is called zero-sum, and , so the game can be completely described by the payoff matrix of the first player. In this special case a strategy vector is an equilibrium if and only if the element is the largest in its column and smallest in its row. In the zero-sum cases the equilibria are also called the saddle points of the games. Clearly, the enumeration method to find equilibria becomes more simple since we have to deal with a single matrix only.
The simplified algorithm is as follows:
THEN7 go to 12 8
THEN11 go to 12 12
Many finite games have the common feature that they can be represented by a finite directed tree with the following properties:
there is a unique root of the tree (which is not the endpoint of any arc), and the game starts at this node;
to each node of the tree a player is assigned and if the game reaches this node at any time, then this player will decide on the continuation of the game by selecting an arc originating from this node. Then the game moves to the endpoint of the chosen arc;
to each terminal node (in which no arc originates) an -dimensional real vector is assigned which gives the payoff values for the players if the game terminates at this node;
each player knows the tree, the nodes he is assigned to, and all payoff values at the terminal nodes.
For example, the chess-game satisfies the above properties in which , the nodes of the tree are all possible configurations on the chessboard twice: once with the white player and once with the black player assigned to it. The arcs represent all possible moves of the assigned player from the originating configurations. The endpoints are those configurations in which the game terminates. The payoff values are from the set where means win, represents loss, and shows that the game ends with a tie.
Proof. We present the proof of this result here, since it suggests a practical algorithm to find equilibria. The proof goes by induction with respect to the number of nodes of the game tree. If the game has only one node, then clearly it is the only equilibrium.
Assume next that the theorem holds for any tree with less than nodes (), and consider a game with nodes. Let be the root of the tree and let () be the nodes connected to by an arc. If denote the disjoint subtrees of with roots , then each subtree has less than nodes, so each of them has an equilibrium. Assume that player is assigned to . Let be the equilibrium payoffs of player on the subtrees and let . Then player will move to node from the root, and then the equilibrium continues with the equilibrium obtained on the subtree . We note that not all equilibria can be obtained by this method, however the payoff vectors of all equilibria, which can obtained by this method, are identical.
We note that not all equilibria can be obtained by this method, however the payoff vectors of all equilibria, which can be obtained by this method, are identical.
The proof of the theorem suggests a dynamic programming-type algorithm which is called backward induction. It can be extended to the more general case when the tree has chance nodes from which the continuations of the game are random according to given discrete distributions.
The solution algorithm can be formally presented as follows. Assume that the nodes are numbered so each arc connects nodes and only for . The root has to get the smallest number , and the largest number is given to one of the terminal nodes. For each node let denote the set of all nodes such that there is an arc from to . For each terminal node , is empty, and let denote the payoff vector associated to this node. And finally we will denote player assigned to node by for all . The algorithm starts at the last node and moves backward in the order and . Node is an endpoint, so vector has been already assigned. If in the process the next node is an endpoint, then is already given, otherwise we find the largest among the values , . Assume that the maximal value occurs at node , then we assign to node , and move to node . After all payoff vectors and are determined, then vector gives the equilibrium payoffs and the equilibrium path is obtained by nodes:
until an endpoint is reached, when the equilibrium path terminates.
At each node the number of comparisons equals the number of arcs starting at that node minus . Therefore the total number of comparisons in the algorithm is the total number of arcs minus the number of nodes.
This algorithm can be formally given as follows:
DO3 4 print sequence until an endpoint is reached
Example 10.3 Figure 10.3 shows a finite tree. In the circle at each nonterminal node we indicate the player assigned to that node. The payoff vectors are also shown at all terminal nodes. We have three players, so the payoff vectors have three elements.
First we number the nodes such that the beginning of each arc has a smaller number than its endpoint. We indicated these numbers in a box under each node. All nodes for are terminal nodes, as we start the backward induction with node . Since player is assigned to this node we have to compare the third components of the payoff vectors and associated to the endpoints of the two arcs originating from node . Since , player will select the arc to node as his best choice. Hence , and . Then we check node . By comparing the third components of vectors and it is clear that player will select node , so , and . In the graph we also indicated the choices of the players by thicker arcs. Continuing the procedure in the same way for nodes we finally obtain the payoff vector and equilibrium path .
10.1-1 An entrepreneur (E) enters to a market, which is controlled by a chain store (C). Their competition is a two-person game. The strategies of the chain store are soft (S), when it allows the competitor to operate or tough (T), when it tries to drive out the competitor. The strategies of the entrepreneur are staying in (I) or leaving (L) the market. The payoff tables of the two player are assumed to be
Find the equilibrium.
10.1-2 A salesman sells an equipment to a buyer, which has 3 parts, under the following conditions. If all parts are good, then the customer pays $ to the salesman, otherwise the salesman has to pay $ to the customer. Before selling the equipment, the salesman is able to check any one or more of the parts, but checking any one costs him $ . Consider a two-person game in which player is the salesman with strategies (how many parts he checks before selling the equipment), and player is the equipment with strategies (how many parts are defective). Show that the payoff matrix of player is given as below when we assume that the different parts can be defective with equal probability.
10.1-3 Assume that in the previous problem the payoff of the second player is the negative of the payoff of the salesman. Give a complete description of the number of equilibria as a function of the parameter values . Determine the equilibria in all cases.
10.1-4 Assume that the payoff function of the equipment is its value ( if all parts are good, and zero otherwise) in the previous exercise. Is there an equilibrium point?
Find the equilibrium with backward induction.
10.1-6 Show that in the one-player case backward induction reduces to the classical dynamic programming method.
10.1-7 Assume that in the tree of a game some nodes are so called “chance nodes” from which the game continuous with given probabilities assigned to the possible next nodes. Show the existence of the equilibrium for this more general case.
10.1-8 Consider the tree given in Figure 10.3, and double the payoff values of player , change the sign of the payoff values of player , and do not change those for Player . Find the equilibrium of this new game.