7.5. 7.5 Interactive proof systems and zero-knowledge

7.5.1. 7.5.1 Interactive proof systems and Arthur-Merlin games

In Section 7.2, the “man-in-the-middle” attack on the Diffie-Hellman protocol was mentioned. The problem here is that Bob has not verified the true identity of his communication partner before executing the protocol. While he assumes to communicate with Alice, he in fact exchanges messages with Erich. In other words, Alice's task is to convince Bob of her true identity without any doubt. This cryptographic task is called authentication. Unlike digital signatures, whose purpose is to authenticate electronically transmitted documents such as emails, electronic contracts, etc., the goal now is to authenticate individuals such as human or computer parties participating in a cryptographic protocol.

Footnote. Here, an “individual” or a “party” is not necessarily a human being; it may also be a computer program that automatically executes a protocol with another computer program.

In order to authenticate herself, Alice might try to prove her identity by a secret information known to her alone, say by giving her PIN (“Personal Identifaction Number”) or any other private information that no one knows but her. However, there is a catch. To prove her identity, she would have to give her secret away to Bob. But then, it no longer is a secret! Bob, knowing her secret, might pretend to be Alice in another protocol he executes with Chris, a third party. So the question is how to prove knowledge of a secret without giving it away. This is what zero-knowledge is all about. Zero-knowledge protocols are special interactive proof systems, which were introduced by Goldwasser, Micali, and Rackoff and, independently, by Babai and Moran. Babai and Moran's notion (which is essentially equivalent to the interactive proof systems proposed by Goldwasser et al.) is known as Arthur-Merlin games, which we will now describe informally.

Merlin and Arthur wish to jointly solve a problem , i.e., they wish to jointly decide whether or not a given input belongs to . The mighty wizard Merlin is represented by an NP machine , and the impatient King Arthur is represented by a randomised polynomial-time Turing machine . To make their decision, they play the following game, where they are taking turns to make alternating moves. Merlin's intention is always to convince Arthur that belongs to (no matter whether or not that indeed is the case). Thus, each of Merlin's moves consists in presenting a proof for “ ”, which he obtains by simulating , where is the input and describes all previous moves in this game. That is, the string encodes all previous nondeterministic choices of and all previous random choices of .

King Arthur, however, does not trust Merlin. Of course, he cannot check the mighty wizard's proofs all alone; this task simply exceeds his computational power. But he knows Merlin well enough to express some doubts. So, he replies with a nifty challenge to Merlin by picking some details of his proofs at random and requiring certificates for them that he can verify. In order to satisfy Arthur, Merlin must convince him with overwhelming probability. Thus, each of Arthur's moves consists in the simulation of , where again is the input and describes the previous history of the game.

The idea of Arthur-Merlin games can be captured via alternating existential and probabilistic quantifiers, where the former formalise Merlin's NP computation and the latter formalise Arthur's randomised polynomial-time computation.

Footnote. This is similar to the well-known characterisation of the levels of the polynomial hierarchy via alternating and quantifiers, see Section 8.4 and in particular item 3 Theorem 8.11.

In this way, a hierarchy of complexity classes can be defined, the so-called Arthur-Merlin hierarchy. We here present only the class MA from this hierarchy, which corresponds to an Arthur-Merlin game with two moves, with Merlin moving first.

Definition 7.16 (MA in the Arthur-Merlin hierarchy) The class MA contains exactly those problems for which there exists an NP machine and a randomised polynomial-time Turing machine such that for each input :

If then there exists a path of such that accepts with probability at least (i.e., Arthur cannot refute Merlin's proof for “ ”, and Merlin thus wins).

If then for each path of , rejects with probability at least (i.e., Arthur is not taken in by Merlin's wrong proofs for “ ” and thus wins).

Analogously, the classes AM, MAM, AMA,... can be defined, see Exercise 7.5-1.

In Definition 7.16, the probability threshold of for Arthur to accept or to reject, respectively, is chosen at will and does not appear to be large enough at first glance. In fact, the probability of success can be amplified using standard techniques and can be made arbitrarily close to one. In other words, one might have chosen even a probability threshold as low as , for an arbitrary fixed constant , and would still have defined the same class. Furthermore, it is known that for a constant number of moves, the Arthur-Merlin hierarchy collapses down to AM:

It is an open question whether or not any of the inclusions is a strict one.

A similar model, which can be used as an alternative to the Arthur-Merlin games, are the interactive proof systems mentioned above. The two notions use different terminology: Merlin now is called the “prover” and Arthur the “verifier”. Also, their communication is not interpreted as a game but rather as a protocol. Another difference between the two models, which appears to be crucial at first, is that Arthur's random bits are public (so, in particular, Merlin knows them), whereas the random bits of the verifier in an interactive proof system are private. However, Goldwasser and Sipser [ 95 ] proved that, in fact, it does not matter whether the random bits are private or public, so Arthur-Merlin games essentially are equivalent to interactive proof systems.

If one allows a polynomial number of moves (instead of a constant number), then one obtains the complexity class IP. Note that interactive proof systems are also called IP protocols. By definition, IP contains all of NP. In particular, the graph isomorphism problem is in IP. We will see later that IP also contains problems from that are supposed to be not in NP. In particular, the proof of Theorem 8.16 shows that the complement of the graph isomorphism problem is in AM and thus in IP. A celebrated result by Shamir [ 232 ] says that IP equals PSPACE, the class of problems solvable in polynomial space.

Let us now turn back to the problem of authentication mentioned above, and to the related notion of zero-knowledge protocols. Here is the idea. Suppose Arthur and Merlin play one of their games. So, Merlin sends hard proofs to Arthur. Merlin alone knows how to get such proofs. Being a wise wizard, he keeps this knowledge secret. And he uses his secret to authenticate himself in the communication with Arthur.

Now suppose that malicious wizard Marvin wishes to fool Arthur by pretending to be Merlin. He disguises as Merlin and uses his magic to look just like him. However, he does not know Merlin's secret of how to produce hard proofs. His magic is no more powerful than that of an ordinary randomised polynomial-time Turing machine. Still, he seeks to simulate the communication between Merlin and Arthur. An interactive proof system has the zero-knowledge property if the information communicated between Marvin and Arthur cannot be distinguished from the information communicated between Merlin and Arthur. Note that Marvin, who does not know Merlin's secret, cannot introduce any information about this secret into the simulated IP protocol. Nonetheless, he is able to perfectly copy the original protocol, so no one can tell a difference. Hence, the (original) protocol itself must have the property that it does not leak any information whatsoever about Merlin's secret. If there is nothing to put in, there can be nothing to take out.

Definition 7.17 (Zero-knowledge protocol) Let be any set in IP, and let be an interactive proof system for , where is an NPTM and is a randomised polynomial-time Turing machine. The IP protocol is a zero-knowledge protocol for if and only if there exists a randomised polynomial-time Turing machine such that simulates the original protocol and, for each , the tuples and representing the information communicated in and in , respectively, are identically distributed over the random choices in and in , respectively.

The notion defined above is called “honest-verifier perfect zero-knowledge” in the literature, since (a) it is assumed that the verifier is honest (which may not necessarily be true in cryptographic applications though), and (b) it is required that the information communicated in the simulated protocol perfectly coincides with the information communicated in the original protocol. Assumption (a) may be somewhat too idealistic, and assumption (b) may be somewhat too strict. That is why also other variants of zero-knowledge protocols are studied, see the notes at the end of this chapter.

7.5.2. 7.5.2 Zero-knowledge protocol for graph isomorphism

Let us consider a concrete example now. As mentioned above, the graph isomorphism problem (, for short) is in NP, and the complementary problem is in AM, see the proof of Theorem 8.16. Thus, both problems are contained in IP. We now describe a zero-knowledge protocol for that is due to Goldreich, Micali, and Wigderson [ 91 ]. Figure 7.12 shows this IP protocol between the prover Merlin and the verifier Arthur.

Although there is no efficient algorithm known for , Merlin can solve this problem, since is in NP. However, there is no need for him to do so in the protocol. He can simply generate a large graph with vertices and a random permutation . Then, he computes the graph and makes the pair public. The isomorphism between and is kept secret as Merlin's private information.

Of course, Merlin cannot send to Arthur, since he does not want to give his secret away. Rather, to prove that the two graphs, and , indeed are isomorphic, Merlin randomly chooses an isomorphism under the uniform distribution and a bit and computes the graph . He then sends to Arthur whose response is a challenge for Merlin: Arthur sends a random bit , chosen under the uniform distribution, to Merlin and requests to see an isomorphism between and . Arthur accepts if and only if indeed satisfies .

Figure 7.12.  Goldreich, Micali, and Wigderson's zero-knowledge protocol for Goldreich, Micali, and Wigderson's zero-knowledge protocol for \mathtt{GI} ..

Goldreich, Micali, and Wigderson's zero-knowledge protocol for \mathtt{GI} .


The protocol works, since Merlin knows his secret isomorphism and his random permutation : It is no problem for Merlin to compute the isomorphism between and and thus to authenticate himself. The secret is not given away. Since and are isomorphic, Arthur accepts with probability one. The case of two nonisomorphic graphs does not need to be considered here, since Merlin has chosen isomorphic graphs and in the protocol; see also the proof of Theorem 8.16.

Now, suppose, Marvin wishes to pretend to be Merlin when communicating with Arthur. He does know the graphs and , but he doesn't know the secret isomorphism . Nonetheless, he tries to convince Arthur that he does know . If Arthur's random bit happens to be the same as his bit , to which Marvin committed before he sees , then Marvin wins. However, if , then computing or requires knowledge of . Since is not efficiently solvable (and even too hard for a randomised polynomial-time Turing machine), Marvin cannot determine the isomorphism for sufficiently large graphs and . But without knowing , all he can do is guess. His chances of hitting a bit with are at most . Of course, Marvin can always guess, so his success probability is exactly . If Arthur challenges him in independent rounds of this protocol again and again, the probability of Marvin's success will be only . Already for , this probability is negligibly small: Marvin's probability of success is then less than one in one million.

Figure 7.13.  Simulation of the zero-knowledge protocol for Simulation of the zero-knowledge protocol for \mathtt{GI} without knowing \pi . without knowing Simulation of the zero-knowledge protocol for \mathtt{GI} without knowing \pi ..

Simulation of the zero-knowledge protocol for \mathtt{GI} without knowing \pi .


It remains to show that the protocol from Figure 7.12 is a zero-knowledge protocol. Figure 7.13 shows a simulated protocol with Marvin who does not know Merlin's secret but pretends to know it. The information communicated in one round of the protocol has the form of a triple: . If Marvin is lucky enough to choose a random bit with , he can simply send and wins: Arthur (or any third party watching the communication) will not notice the fraud. On the other hand, if then Marvin's attempt to betray will be uncovered. However, that is no problem for the malicious wizard: He simply deletes this round from the protocol and repeats. Thus, he can produce a sequence of triples of the form that is indistinguishable from the corresponding sequence of triples in the original protocol between Merlin and Arthur. It follows that Goldreich, Micali, and Wigderson's protocol for is a zero-knowledge protocol.

Exercises

7.5-1 Arthur-Merlin hierarchy:

a. Analogously to MA from Definition 7.16, define the other classes AM, MAM, AMA, ... of the Arthur-Merlin hierarchy.

b. What is the inclusion structure between the classes MA, coMA, AM, coAM, and the classes of the polynomial hierarchy defined in Definition 8.10 of Subsection 8.4.1.

7.5-2 Zero-knowledge protocol for graph isomorphism:

a. Consider the graphs and from Example 7.4 in Section 7.1.3. Execute the zero-knowledge protocol from Figure 7.12 with the graphs and and the isomorphism . Use an isomorphism of your choice, and try all possibilities for the random bits and . Repeat this Arthur-Merlin game with an unknown isomorphism chosen by somebody else.

b. Modify the protocol from Figure 7.12 such that, with only one round of the protocol, Marvin's success probability is less than .

  PROBLEMS  

7-1 Arithmetics in the ring

Let and . We say is congruent to modulo (, for short) if and only if divides the difference . For example, and . The congruence modulo defines an equivalence relation on , i.e., it is reflexive (), symmetric (if then ), and transitive ( and imply ).

The set is the remainder class of . For example, the remainder class of is

We represent the remainder class of by the smallest natural number in . For instance, represents the remainder class of . The set of all remainder classes is . On , addition is defined by , and multiplication is defined by . For example, in the arithmetics modulo , we have and . Prove that in the arithmetics modulo :

a. is a commutative ring with one;

b. , which is defined in Example 7.3, is a multiplicative group;

c. is a field for each prime number .

d. Prove that the neutral element of a group and the inverse of each group element are unique.

e. Let be a commutative ring with one. Prove that the set of all invertible elements in forms a multiplicative group. Determine this group for the ring .

7-2 Tree isomorphism

The graph isomorphism problem can be solved efficiently on special graph classes, such as on the class of trees. An (undirected) tree is a connected graph without cycles. (A cycle is a sequence of pairwise incident edges that returns to the point of origin.) The leaves of a tree are the vertices with degree one. The tree isomorphism problem is defined by

Design an efficient algorithm for this problem.

Hint. Label the vertices of the given pair of trees successively by suitable number sequences. Compare the resulting sequences of labels in the single loops of the algorithm. Starting from the leaves of the trees and then working step by step towards the center of the trees, the algorithm halts as soon as all vertices are labelled. This algorithm can be found, e.g., in [ 150 ].

7-3 Computing the determinant

Design an efficient algorithm in pseudocode for computing the determinant of a matrix. Implement your algorithm in a programming language of your choice. Can the inverse of a matrix be computed efficiently?

7-4 Low-exponent attack

a. Consider the RSA cryptosystem from Figure 7.8. For the sake of efficiency, the public exponent has been popular. However, this choice is dangerous. Suppose Alice, Bob, and Chris encrypt the same message with the same public exponent , but perhaps with distinct moduli, , , and . Erich intercepts the resulting three ciphertexts: for . Then, Erich can easily decrypt the message . How?

Hint. Erich knows the Chinese remainder theorem, which also was useful in Exercise 7.3-1. A recommended value for the public exponent is , since its binary expansion has only two ones. Thus, the square-and-multiply algorithm runs fast for this .

b. The attack described above can be extended to ciphertexts that are related with each other as follows. Let and be known, , and suppose that messages are sent and are intercepted by Erich. Further, suppose that and . How can attacker Erich now determine the original message ?

Hint. Apply so-called lattice reduction techniques (see, e.g., Micciancio and Goldwasser [ 179 ]). The attack mentioned here is due to Hastad [ 109 ] and has been strengthened later by Coppersmith [ 53 ].

c. How can the attacks described above be prevented?

  CHAPTER NOTES  

Singh's book [ 241 ] gives a nice introduction to the history of cryptology, from its ancient roots to modern cryptosystems. For example, you can find out there about recent discoveries related to the development of RSA and Diffie-Hellman in the nonpublic sector. Ellis, Cocks, and Williamson from the Communications Electronics Security Group (CESG) of the British Government Communications Head Quarters (GCHQ) proposed the RSA system from Figure 7.8 and the Diffie-Hellman protocol from Figure 7.7 even earlier than Rivest, Shamir, and Adleman and at about the same time as but independent of Diffie and Hellman, respectively. RSA and Diffie-Hellman are described in probably every book about cryptography written since their invention. A more comprehensive list of attacks against RSA than that of Section 7.3 can be found in, e.g., [ 29 ], [ 136 ], [ 187 ], [ 217 ], [ 219 ], [ 233 ].

Primality tests such as the Miller-Rabin test and factoring algorithms are also described in many books, e.g., in [ 92 ], [ 219 ], [ 223 ], [ 249 ].

The notion of strongly noninvertible associative one-way functions, on which the secret-key agreement protocol from Figure 7.11 is based, is due to Rivest and Sherman. The modification of this protocol to a digital signature scheme is due to Rabi and Sherman. In their paper [ 208 ], they also proved that commutative, associative one-way function exist if and only if . However, the one-way functions they construct are neither total nor strongly noninvertible, even if is assumed. Hemaspaandra and Rothe [ 114 ] proved that total, strongly noninvertible, commutative, associative one-way functions exist if and only if . Further investigations on this topic can be found in [ 28 ], [ 111 ], [ 113 ], [ 116 ].

The notion of interactive proof systems and zero-knowledge protocols is due to Goldwasser, Micali, and Rackoff [ 94 ]. One of the best and most comprehensive sources on this field is Chapter 4 in Goldreich's book [ 92 ]; see also the books [ 152 ], [ 198 ], [ 219 ] and the surveys [ 93 ], [ 90 ], [ 217 ]. Arthur-Merlin games were introduced by Babai and Moran [ 19 ], [ 18 ] and have been investigated in many subsequent papers. Variants of the notion of zero-knowledge, which differ from the notion in Definition 7.17 in their technical details, are extensively discussed in, e.g., [ 92 ] and also in, e.g., [ 90 ], [ 93 ], [ 217 ].