# 6. fejezet - Recursively Enumerable Languages and Turing Machines

Summary of the chapter: In this chapter, we discuss the most universal language class, the class of recursively enumerable languages, and the most universal automaton, the Turing machine. In the first section, we investigate the recursive and the recursively enumerable languages, and their closure and other properties. The second and third section is dedicated to the Turing machine. We will show two different applications. First, we are going to use the Turing machine as a universal language acceptor, then we show how we can use it as a simple but universal computing device.

## 6.1. Recursive and Recursively Enumerable Languages

At the beginning of this chapter we introduce the recursive languages and the recursively enumerable languages. These two language classes are fundamental in computability theory. There are many equivalent definitions, however, we are going to use these two definitions now, and we are going to show how these definitions fit for the concept of the Turing machine later.

Definition 30. The language LV* is recursive, if there is an algorithm, which decides about any word pV*, whether or not pL.

We can say a language L is recursive, if the word problem can be solved in L. In Chapter 1. we define the class of the recursively enumerable languages as languages which can be generated by unrestricted generative grammars. Now, we give another definition.

Definition 31. The language LV* is recursively enumerable, if there is a procedure, which specifies all the elements of L.

The two definitions of the recursively enumerable languages are equivalent. If there is a procedure, which specifies all elements of the language L, then there is a generative grammar which generates the language L, and if there is a generative grammar generating the language L, then this grammar itself is a procedure, which can be used to specify all elements of the language L.

It is obvious that each recursive language is a recursively enumerable, because we can list the words over an alphabet V, and select those words which are contained by the language L.

Theorem 31. The language L is recursive, if and only if both L and are recursively enumerable.

Proof. First, we show that, if LV* and are recursively enumerables, then L - and also - is recursive. The language L is recursively enumerable, so there is a procedure which lists the elements of L. Let us denote these words p1, p2,.... However, is recursively enumerable as well, so we have another procedure which lists the elements of . Let us denote these words r1,r2,.... Now we can combine these two procedures, to use the first one, and then the second one, alternately. What we receive is the list of all words over the alphabet V: p1, r1, p2, r2,... and we know about each one if it belongs to the language L or not.

Now, we show that if L is recursive, then L and is recursively enumerable. We already mentioned that recursive languages are recursively enumerable, because we can list all the words over an alphabet V, and add the current word to the language if it is in L. The same algorithm can be used for the language , we can list the words again, and we add them to the language if they are not in the language L.

QED.

The following theorem shows the connection between the context-sensitive and recursive languages.

Theorem 32. Each context-sensitive language is recursive, but there are recursive languages which are not context-sensitive.

Proof. The first part of the theorem states that the word problem can be solved for each context-sensitive language. This was shown already in Section 5.3.2.

For the second part, let us create the list of each possible context-sensitive generative grammar Gi = (Ni, Ti, Si, Pi), which generates numbers. (The set of the terminal letters of each grammar Gi are digits, so Ti = {0,1,2,3,4,5,6,7,8,9} for each Gi.) Now, we define language L, which contains the sequential numbers of grammars whose generated language does not contain its own sequential number (position in the list): L = {iiL(Gi)}.

We can create a list of all context-sensitive generative grammars which generates numbers, and we can decide whether or not a context-sensitive grammar generates its position in the list, so language L is recursive.

Now, suppose to the contrary that language L is context-sensitive. In this case, there is a context-sensitive grammar Gk, such that L(Gk) = L. Then, by the definition of L, if kL(Gk), then kL is a contradiction, and if kL(Gk), then kL is also a contradiction. The only possible solution is that language L is not context-sensitive.

QED.

The next theorem shows that there are languages which are not in the class of recursively enumerable languages, so the recursively enumerable language class does not contain all possible languages. The concept of the proof is similar to the previous proof.

Theorem 33. There exists a language L, which is not recursively enumerable.

Proof. Let us create the list of each generative grammar Gi = (Ni, Ti, Si, Pi) which generates numbers. (The set of the terminals of each grammar Gi are numbers: Ti = {0,1,2,3,4,5,6,7,8,9} for each grammar Gi). We have to note that it is easy to create an ordered list of all possible generative grammars which generates numbers. Now, we define language L, which contains the numbers of the grammars which does not generate the number of its position in the list: L = {iiL(Gi)}.

Now, suppose to the contrary that the language L is recursively enumerable. In this case, there is a generative grammar Gk, such that L(Gk) = L. Then, by the definition of L, if kL(Gk), then kL is a contradiction, and if kL(Gk), then kL is also a contradiction. The only possible solution is that language L is not recursively enumerable.

QED.

We have already shown that the class of context-sensitive languages is a proper subset of the class of recursive languages. It has also been proven that the class of all languages is a proper superset of the class of recursively enumerable languages. Finally, we note that the complementer language of language L defined in the proof of Theorem 33. is recursively enumerable, but not recursive. We can summarize our results in the following formula:

CSRREAL

where CS stands for context-sensitive languages, R stands for recursive languages, RE stands for recursively enumerable languages, and AL stands for all languages.

### 6.1.1. Closure Properties

Theorem 34. The class of recursive languages is closed under complement operation.

Proof. Theorem 31. states that language L is recursive, if and only if L and are both recursively enumerable. However, we can apply the same theorem for the language , and what we receive as a result is that is recursive, if and L are both recursively enumerable, which is the same condition, so if L is recursive, then is recursive, as well.

QED.

Theorem 35. The class of recursively enumerable languages is not closed under complement operation.

Proof. The proof of this theorem is easy, we need only one example, where the language itself is recursively enumerable, and the complement language is not recursively enumerable. In the proof of the Theorem 33. we have defined a language L which is not recursively enumerable. The complement language is recursively enumerable. Here we have a recursively enumerable language , whose complement is not recursively enumerable.

QED.

Theorem 36. The class of recursively enumerable languages is closed under regular operations.

Proof. We give a constructive proof here. Let the languages L1 and L2 be recursively enumerable. Let the grammar G1 = (N1, T, S1, P1) such that L(G1) = L1, and let the grammar G2 = (N2, T, S2, P2) such that L(G2) = L2. Without loss of generality we can suppose that N1N2 =∅, and the terminal symbols appear only in rules having the form Aa, where AN, aT. We define generative grammars GUn, and GCo, such that L(GUn) = L1L2, and L(GCo) = L1 · L2.

1. Union

Let S be a new start symbol, such that SN1 = SN2 = ST = ∅, and let

GUn = (N1N2 ∪ {S}, T, S, P1P2 ∪ {SS1, SS2}).

2. Concatenation

Let S be a new start symbol, such that SN1 = SN2 = ST = ∅, and let

GCo = (N1N2 ∪ {S}, T, S, P1P2 ∪ {SS1S2}).

3. Kleene star

In order to create a grammar generating the language L(GKl) = L1* we use two grammars. Let the grammar G1 = (N1, T, S1, P1), and let the grammar G2 = (N2, T, S2, P2) such that L(G1) = L(G2) = L1 \ {λ}, and N1N2 = ∅. Without loss of generality we can suppose that the terminal symbols appear only in rules having the form Aa, where AN, aT. For the grammar GKl we again use a new start symbol S, where SN1 = SN2 = ST = ∅. Then, let

GKl = (N1N2 ∪ {S}, T, S, P1P2 ∪ {S → λ, SS1, SS1S2S}).

QED.

Theorem 37. The class of recursively enumerable languages is closed under intersection.

Proof. By applying the definition of the recursively enumerable language, we can create a list of the elements of the recursively enumerable language L1 without repetitions, and yet another list, which contains the elements from the recursively enumerable language L2 without repetitions. Then, we can create a list, which contains one element from the list of the language L1, and then one element of the list of the language L2 alternately. We move an element from this combined list into the list of the L1L2, if the element appears twice.

QED.

Finally, we have to note that recursive languages are also closed under regular operations and intersection.

### 6.1.2. Normal Forms

Definition The grammar G = (N, T, S, P) is in Révész normal form, if all of its production rules has the following forms:

1. S → λ,

2. Aa,

3. AB,

4. ABC,

5. ABAC,

6. ABCB, or

7. ABB,

where A, B, CN and aT.

This normal form for unrestricted grammars was introduced by György Révész. Compared to the Kuroda normal form, the differences are the following:

• It allows the rule S → λ for grammars generating the empty word,

• instead of the rule ABCD it allows two rules, namely ABAC and ABCB,

• the only ,,really plus'' rule, which makes the difference between the monotone and unrestricted grammars is the last production rule: ABB.

As you can see, there is not a huge difference between the unrestricted grammars and the context-sensitive grammars. For generative grammars generating context-sensitive languages, only one production rule form is enough to be able to generate any recursively enumerable language.

Even more surprising results were proven by Viliam Geffert. His results put limitations not just to the form of the production rules, but also to the number of the nonterminal symbols. We introduce his results as theorems, without proofs.

Theorem 38. For each recursively enumerable language L we can give an unrestricted generative grammar G = (N, T, S, H) such that

• grammar G generates the language L, (L(G) = L),

• G has exactly 5 nonterminal symbols, (N = {S, A, B, C, D}), and

• each rule has a form:

• Sp where S is the start symbol, and p is a nonempty word, (p ∈ (NT)+),

• AB → λ, or

• CD → λ.

Theorem 39. For each recursively enumerable language L we can give an unrestricted generative grammar G = (N, T, S, H) such that

• grammar G generates the language L, (L(G) = L),

• G has exactly 4 nonterminal symbols, (N = {S, A, B, C}), and

• each rule has a form:

• Sp where S is the start symbol, and p is a nonempty word, (p ∈ (NT)+),

• AB → λ, or

• CC → λ.

Theorem 40. For each recursively enumerable language L we can give an unrestricted generative grammar G = (N, T, S, H) such that

• grammar G generates the language L, (L(G) = L),

• G has exactly 3 nonterminal symbols, (N = {S,A,B}), and

• each rule has a form:

• Sp where S is the start symbol, and p is a nonempty word, (p ∈ (NT)+),

• AA → λ, or

• BBB → λ.

Theorem 41. For each recursively enumerable language L we can give an unrestricted generative grammar G = (N, T, S, H) such that

• grammar G generates the language L, (L(G) = L),

• G has exactly 3 nonterminal symbols, (N = {S, A, B}), and

• each rule has a form:

• Sp where S is the start symbol, and p is a nonempty word, (p ∈ (NT)+), or

• ABBBA → λ.

Theorem 42. For each recursively enumerable language L we can give an unrestricted generative grammar G = (N, T, S, H) such that

• grammar G generates the language L, (L(G) = L),

• G has exactly 4 nonterminal symbols, (N = {S, A, B, C}), and

• each rule has a form:

• Sp where S is the start symbol, and p is a nonempty word, (p ∈ (NT)+), or

• ABC → λ.