2.4. Properties of Regular Languages

In the next part of this section we concentrate on the closure properties of the class of regular languages.

2.4.1. Closure Properties

By the constructive proof of Theorem 3, it is also shown that the class of regular languages is closed under the regular operations. Now let us consider the set theoretical operations: intersection and complement.

Theorem 7. The class of regular languages is closed under intersection and complement.

Proof. The proof is constructive in both cases, and deterministic finite automata are used. Let us start with the complement. Let a regular language be given by a complete deterministic finite automaton A = (Q, T, q0, δ, F) that recognizes it. This automaton has exactly one run for every word of T*, and accepts a word if this run is finished in an accepting state. Then recognize exactly those words that are not accepted by A, and thus the finite automaton accepts the complement of the original regular language.

For the intersection, let two regular languages L1 and L2 be given by complete deterministic automata A = (Q, T, q0, δ,F) and A' = (Q', T, q'0, δ', F') that recognize them, respectively. Then, let A = (Q × Q', T, (q0, q'0), δ'', F× F'), with transition function δ'' ((q,q'), a) = (δ(q,a), δ' (q',a)) for every qQ, q'Q' and aT. The states are formed by pairs of the states of the automata A and A'. Thus, A simulates the work of these two automata simultaneously and accepts exactly those words that are accepted by both of these machines. Thus the intersection of the languages L1 and L2 is accepted by a finite automaton, and thus it is also a regular language.

QED.

Example 28. Let the automaton A and A' accept the languages L(A) and L(A'), respectively. Let them be defined in the following way: the table of A as shown below.

T Qq1q2q3
aq3q2q3
bq2q2q3

the table of A' as shown below

T Q'q'1q'2
aq'1q'1
bq'2q'2

Give an automaton that accepts the complement of L(A) and an automaton that accepts the intersection of L(A) and L(A'). What are the languages accepted by these automata?

Solution:

Let us take the automaton that accepts the complement of L(A) by interchanging the role of accepting and non-accepting states in A. Let be defined by the following Cayley table:

T Q→⊂q1q2q3
aq3q2q3
bq2q2q3

Now let us construct A by using the Cartesian product of the sets of states Q and Q'.

T Q→(q1, q'1)(q2, q'1)(q3, q'1)(q1, q'2)(q2, q'2)⊂(q3, q'2)⊃
a(q3, q'1)(q2, q'1)(q3, q'1)(q3, q'1)(q2, q'1)(q3, q'1)
b(q2, q'2)(q2, q'2)(q3, q'2)(q2, q'2)(q2, q'2)(q3, q'2)

Actually, A accepts the language a(a+b)* (words starting by letter a), and A' accepts the language (a+b)* b (words that ends with letter b), the automaton accepts the language λ + b(a+b) * (words do not start with letter a over the alphabet {a,b}), while A accepts the language a (a+b)* b (words starting with a and ending with b).

Exercise 36. Let the table of A be

T Qq1q2q3
0q1q3q2
1q2q2q2

and the table of A' be

T Q'q'1q'2q'3
0q'3q'2q'3
1q'2q'1q'2

Give an automaton that accepts the intersection of the languages accepted by A and A'.

Exercise 37. Let the language L(A) be defined as the accepted language of the automaton A as it is shown in Figure 2.10.

2.10. ábra - The graph of the automaton of Exercise 37.

The graph of the automaton of Exercise 37.

Give an automaton that accepts the complement of L(A). (Hint: first the equivalent completely defined deterministic automaton must be obtained.)

Exercise 38. Let the table of A be

T Q'q1q2q3q4
aq1q3q1q1
bq1q4q2q2
cq2q4q4q3

and the table of A' be

T Q'→⊂q'1q'2
aq'1q'2
bq'2q'1
cq'2q'1

They accept the languages L(A) and L(A'), respectively.

Give automata that accept the complement of L(A) and L(A'). Give an automaton that accepts L(A) ∩ L(A').

2.4.2. Myhill-Nerode theorem

In the next part we show an if and only if characterization of the class of regular languages.

Let us define congruence relations on T* with the following property: for every u, v, wT* if uv, then uwvw. These relations are called right-congruences. A congruence relation is of finite index, if the number of classes of T* is finite.

Theorem 8. (Myhill-Nerode theorem). A language over the alphabet T is regular if and only if there is a finite index right-congruence relation on T* such that the language is obtained as (the union of) some of the classes induced by the relation.

Let a language L be given. Roughly speaking two words, u and v are equivalent if their every possible continuation w behaves in the same manner, i.e., uwL if and only if vwL. If this equivalence relation induces finitely many classes on T*, then, and only then, L is regular.

Actually, this fact is also related to the minimal, completely defined, accepting deterministic finite automaton (that uniquely identifies the given regular language): the partitions of T* can be assigned to the states of the minimal automaton: the partition containing the empty word λ is assigned to the initial state; those partitions that contain words such that their empty continuation is in L are assigned to the accepting states. The transitions can also be easily defined by using the partitions, by checking in which partition the words are that are given by the previous one by extending it with exactly one letter.

Now, we are going to give an example for how this theorem can be used to show that some of the languages are not regular.

Example 29. Let us analyze language L = {an bnn ∈ ℕ}. Let us partition the words of a* into classes: assume that ak and am are in the same class, i.e., ak ∼ am. Then, for each possible continuation (w ∈ {a,b}*) they behave in the same manner, e.g., for bk the word akbkL and thus, ambkL also. But it can only be if m = k, and so every element of a* is equivalent to only itself and not to any other element of this set. Therefore, language L induces an infinite index right-congruence relation, thus L is not regular.

Exercise 39. Show that the language containing exactly the words having the same number of 0's and 1's (over the alphabet {0,1}) is not regular.

(Apply the Myhill-Nerode theorem.)

Exercise 40. Show that the language L = {an2n ∈ ℕ} (over the alphabet {a}), i.e., the unary words having square number length, is not regular.

(Apply the Myhill-Nerode theorem.)