In the next part of this section we concentrate on the closure properties of the class of regular languages.
By the constructive proof of Theorem 3, it is also shown that the class of regular languages is closed under the regular operations. Now let us consider the set theoretical operations: intersection and complement.
Theorem 7. The class of regular languages is closed under intersection and complement.
Proof. The proof is constructive in both cases, and deterministic finite automata are used.
Let us start with the complement. Let a regular language be given by a complete deterministic
finite automaton A = (Q, T, q0, δ, F) that recognizes it.
This automaton has exactly one run for
every word of T*, and accepts a word if this run is finished in an accepting state.
Then recognize exactly those words that are not accepted
by A, and thus the finite automaton
accepts the complement of the original regular language.
For the intersection, let two regular languages L1 and L2 be given by complete deterministic automata A = (Q, T, q0, δ,F) and A' = (Q', T, q'0, δ', F') that recognize them, respectively. Then, let A∩ = (Q × Q', T, (q0, q'0), δ'', F× F'), with transition function δ'' ((q,q'), a) = (δ(q,a), δ' (q',a)) for every q ∈ Q, q' ∈ Q' and a ∈ T. The states are formed by pairs of the states of the automata A and A'. Thus, A∩ simulates the work of these two automata simultaneously and accepts exactly those words that are accepted by both of these machines. Thus the intersection of the languages L1 and L2 is accepted by a finite automaton, and thus it is also a regular language.
QED.
Example 28. Let the automaton A and A' accept the languages L(A) and L(A'), respectively. Let them be defined in the following way: the table of A as shown below.
T Q | →q1 | q2 | ⊂q3⊃ |
---|---|---|---|
a | q3 | q2 | q3 |
b | q2 | q2 | q3 |
the table of A' as shown below
T Q' | →q'1 | ⊂q'2⊃ |
---|---|---|
a | q'1 | q'1 |
b | q'2 | q'2 |
Give an automaton that accepts the complement of L(A) and an automaton that accepts the intersection of L(A) and L(A'). What are the languages accepted by these automata?
Solution:
Let us take the automaton that accepts the complement of L(A) by interchanging the role of
accepting and non-accepting states in A. Let
be defined by the following Cayley table:
T Q | →⊂q1⊃ | ⊂q2⊃ | q3 |
---|---|---|---|
a | q3 | q2 | q3 |
b | q2 | q2 | q3 |
Now let us construct A∩ by using the Cartesian product of the sets of states Q and Q'.
T Q | →(q1, q'1) | (q2, q'1) | (q3, q'1) | (q1, q'2) | (q2, q'2) | ⊂(q3, q'2)⊃ |
---|---|---|---|---|---|---|
a | (q3, q'1) | (q2, q'1) | (q3, q'1) | (q3, q'1) | (q2, q'1) | (q3, q'1) |
b | (q2, q'2) | (q2, q'2) | (q3, q'2) | (q2, q'2) | (q2, q'2) | (q3, q'2) |
Actually, A accepts the language a(a+b)* (words starting by letter a),
and A' accepts the language (a+b)* b
(words that ends with letter b), the automaton
accepts the language λ + b(a+b)
* (words do not start with letter a over the alphabet {a,b}),
while A∩ accepts the
language a (a+b)* b (words starting with a and ending with b).
Exercise 36. Let the table of A be
T Q | →q1 | q2 | ⊂q3⊃ |
---|---|---|---|
0 | q1 | q3 | q2 |
1 | q2 | q2 | q2 |
and the table of A' be
T Q' | →q'1 | ⊂q'2⊃ | ⊂q'3⊃ |
---|---|---|---|
0 | q'3 | q'2 | q'3 |
1 | q'2 | q'1 | q'2 |
Give an automaton that accepts the intersection of the languages accepted by A and A'.
Exercise 37. Let the language L(A) be defined as the accepted language of the automaton A as it is shown in Figure 2.10.
Give an automaton that accepts the complement of L(A). (Hint: first the equivalent completely defined deterministic automaton must be obtained.)
Exercise 38. Let the table of A be
T Q' | →q1 | q2 | ⊂q3⊃ | ⊂q4⊃ |
---|---|---|---|---|
a | q1 | q3 | q1 | q1 |
b | q1 | q4 | q2 | q2 |
c | q2 | q4 | q4 | q3 |
and the table of A' be
T Q' | →⊂q'1⊃ | q'2 |
---|---|---|
a | q'1 | q'2 |
b | q'2 | q'1 |
c | q'2 | q'1 |
They accept the languages L(A) and L(A'), respectively.
Give automata that accept the complement of L(A) and L(A'). Give an automaton that accepts L(A) ∩ L(A').
In the next part we show an if and only if characterization of the class of regular languages.
Let us define congruence relations on T* with the following property: for every u, v, w ∈ T* if u ∼ v, then uw ∼ vw. These relations are called right-congruences. A congruence relation is of finite index, if the number of classes of T* is finite.
Theorem 8. (Myhill-Nerode theorem). A language over the alphabet T is regular if and only if there is a finite index right-congruence relation on T* such that the language is obtained as (the union of) some of the classes induced by the relation.
Let a language L be given. Roughly speaking two words, u and v are equivalent if their every possible continuation w behaves in the same manner, i.e., uw ∈ L if and only if vw ∈ L. If this equivalence relation induces finitely many classes on T*, then, and only then, L is regular.
Actually, this fact is also related to the minimal, completely defined, accepting deterministic finite automaton (that uniquely identifies the given regular language): the partitions of T* can be assigned to the states of the minimal automaton: the partition containing the empty word λ is assigned to the initial state; those partitions that contain words such that their empty continuation is in L are assigned to the accepting states. The transitions can also be easily defined by using the partitions, by checking in which partition the words are that are given by the previous one by extending it with exactly one letter.
Now, we are going to give an example for how this theorem can be used to show that some of the languages are not regular.
Example 29. Let us analyze language L = {an bn∣n ∈ ℕ}. Let us partition the words of a* into classes: assume that ak and am are in the same class, i.e., ak ∼ am. Then, for each possible continuation (w ∈ {a,b}*) they behave in the same manner, e.g., for bk the word akbk ∈ L and thus, ambk ∈ L also. But it can only be if m = k, and so every element of a* is equivalent to only itself and not to any other element of this set. Therefore, language L induces an infinite index right-congruence relation, thus L is not regular.
Exercise 39. Show that the language containing exactly the words having the same number of 0's and 1's (over the alphabet {0,1}) is not regular.
(Apply the Myhill-Nerode theorem.)
Exercise 40. Show that the language L = {an2∣n ∈ ℕ} (over the alphabet {a}), i.e., the unary words having square number length, is not regular.
(Apply the Myhill-Nerode theorem.)