Summary of the chapter:In this chapter, we deal with a family of languages between the regular and context-free languages of the Chomsky hierarchy, i.e., the linear languages. We give an example for a non-regular linear language. A normal form for linear grammars is proven. The class of one-turn pushdown automata recognizes exactly the class of linear languages. This class is closed under union, but it is not closed under concatenation, Kleene-star, complement and intersection.

First let us recall the definition of linear grammars.

**Definition 19.** (Linear grammars). *A grammar G* = (*N, T, S, P*) *is linear if
each of its productions has one of the following forms: A* → *u, A* → *uBv,
where A,B* ∈ *N, u,v* ∈ *T*^{*}. *The languages that can
be generated by linear grammars are the linear languages.*

This class of languages are between the (classes of) type 3 and type 2 languages of the Chomsky hierarchy, thus we may call them type 2.5 languages. The linear grammars inherit the property of the regular grammars that there is at most one nonterminal in any sentential form. However, this nonterminal is not restricted to be at the end of the sentential form (as it was in the regular case), it can be in an arbitrary place.

Now, we give an example, where the nonterminal is in the middle of the sentential forms in every derivation.

G= ({S}, {a,b},S, {S→aSb, S→λ}).

*Then, every (finished) derivation in this grammar has the following form: applying the first rule n times
(n* ∈ ℕ, *n* ≥ 0*)
and then applying the second rule. Hence, the generated language is*
{*a ^{n}b^{n}*∣

Remember that in Example 29. we have shown that this language is not regular, and thus, with Example 36 we have just proven that the class of linear languages strictly includes the class of regular languages.

**Theorem 11.** *Every linear language can be generated by a grammar having productions in the following forms, only:*

A→aB, A→Ba, A→a, S→λ.

*(We call a grammar with this property a linear grammar in normal form.)*

**Proof.** The proof is constructive. Given *G* = (*N, T, S, P*) a
linear grammar, we show how one can construct a grammar *G'* that is in normal form and equivalent to *G*.

If there is a (are) rule(s) of the form *A* → λ (*A* ≠ *S*),
then first the Empty-word lemma (Theorem 1.)
is applied and grammar *G''* = (*N'', T, S', P''*) is obtained that may only contain λ in the right hand side
of the rule *S'* → λ.
Then in *G''* either *S* = *S'*, *N''* = *N*
or *S'* ∉ *N* and in this latter case *N''* = *N* ∪ {*S'*}.

Now as an intermediate step, we replace each of the rules *A* → *uBv*
(where *u* ≠ λ, *v* ≠ λ) with rules
*A* → *uX, X* → *Bv*, where *X* is a newly introduced nonterminal, i.e.,
*X* ∉ *N''*. After the substitution of each rule of this form
grammar *G'''* = (*N''', T, S', P'''*) is obtained and it is equivalent to the original grammar *G*.
(See the left side of Figure 3.1., for an example.)

**3.1. ábra - In derivations the rules with long right hand side (left) are replaced by
chains of shorter rules in two steps, causing a binary derivation tree in the resulting grammar (right).**

Now let us eliminate the rules having more than one terminal in their right hand side (i.e., they have long right hand side). Actually, rules of the form
*A* → *a*_{1}... *a _{k}* for

A→X_{1}a,_{k}X_{1}→X_{2}a_{k}_{-1}, ...,X_{k}_{-2}→X_{k}_{-1}a_{2},X_{k}_{-1}→Ba_{1}.

(See the right hand side of Figure 3.1, for an example.)

Now a grammar *G''''* = (*N'''', T, S' ,P''''*) is obtained and the set of productions *P''''* can
contain only rules of the following forms

A→a, A→aB, A→Ba, A→B, S'→ λ

(*A, B* ∈ *N'''', a* ∈ *T*). Now, as a final step of our (algorithm)
proof we need to exclude the chain rules (rules of the form *A* → *B*).
This step can be done in a similar way as we showed in the proof of Theorem 2: first the set *U*(*A*)
is determined for each nonterminal *A*,
and then the grammar *G'* = (*N'''', T, S', P'*) is obtained having
*P'* = {*A* → *r*∣∃ *B* ∈ *N''''* such that
*B* → *r* ∈ *P''''*, *r* ∉ *N''''*
and *B* ∈ *U*(*A*)}.
This grammar is in a normal form and it generates the same language as *G*, so the (construction) proof is finished.

QED.

**Example 37.** *Let*

G= ({S, A, B},{1,3,7},S,

{S→ 11S37,S→ 7A,S→B313,A→ 333B7777,A→S, B→ λ,B→ 731 }).

*Give a linear grammar in a normal form that is equivalent to G.*

*Solution: *

*Since there is a rule B* → λ *in the grammar, we start by applying the Empty-word lemma.
Then set U* = {*B*} *and it is the set of nonterminals
from which the empty word λ can be derived. Consequently, we obtain the grammar*

G''= ({S, A, B}, {1,3,7},S,

{S→ 11S37,S→ 7A,S→B313,S→ 313,A→ 333B7777,A→ 3337777,A→S,B→ 731 }).

*We have the rules S* → 11*S*37 *and A* → 333*B*7777
*having terminals on both sides of the nonterminal in the right hand side, thus,
the intermediate step results in the grammar*

G'''= ({S, A, B, X_{1},X_{2}}, {1,3,7},S,

{S→ 11X_{1},X_{1}→S37,S→ 7A,S→B313,S→ 313,A→ 333X_{2},X_{2}→B7777,A→ 3337777,A→S,B→ 731 }).

*Now let us replace the rules with more than one terminals on their right hand side by chains of rules having exactly one terminal on their
right hand sides:*

G''''= ({S, A, B, X_{1},X_{2},X_{3},X_{4},X_{5},X_{6},X_{7},X_{8},X_{9},X_{10},X_{11},X_{12},X_{13},X_{14},X_{15},X_{16},X_{17},X_{18},X_{19},X_{20},X_{21}}, {1,3,7},S,

{S→ 1X_{3},X_{3}→ 1X_{1},X_{1}→X_{4}7,X_{4}→S3,S→ 7A,S→X_{5}3,X_{5}→X_{6}1,X_{6}→B3,S→ 3X_{7},X_{7}→ 1X_{8},X_{8}→ 3,A→ 3X_{9},X_{9}→ 3X_{10},X_{10}→ 3X_{2},X_{2}→X_{11}7,X_{11}→X_{12}7,X_{12}→X_{13}7,X_{13}→B7,A→ 3X_{14},X_{14}→ 3X_{15},X_{15}→ 3X_{16},X_{16}→ 7X_{17},X_{17}→ 7X_{18},X_{18}→ 7X_{19},X_{19}→ 7,A→S,B→ 7X_{20},X_{20}→ 3X_{21},X_{21}→ 1 }).

*However, grammar G'''' contains the chain rule A → S, and thus *

U(S) = {S},U(A) = {A,S},

*for the other nonterminals the trivial sets are obtained since they do not appear in any chain rules: *
*U*(*B*) = {*B*} and
*U*(*X _{i}*) = {

G'= ({S, A, B, X_{1},X_{2},X_{3},X_{4},X_{5},X_{6},X_{7},X_{8},X_{9},X_{10},X_{11},X_{12},X_{13},X_{14},X_{15},X_{16},X_{17},X_{18},X_{19},X_{20},X_{21}}, {1,3,7},S,

{S→ 1X_{3},A→ 1X_{3},X_{3}→ 1X_{1},X_{1}→X_{4}7,X_{4}→S3,S→ 7A,A→ 7A,S→X_{5}3,A→X_{5}3,X_{5}→X_{6}1,X_{6}→B_{3},S→ 3X_{7},A→ 3X_{7},X_{7}→ 1X_{8},X_{8}→ 3,A→ 3X_{9},X_{9}→ 3X_{10},X_{10}→ 3X_{2},X_{2}→X_{11}7,X_{11}→X_{12}7,X_{12}→X_{13}7,X_{13}→B7,A→ 3X_{14},X_{14}→ 3X_{15},X_{15}→ 3X_{16},X_{16}→ 7X_{17},X_{17}→ 7X_{18},X_{18}→ 7X_{19},X_{19}→ 7,B→ 7X_{20},X_{20}→ 3X_{21},X_{21}→ 1 }).

*It is linear, in normal form and equivalent to G.*

As special cases of linear grammars the right-linear grammars (i.e., our regular grammars) and also the so-called left-linear grammars are defined.

**Definition 20.** (Left-linear grammars). *A grammar G* = (*N, T, S, P*)
*is left-linear if each of its productions
has one of the following forms: A* → *u, A* → *Bu,
where A,B* ∈ *N, u* ∈ *T*^{*}.

We state the following interesting result about the languages that can be generated by left-linear grammars, without proof.

**Theorem 12.** *The language class that can be generated by left-linear grammars is exactly the class of regular languages.*

We note here that even though regular languages can be generated by using left linear or right linear rules, using both in the same grammar leads to a different language class.

Finally, in this section, we provide a few exercises.

**Exercise 49.** *Give a linear grammar that generates the language *

{0

1^{m}2^{n}∣^{n}n, m∈ ℕ}.

**Exercise 50.** *Give a linear grammar that generates the language *

{

a^{3}^{n}b^{2}∣^{n}n∈ ℕ}.

**Exercise 51.** *Give a linear grammar in normal form that generates the language*

{

ucv∣u, v∈ {a,b}^{*}such that the number ofa's are the same inuandv}.

**Exercise 52.** *Let*

G= ({S, A, B, C}, {a, b, c},S,

{S→aaSbc,S→B,S→cccC,A→ λ,A→aaa,A→aCabc,B→A,B→Bbbb,C→cAb,C→c}).

*Give a linear grammar in normal form that is equivalent to G.*