1.3. Generative Grammars

The generative grammar is a universal method to define languages. It was introduced by Noam Chomsky in the 1950s. The formal definition of the generative grammar is the following:

Definition 3. The generative grammar (G) is the following quadruple:

G = ( N, T, S, P )

where

The following properties hold: N ∩ T = and S N.

Let us denote the union of the sets N and T by V ( V = N T). Then, the form of the production rules is V *N V * → V *.

Informally, we have two disjoint alphabets, the first alphabet contains the so called start symbol, and we also have a set of productions. The productions have two sides, both sides are words over the joint alphabet, and the word on the left hand side must contain a nonterminal symbol. We have a special notation for the production rules. Let the word p be the left hand side of the rule, and the word q be the right hand side of the rule, then we use the pqP expression.

Example 8. Let the generative grammar G be G = ( N, T, S, P ), where

N = {S,A},
T = {a,b}, and
P = {
  SbAbS,
  bAbbSab,
  A → λ,
  Saa
}.

In order to understand the operation of the generative grammar, we have to describe how we use the production rules to generate words. First of all, we should give the definition of the derivation.

Definition 4. Let G = ( N, T, S, P ) be a generative grammar, and let p and q be words over the joint alphabet V = N T. We say that the word q can be derived in one step from the word p, if p can be written in a form p = uxw, q can be written in a form q = uyw, and x → y P. (Denoted by p q.) The word q can be derived from the word p, if q = p or there are words r1, r2,..., rn such that r1 = p, rn = q and the word ri can be derived in one step from the word ri-1, for each 2 ≤ i ≤ n. (Denoted by p * q.)

Now, that we have a formal definition of the derivation, let us see an example for deeper understanding.

Example 9. Let the generative grammar G be G = ( N, T, S, P ) , where

N = {S,A},
T = {0,1}, and
P = {
  S → 1,
  S → 1A,
  AAA,
  A → 0,
  A → 1
}.

Let the words p,t and q be p = A0S0, t = A01A0, and q = A0110. In this example, the word t can be derived from the word p in one step, (p t), because p can be written in a form p = uxw, where u = A0, x = S, w = 0, and the word t can be written in a form t = uyw, where y = 1A, and S → 1A P.

The word q can be derived from the word p, (p *q), because there exist words r1,r2 and r3 such that r1 = p, r2 = t, r3 = q and r1 r2 and r2r3.

Now we have all the definitions to demonstrate how we can use the generative grammar to generate a language.

Definition 5. Let G = ( N, T, S, P ) be a generative grammar. The language generated by the grammar G is

L(G)={ppT*, S* p}.

The above definition claims that the language generated by the grammar G contains each word over the terminal alphabet which can be derived from the start symbol S.

Example 10. Let the generative grammar G be G = ( N, T, S, P ), where

N = {S,A},
T = {a,b}, and
P = {
  Sbb,
  SASA,
  Aa
}.

In this example, the word bb can be derived from the start symbol S in one step, because S → bb P, so the word bb is in the generated language, bb L(G).

The word abba can be derived from the start symbol, because S ASA, ASA aSA, aSA abbA, abbA abba, so the word abba is also in the generated language, abba L(G).

The word bab can not be derived from the start symbol, so it is not in the generated language, bab L(G).

In this case, it is easy to determine the generated language, L(G) = {aibbaii ≥ 0}.

Exercise 3. Create a generative grammar G, which generates the language L = {a*b+c*}!

Exercise 4. Create a generative grammar G, which generates the language of all binary numbers!