In formal language theory, parsing - or the so called syntactic analysis - is a process when the parser determines if a given string
can be generated by a given grammar. This is very important for compilers and interpreters. For example, it is not too difficult to create
a context-free grammar *G _{P}* generating all syntactically correct Pascal programs.
Then, we can use a parser to decide - about a Pascal program
written by a programmer - if the program is in the generated language

We have a given Chomsky normal form grammar *G* = (*N, T, S, P*) and a word
*p* = *a*_{1}*a*_{2}... *a _{n}*.
The Cocke-Younger-Kasami algorithm is a well
known method to decide wether

*Example 45.* *In this example, we use the CYK algorithm to show that the grammar
G generates the word abbaa.*

G= ({S, A, B}, {a,b},S, P)P= {S→SA,S→AB,A→BS,B→SA,S→a,A→a,B→b}

*As you can see, S* ∈ *M*(5,1), *so abbaa* ∈ *L*(*G*).

**Exercise 60.** *Use the CYK algorithm to show that the grammar G generates the word baabba.*

G= ({S, A, B, X, Y, Z}, {a,b},S, P)P={S→AY,Y→XB,X→BA,X→ZA,Z→BX,A→b,B→a}

**Exercise 61.** *Use the CYK algorithm to decide if the word cbacab can be generated by grammar G.*

G= (S, A, B, C, D}, {a,b,c},S, P)P= {S→AB,A→CA,A→SS,B→CD,A→b,C→a,C→b,D→c}

The Earley algorithm is designed to decide if a context-free grammar generates a terminal word. Sometimes it is not comfortable to create and use an equivalent Chomsky normal form grammar for a λ-free context-free grammar, because the Chomsky normal form grammar could have many more production rules than the original grammar. This is why the Earley algorithm is more widely used than the CYK algorithm for computerized lexical analysis. Although the Earley algorithm looks more complicated for humans, - and actually, it is more complicated compared to the the very simple CYK algorithm, - but after the implementation, there is no difference between the complexity of the two algorithms for computers.

Now we are going to show the steps of the λ-free version of the Earley algorithm. It can work with rules having form
*A* → λ as well,
with minor modification, but in practice we do not need the extended version.

Let *G* = (*N, T, S, P*) be a λ-free, context-free grammar, and
*p* = *a*_{1}*a*_{2}... *a _{n}* ∈

The steps of the algorithm are the following:

Let

*S*→*.q*∈*M*(0,0) if*S*→*q*∈*P*, and let*j*= 0.Let

*A*→*.q*∈*M*(*j,j*) if*A*→*q*∈*P*, and there exists an integer*k*≤*j*such that*B*→*r.At*∈*M*(*k,j*).Let

*j*=*j*+1 and let*i*=*j*-1.Let

*A*→*ra*∈_{j}.t*M*(*i,j*) if*A*→*r.a*∈_{j}t*M*(*i,j*-1).Let

*A*→*rB.t*∈*M*(*i,j*) if there exists an integer*i*≤*k*<*j*such that*A*→*r.Bt*∈*M*(*i,k*), and*B*→*q.*∈*M*(*k,j*).If

*i*> 0 then*i*=*i*-1 and goto 4.If

*i*= 0 and*j*<*n*then goto 2.If

*i*= 0 and*j*=*n*then finished.

Here *q* ∈ (*T* ∪ *N*)^{+}, *A, B* ∈
*N, r, t* ∈ (*T* ∪ *N*)^{*},
and of course *i,j,k* are integers.

Grammar *G* generates the word *p* (*p* ∈ *L*(*G*)),
if and only if there is a production rule in *M*(0,*n*), whose left hand side is the start symbol *S*,
and there is a dot at the end of the right hand side of the rule.

**Examplee 46.** *In this example, we have a λ-free context-free grammar G,
and we have to decide if the word a*a+a can be generated by this grammar.*

G= ({S, A, B}, {a,+,*,(,)},S, P)P= {S→S+AA→A*BB→ (S)S→AA→BB→a}

*As you can see, the top right cell contains a rule, whose left hand side is the start symbol S, and there is a dot at the end of the
right hand side of the rule, so a+a*a* ∈ *L*(*G*).

**Exercise 62.** *Use the Earley algorithm to decide if the word* 100110 *can be generated by grammar G.*

G= ({S, A, B}, {0,1},S, P)P= {S→ 0A1,S→ 1B0,A→B1,B→S1,A→ 0,B→ 1 }

**Exercise 63.** *Use the Earley algorithm to decide if the word bbabb can be generated by grammar G.*

G= ({S, A, B}, {a,b},S, P)P= {S→BAB→bAB,A→BAb,B→SbA,A→a,B→b}