4. fejezet - Context-free Languages


4.1. Notation Techniques for Programming Languages
4.1.1. Backus-Naur Form
4.1.2. Syntax Diagram
4.2. Chomsky Normal Form
4.3. Pumping Lemma for Context-free Languages
4.4. Closure Properties
4.5. Parsing
4.5.1. The CYK Algorithm
4.5.2. The Earley Algorithm
4.6. Pushdown Automata
4.6.1. Acceptance by Empty Stack
4.6.2. Equivalence of PDAs and Context-free Grammars
4.6.3. Deterministic PDA
4.6.4. One-turn Pushdown Automata

Summary of the chapter: This chapter will mainly deal with the properties of the type-2 language class of the Chomsky hierarchy, called context-free languages. This language class has many practical applications used in various areas of computer science. We will mention some of the most important ones. First, we discuss the notation techniques used to describe the syntax of programming languages, the Backus-Naur form, and the syntax diagram. Second, we introduce a normal form for context-free languages. This normal form will be used in Section 4.5., which is dedicated to parsing. The first pumping lemma, the Bar-Hillel lemma will be explained, and the closure properties of the context-free language class will be proven. In the last part of this chapter we introduce the pushdown automaton, we show its features, and its applications.

4.1. Notation Techniques for Programming Languages

Notation techniques were introduced as simple methods to describe different parts of programming languages. These parts contain terminal and nonterminal symbols. Terminals are given, and nonterminals can be built up from terminals and already defined nonterminals by using simple operations. These operations are the following:

  1. Concatenation, when symbols are written after each other.

  2. Alternation is a selection from different possibilities.

  3. Option is a special selection between a symbol and the empty word.

  4. Repetition, when a symbol can be repeated any (≥ 0) number of times.

In this section we introduce two well known techniques, the Backus-Naur form (BNF) and the Syntax diagram, but many others have been introduced for a variety of reasons. For example, the Extended Backus-Naur form is an extended version of the standard BNF.

4.1.1. Backus-Naur Form

BNF was designed by Peter Naur in 1963 as a simplified version of the notation technique of John Backus. It was used first to describe the programming language ALGOL60. Table 4.1. shows the marking of the operations used by BNF.

4.1. táblázat - Operations of the BNF metasyntax.

Definition Concatenation Alternation Option Repetition
∷= [] {}

As you can see, concatenation does not have any special mark, we just write the symbols after each other. We use a terminal symbol as it is, for example, the mark of one as a number is 1. For nonterminals we use their names between angle brackets. We have a special mark to define nonterminal symbols, followed by the description of the nonterminal.

Example 38. In this example, we describe a non-negative binary number using BNF metasyntax.

   < digit > ::= 0 ∣ 1 
   < positive > ::= [ + ] 1 { < digit > }
   < number > ::= 0 ∣ < positive >

4.1.2. Syntax Diagram

A syntax diagram is a graphical notation technique. It uses simple graphs, each of them has an entry and an end point. The concatenation, alternation, option and repetition operations are implemented in the structure of the graph.

Example 39. Figure 4.1. describes a non-negative binary number using the syntax diagram.

4.1. ábra - Syntax diagram.

Syntax diagram.