1.2. Formal Systems

In this section, the definition of basic rewriting systems in general and a specific one, the Markov algorithm is presented.

Definition 1. A formal system (or rewriting system) is a pair W = ( V, P ), where V is an alphabet and P is a finite subset of the Cartesian product V *× V *. The elements of P are the (rewriting) rules or productions. If ( p, q ) P, then we usually write it in the form p → q. Let r, s V *, and we say that s can be directly derived (or derived in one step) from r (formally: r s) if there are words p, p', p'', q V * such that r = p' pp'', s = p' qp'' and p → q P. The word s can be derived from r (formally: r * s) if there is a sequence of words r0, r1, ..., rk ( k ≥ 1 ) such that r0 = r, rk = s and ri ri+1 holds for every 0 ≤ i < k. Moreover, we consider the relation r * r for every word r V *. (Actually, the relation * is the reflexive and transitive closure of the relation ⇒.

A rewriting (or derivation) step can be understood as follows: the word s can be obtained from the word r in such a way that in s the right hand side q of a production of P is written instead of the left hand side p of the same production in r.

Example 5. Let W = ( V, P ) with V = { a, b, c, e, l, n, p, r, t } and P = { able → can, r → c, pp → b }. Then applerat applecat ablecat cancat, and thus applerat * cancat in the system W.

Now, we are going to present a special version of the rewriting systems that is deterministic, and was given by the Russian mathematician A. A. Markov as normal algorithms.

Definition 2. M = ( V, P, H ) is a Markov (normal) algorithm where

The execution of the algorithm on input w V * is as follows:

  1. Find the first production in P such that its left-hand-side p is a subword of w. Let this production be p → q. If there is no such a production, then step 5 applies.

  2. Find the first (leftmost) occurrence of p in w (such that w = p'pp'' and p'p contains the subword p exactly once: as a suffix).

  3. Replace this occurrence of p in w by the right hand side q of the production.

  4. If p → q H, i.e., the applied production is a halting production, then the algorithm is terminated with the obtained word (string) as an output.

  5. If there are no productions in P that can be applied (none of their left-hand-side is a subword of w), then the algorithm terminates and w is its output.

  6. If a rewriting step is done with a non halting production, then the algorithm iterates (from step 1) for the newly obtained word as w.

We note here that rules of type λ → q also can be applied to insert word q to the beginning of the current (input) word.

The Markov algorithm may be terminated with an output, or may work forever (infinite loop).

Example 6. Let M = ({1,2,3 },{21→ 12, 31→ 13, 32→ 23},{}) be a Markov algorithm. Then, it is executed to the input 1321 as follows:

1321 ⇒ 1312 ⇒ 1132 ⇒ 1123.

Since there are no more applicable productions, it is terminated. Actually, it is ordering the elements of the input word in a non-decreasing way.

Example 7. Let

W=({a,b,c},{aa → bbb, ac→ bab, bc → a},{}).

Let us apply the algorithm W to the input


As it can be seen in Animation 1, the output is


(For better understanding in the Animation the productions of the algorithm is numbered.)

Animation 1.

Exercise 1. Execute the Markov algorithm

M = ({ 1, + }, { 1 + → + 1, + + → +, + → λ } ,{ + → λ })

on the following input words:

Exercise 2. Execute the Markov algorithm M = ({1,×, a,b },

{ × 11 → a × 1, × 1 → a, 1aa1b, baab, b1 → 1b, a1 → a, abb, b → 1 } ,{ })

on the following input words:

The previous two exercises are examples for Markov algorithms that compute unary addition and unary multiplication, respectively.

It is important to know that this model of the algorithm is universal in the sense that every problem that can be solved in an algorithmic way can be solved by a Markov algorithm as well.

In the next section, other variations of the rewriting systems are shown: the generative grammars, which are particularly highlighted in this book. As we will see, the generative grammars use their productions in a non-deterministic way.