Chapter 29. Perfect Arrays

Sequences of elements of given sets of symbols have a great importance in different branches of natural sciences. For example in biology the 4-letter set containing the nucleotides (adenine, cytosine, guanine and thymine) and the 20-letter set containing the amino-acids (alanine, cysteine, asparagin-acid, glutamine-acid, phenyl, glycine, histidine, isoleucine, lysine, leucine, methionine, asparagine, proline, glutamine, arginine, serine, threonine, valine, triptophan, tyrosine) play leading role []. Mathematics uses the 10-letter set , while computer science prefer the binary set .

Arrays with elements from given sets of symbols have various applications, e.g. Dénes and Keedwell [] reported on application in frequency allocation of multibeam satellites, Harvit [] in designing mask configuration for spectrometers, Mitchell and Paterson [] in cryptography, Ma [] in picture coding and processing, Arazi [] in position recovery of cars.

Complexity is a basic characteristic of arrays of symbols, since affects the cost of storage and reproduction, and the quantity of information stored in the arrays. The usual complexity measures are based on the time (or memory) needed for generating or recognizing them.

Cyclic sequences in which every possible sequence of a fixed length occurs exactly once have been studied for more than a hundred years []. This mathematical problem was extended to two-dimensional arrays by Reed and Stewart [] who gave an example of sized perfect map. Fan et al. in 1985 [] proved the existence of binary perfect maps for larger sizes.

In the first seven sections of this chapter we deal with the existence and construction of perfect arrays. The last three sections superperfect sequences, the subword complexity and its extension, the -complexity, are analysed.

29.1. 29.1 Basic concepts

Let be the set of integers. For we denote the set by and the set by . Let and , and . Let , , and be vectors of length , an infinite vector with .

A d-dimensional n-ary array A is a mapping .

If there exist a vector and an array such that

then A is a b-periodic array and is a period of . The -sized subarrays of are the -periodic -ary arrays.

Although our arrays are infinite we say that a b-periodic array is b-sized.

Indexset of a -periodic array is the Cartesian product

A dimensional -periodic -ary array is called -perfect, if all possible -ary arrays of size appear in exactly once as a subarray.

Here is the alphabet size, gives the number of dimensions of the “window” and the perfect array , the vector characterizes the size of the window, and the vector is the size of the perfect array .

An -perfect array is called -cellular, if divides for . A cellular array consists of disjoint subarrays of size , called cells. In each cell the element with smallest indices is called the head of the cell. The contents of the cell is called pattern.

The product of the elements of a vector is called the volume of the vector and is denoted by . The number of elements of the perfect array is called the volume of and is denoted by .

If , then the -perfect array is called symmetric. If is symmetric and , then is called doubly symmetric. If is doubly symmetric and

1. , then is called a double sequence;

2. , then is called a double square;

3. , then is called a double cube.

According to this definition, all perfect sequences are doubly symmetric. In the case of symmetric arrays we use the notion and in the case of doubly symmetric arrays we use instead of .

The first known result originates from Flye-Sainte who proved the existence of -perfect sequences for all possible values of in 1894.

One-dimensional perfect arrays are often called de Bruijn or Good sequences. Two-dimensional perfect arrays are called also perfect maps or De Bruijn tori.

Even De Bruijn sequences—introduced by Antal Iványi and Zoltán Tóth in 1988—are useful in construction of perfect arrays when the size of the alphabet is an even number and the window size is . Their definition is as follows.

If is an even integer then an -perfect sequence is called even, if and imply is even.

Iványi and Tóth in 1988 and later Hurlbert and Isaak in 1994 provided a constructive proof of the existence of even sequences. The later algorithm is stronger since it constructs a universal infinite sequence whose prefixes are even sequences for the corresponding alphabet size.

Lexicographic indexing of an array for means that the index is defined as

The concept of perfectness can be extended to infinite arrays in various ways. In growing arrays introduced by Hurlbert and Isaak in 1994 the window size is fixed, the alphabet size is increasing and the prefixes grow in all directions.

Let and be positive integers with and be a strictly increasing sequence of positive integers. An array is called -growing, if the following conditions hold:

1. for ;

2. ;

3. the prefix of is -perfect array for .

For the growing arrays we use the terms growing sequence, growing square and growing cube.

For the new alphabet size is

where is the product of the prime divisors of not dividing .

Note, that alphabet size and new alphabet size have the property that .

Another possibility for the extension of concept of perfectness is to increase the window size. Then we get supercomplex arrays studied in Section 29.8.

Exercises

29.1-1 Gather one-dimensional examples of the applications of perfect arrays.

29.1-2 Gather two- and three-dimensional examples of the application of perfect arrays.