20.5. 20.5 A()-indexes

In case of 1-indexes, nodes of the same class fit to the same label sequences starting from the root. This means that the nodes of a class cannot be distinguished by their ancestors. Modifying this condition in such a way that indistinguishability is required only locally, that is, nodes of the same class cannot be distinguished by at most generations of ancestors, we obtain an index that is coarser and consists of less classes than the 1-index. So the size of the index decreases, which also decreases the cost of the evaluation of queries. The 1-index was safe and exact, which we would like to preserve, since these guarantee that the result we get when evaluating the queries according to the index is the result we would have obtained by evaluating the query according to the original graph. The A()-index is also safe, but it is not exact, so this has to be ensured by modification of the evaluation algorithm.

Definition 20.30 The k-bisimulation is anequivalence relation on the nodes of a graph defined recursively as

i) if and only if ,

ii) if and only if and if there is an edge from node to node , then there is a node from which there is an edge to node and , also, if there is an edge from node to node , then there is a node from which there is an edge to node and .

In case and are k-bisimilar. The classes of the partition according to the A()-index are the equivalence classes of the -bisimulation.

The “A” in the notation refers to the word “approximative”.

Note that the partition belonging to is the basic partition, and by increasing we refine this, until the coarsest 1-index is reached.

Denote by the label sequences of length at most to which fits in graph . The following properties of the A()-index can be easily checked.

Claim 20.31

1. If and are -bisimilar, then .

2. If is a node of the A()-index and , then .

3. The A()-index is exact in case of simple expressions of length at most .

4. The A()-index is safe.

5. The -bisimulation is a (not necessarily proper) refinement of the -bisimulation.

The A()-index compares the -distance half-neighbourhoods of the nodes which contain the root, so the equivalence of the nodes is not affected by modifications outside this neighbourhood, as the following proposition shows.

Claim 20.32 Suppose that the shortest paths from node to nodes and contain more than edges. Then adding or deleting an edge from to does not change the -bisimilarity of and .

We use a modified version of the PT -algorithm for creating the A()-index. Generally, we can examine the problem of approximation of the coarsest stable refinement.

Definition 20.33 Let be a partition of in the directed graph , and let be a sequence of partitions such that and is the coarsest refinement of that is stable with respect to . In this case, partition is the -step approximation of the coarsest stable refinement of .

Note that every term of sequence is a refinement of , and if , then is the coarsest stable refinement of . It can be checked easily that an arbitrary approximation of the coarsest stable refinement of can be computed greedily, similarly to the PT -algorithm. That is, if a block of is not stable with respect to a block of , then split according to , and consider the partition instead of .

Naive-Approximation( )

  1     2  
                     FOR
                   
                   
                  
                     TO
                   
                     3    
                     DO
                   
                     4       
                     FOR
                   all  such that    5          
                     DO
                   
                     6  
                     RETURN
                   
                   
               

Note that the algorithm Naive-Approximation could also be improved similarly to the PT -algorithm.

Algorithm Naive-Approximation can be used to compute the A()-index, all we have to notice is that the partition belonging to the A()-index is stable with respect to the partition belonging to the A()-index in graph . It can be shown that the computation cost of the A()-index obtained this way is , where is the number of edges in graph .

A( )-Index-Evaluation( )

  1  let  be the A()-index of    2  
                  
                     Index-Evaluation(
                     
                     )
                     3  
                     FOR
                   all    4    
                     DO
                   
                  
                     IF
                   
                     5       
                     THEN
                   
                     6  
                     RETURN
                   
                   
               

The A()-index is safe, but it is only exact for simple expressions of length at most , so in step 4, we have to check for all elements of set whether it satisfies query , and we have to delete those from the result that do not fit to query . We can determine using a finite nondeterministic automaton whether a given node satisfies expression as in Proposition 20.6, but the automaton has to run in the other way. The number of these checks can be reduced according to the following proposition, the proof of which is left to the Reader.

Claim 20.34 Suppose that in the graph belonging to the A()-index, index node fits to a label sequence that ends with , . If all label sequences of the form s'.s that start from the root satisfy expression in graph , then all elements of satisfy expression .

Exercises

20.5-1 Denote by the A()-index of . Determine whether .

20.5-2 Prove Proposition 20.31.

20.5-3 Prove Proposition 20.32.

20.5-4 Prove Proposition 20.34.

20.5-5 Prove that the algorithm Naive-approximation generates the coarsest k-step stable approximation.

20.5-6 Let be a set of indexes, the elements of which are A()-, A()-, , A()-indexes, respectively. is minimal, if by uniting any two elements of , is not stable with respect to , . Prove that for arbitrary graph, there exists a unique minimal the elements of which are coarsest A()-indexes, .