Chapter 10. Association rules

Table of Contents

Extraction of association rules
Extraction of association rules from non-transactional datasets
Evaluation of performance for association rules
Performance of association rules - Simpson's paradox

Extraction of association rules

Description

Using the Extended Bakery dataset, this process shows how association rules can be extracted from a transactional dataset. The emphasis is on the items that are present in a transaction, i.e. the items that form part of it, and not on those that are missing from it. If such a transactional dataset is in an uncompressed sparse matrix representation, so that every record contains a binominal value for each of the possible items, association rules can be extracted without any complex transformation. The only thing to keep in mind is that the attributes representing the individual items must be of binominal type. From these attributes the frequent item sets can be extracted, and from the frequent item sets the association rules valid for the dataset can be derived.
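The representation described above can be illustrated outside RapidMiner with a minimal Python sketch. The item names and the helper to_binominal_matrix are hypothetical and are not taken from the Extended Bakery dataset; the sketch only shows what "one binominal value per possible item" means for each record.

```python
# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    ["coffee", "croissant"],
    ["coffee", "tart"],
    ["croissant"],
]


def to_binominal_matrix(transactions):
    """Return (items, rows) with one True/False value per possible item and record."""
    # The set of possible items is the union of all items seen in any transaction.
    items = sorted({item for t in transactions for item in t})
    # Each record gets a value for every possible item: True if present, False if missing.
    rows = [[item in set(t) for item in items] for t in transactions]
    return items, rows


items, matrix = to_binominal_matrix(transactions)
for row in matrix:
    print(dict(zip(items, row)))
```

Each printed record lists every possible item, but only the True entries (the items actually present) matter for the frequent item set search.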

Input

Extended Bakery [Extended Bakery]

Output

Running the FP-Growth algorithm on the version of the dataset that contains 20000 records yields the following frequent item sets:

Figure 10.1. List of the frequent item sets generated
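To make the notion of a frequent item set concrete, here is a minimal Python sketch. Note that it does not implement FP-Growth: it uses a simpler level-wise (Apriori-style) search, which finds the same item sets for a given minimum support but is slower on large data. The transactions, the function name frequent_itemsets and the threshold 0.4 are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {frozenset(items): support}."""
    n = len(transactions)
    tx_sets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in tx_sets for item in t}
    result = {}
    k = 1
    while candidates:
        # Keep the candidates contained in a large enough share of the transactions.
        level = {}
        for cand in candidates:
            support = sum(1 for t in tx_sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-item sets into (k+1)-candidates and prune every
        # candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return result


# Hypothetical toy data, not the Extended Bakery dataset.
transactions = [
    ["coffee", "croissant"],
    ["coffee", "croissant", "tart"],
    ["coffee", "tart"],
    ["croissant"],
    ["coffee", "croissant"],
]
itemsets = frequent_itemsets(transactions, min_support=0.4)
for itemset, support in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Lowering min_support admits more (and larger) item sets; raising it shortens the list, which is the same trade-off the minimum support setting controls in the workflow.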

Interpretation of the results

Based on these frequent item sets, the corresponding association rules can be generated. The criteria that a rule must meet to be considered valid can be configured: by default a minimum confidence level is required, but rules can be filtered on other measures as well. The resulting rules allow deeper conclusions to be drawn about the relationships in the data. The table representation of the rules helps with this, because it offers several filters for picking out the rules considered interesting, for example by conclusion or by confidence level:

Figure 10.2. List of the association rules generated
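How rules follow from frequent item sets, and how a confidence threshold and a filter on the conclusion act on them, can be sketched in a few lines of Python. The support counts, the function name association_rules and the threshold 0.7 are hypothetical values chosen for illustration; they are not results from the Extended Bakery dataset, and the sketch is not the implementation used by the Create Association Rules operator.

```python
from itertools import combinations

# Hypothetical counts: in how many of 5 transactions each frequent item set occurs.
N_TRANSACTIONS = 5
support_counts = {
    frozenset(["coffee"]): 4,
    frozenset(["croissant"]): 4,
    frozenset(["tart"]): 2,
    frozenset(["coffee", "croissant"]): 3,
    frozenset(["coffee", "tart"]): 2,
}


def association_rules(support_counts, n_transactions, min_confidence):
    """Return (premise, conclusion, support, confidence) tuples."""
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty premise and a non-empty conclusion
        for size in range(1, len(itemset)):
            for premise_items in combinations(sorted(itemset), size):
                premise = frozenset(premise_items)
                conclusion = itemset - premise
                # confidence(premise -> conclusion) = count(item set) / count(premise)
                confidence = count / support_counts[premise]
                if confidence >= min_confidence:
                    rules.append((premise, conclusion, count / n_transactions, confidence))
    return rules


rules = association_rules(support_counts, N_TRANSACTIONS, min_confidence=0.7)

# Filter by conclusion, similar in spirit to the filter of the table view.
to_coffee = [rule for rule in rules if rule[1] == frozenset(["coffee"])]
for premise, conclusion, support, confidence in to_coffee:
    print(sorted(premise), "->", sorted(conclusion), support, confidence)
```

Every subset of a frequent item set is itself frequent, so the premise lookup in support_counts always succeeds when the dictionary holds all frequent item sets.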

Besides the table representation, a graphic representation is also available, with filtering options similar to those of the table:

Figure 10.3. Graphic representation of the association rules generated

Workflow

assoc_exp1.rmp

Keywords

frequent item sets
association rules
transactional data
binominal attributes

Operators

Create Association Rules
FP-Growth
Numerical to Binominal
Read AML