The process shows, using the Extended Bakery dataset, how association rules can be extracted from a transactional dataset. The emphasis is on the items that are present in each transaction, not on those that are missing from it. If the transactional dataset is stored in an uncompressed sparse matrix representation, in which every record holds a binomial value for each of the possible items, association rules can be extracted without any complex transformation. The only requirement is that the attributes representing the individual items are of binomial type. From these attributes the frequent item sets can be extracted, and from the frequent item sets the association rules valid for the dataset can be derived.
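The representation described above can be illustrated with a minimal Python sketch. The transactions and item names are invented for illustration and are not taken from the Extended Bakery dataset; `to_binary_rows` is a hypothetical helper, not part of any tool used in the process.

```python
# Toy transactions; the item names are invented for illustration only.
transactions = [
    {"coffee", "croissant"},
    {"coffee", "tart"},
    {"coffee", "croissant", "tart"},
    {"tart"},
]


def to_binary_rows(transactions):
    """Return (items, rows): one True/False flag per possible item,
    for every transaction (hypothetical helper)."""
    # The set of possible items is the union of all transactions.
    items = sorted(set().union(*transactions))
    # Each record gets a binomial value for each possible item.
    rows = [[item in t for item in items] for t in transactions]
    return items, rows


items, rows = to_binary_rows(transactions)

# Print the matrix with a header row of item names.
print(items)
for row in rows:
    print(row)
```

Every row has the same length as the list of possible items, which is what makes this layout "uncompressed": absent items are stored explicitly as false values, even though only the true values matter for the rules.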
Extended Bakery [Extended Bakery]
Using the FP-Growth algorithm on the version of the dataset that contains 20000 records, the following frequent item sets are created:
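What a frequent item set is can be sketched in a few lines of Python. Note that this sketch deliberately uses a simple level-wise (Apriori-style) search rather than FP-Growth; for a given minimum support both approaches return the same frequent item sets, and the level-wise version is shorter to read. The transactions, the threshold, and the function name `frequent_itemsets` are assumptions made for illustration.

```python
from itertools import combinations

# Toy transactions; invented for illustration, not the Extended Bakery data.
transactions = [
    {"coffee", "croissant"},
    {"coffee", "tart"},
    {"coffee", "croissant", "tart"},
    {"tart"},
    {"coffee", "croissant"},
]


def frequent_itemsets(transactions, min_support):
    """Level-wise search (not FP-Growth): return {itemset: support}
    for every item set whose support is at least min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    candidates = [frozenset([item]) for item in items]
    while candidates:
        # Keep the candidates that occur in enough transactions.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-item sets into (k+1)-item candidates.
        frequent = list(level)
        size = len(frequent[0]) + 1 if frequent else 0
        joined = {a | b for a, b in combinations(frequent, 2)
                  if len(a | b) == size}
        # Prune candidates that have an infrequent k-item subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return result


itemsets = frequent_itemsets(transactions, min_support=0.4)
for itemset, support in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

On a dataset of 20000 records a tree-based algorithm such as FP-Growth is the more practical choice, because it avoids rescanning all transactions for every candidate.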
Based on these frequent item sets, the appropriate association rules can be created. The criteria that a rule must meet to be considered valid can be configured: by default, a required level of confidence is set, but rules can be filtered on other measures as well. The resulting rules allow deeper conclusions to be drawn about the connections within the data. The table representation of the rules helps with this, since it offers different kinds of filters for selecting the rules considered interesting, for example by outcome or by confidence level:
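How rules follow from frequent item sets, and how a confidence threshold and an outcome filter act on them, can be sketched as follows. The support counts are hypothetical toy values, and `association_rules` is an illustrative function, not the operator used in the process. The confidence of a rule "premise → outcome" is taken here as the count of the whole item set divided by the count of the premise.

```python
from itertools import combinations

# Hypothetical frequent item sets with the number of transactions
# containing them (toy values, not results from the Extended Bakery data).
counts = {
    frozenset({"coffee"}): 4,
    frozenset({"croissant"}): 3,
    frozenset({"tart"}): 3,
    frozenset({"coffee", "croissant"}): 3,
    frozenset({"coffee", "tart"}): 2,
}


def association_rules(counts, min_confidence):
    """Return (premise, outcome, confidence) for every rule whose
    confidence reaches min_confidence (illustrative sketch)."""
    rules = []
    for itemset, count in counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty premise and outcome
        for k in range(1, len(itemset)):
            for premise in map(frozenset, combinations(sorted(itemset), k)):
                # Every subset of a frequent item set is itself frequent,
                # so the premise is expected to be present in counts.
                confidence = count / counts[premise]
                if confidence >= min_confidence:
                    rules.append((premise, itemset - premise, confidence))
    return rules


rules = association_rules(counts, min_confidence=0.7)

# Filtering by outcome, similar in spirit to the table view's filter.
coffee_rules = [r for r in rules if r[1] == frozenset({"coffee"})]

for premise, outcome, confidence in rules:
    print(sorted(premise), "->", sorted(outcome), confidence)
```

Raising `min_confidence` shrinks the rule list, while filtering by outcome only narrows the view of rules that were already accepted.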
Besides the table representation, a graphical representation can also be used, with filtering conditions similar to those of the table view: