Part I. Data mining tools

Introduction

This part gives an overview of data mining tools and software. Successful data mining depends on three necessary conditions. First, we need an appropriate data set on which to perform data mining. In practice, this is often a task-oriented data mart generated from the enterprise data warehouse. In education, and so in this curriculum, the data sets are taken from widely used data repositories. All data sets are attached to this material. The second condition is the data mining expert. We hope that this curriculum will contribute to the education of these professionals. Finally, the key is the software with which data mining is performed. Data mining software can be classified by several criteria, e.g., commercial or free, self-contained or integrated, general or specific, theme-oriented or not. The most up-to-date information on this topic can be found on the KDnuggets website, where the reader can also find current information on job openings, courses, conferences, etc.

Two software packages are discussed in detail in this curriculum: RapidMiner 3.5, a leading free data mining tool, and SAS® Enterprise Miner™ Version 7.1, one of the most widely used commercial data mining tools. The list of data mining software below is based on the KDnuggets portal.