The experiment illustrates how existing SAS datasets can be made available to
Enterprise Miner™ through the Input Data operator.
In the experiment, a previously prepared SAS dataset is read. A SAS dataset can be created
with the SAS® System or SAS® Enterprise Guide™.
To load the SAS file we want to use, we need to know its path.
The file may be on the local machine or on a remote SAS server. The SAS file can be
read by using a wizard that guides you through the entire process. Then the original dataset is
sampled by the Sample operator: from the relatively large data file
Individual household electric power consumption [UCI MLR], a dataset is drawn that contains
10 percent of the original dataset. When sampling, either an
absolute or a relative sample size can be chosen. It is also possible to set the
Random Seed parameter, which initializes the pseudo-random number
generator. If the same value is set on different machines, we get the same random sample.
We can also set the sampling method, e.g. simple random, clustered, or stratified.
Whenever we rerun the process, the current state of the dataset is imported into the system. The
Input Data operator can therefore be used to retrieve data files that are constantly
updated by other SAS-based systems and to rerun the data mining process on them.
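
The sampling settings described above (relative sample size, Random Seed, and sampling method) can be illustrated outside Enterprise Miner with a minimal Python sketch. This is not how the Sample operator is implemented, and Python's generator differs from the one SAS uses, so the same seed will not select the same rows as in Enterprise Miner. The function names, the toy rows, and the `group` field are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def simple_random_sample(rows, fraction, seed):
    """Draw a simple random sample without replacement.

    The sample size is relative: round(fraction * number of rows).
    """
    rng = random.Random(seed)  # the seed fixes the generator's starting state
    k = round(len(rows) * fraction)
    return rng.sample(rows, k)


def stratified_sample(rows, key, fraction, seed):
    """Sample the same fraction independently from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for label in sorted(strata):  # fixed stratum order keeps runs repeatable
        members = strata[label]
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical toy data standing in for the real dataset.
rows = [{"id": i, "group": i % 5 == 0} for i in range(1000)]

a = simple_random_sample(rows, 0.10, seed=12345)
b = simple_random_sample(rows, 0.10, seed=12345)
s = stratified_sample(rows, key=lambda r: r["group"], fraction=0.10, seed=12345)
```

Here `a` and `b` are identical because both calls use the same seed, which mirrors the role of the Random Seed parameter. The stratified sample `s` keeps the proportion of the `group` field the same as in the full data.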