Importing data from a CSV file

Description

This process demonstrates how to import data from CSV files with the File Import operator. The experiment uses the Bodyfat dataset from the StatLib data repository. To open the dataset, we need the path to the file, which can reside either on the local machine or on a remote SAS server. The path can be selected step by step in a menu.

Figure 14.4. The list of files in the File Import operator

Input

Bodyfat [StatLib]

Note

The dataset was donated to StatLib by Roger W. Johnson.

The import process can be parametrized in the File Import operator. We can set the maximum number of records, the maximum number of attributes, and the separator character. It is also possible to define the number of rows that are examined to determine the file structure.
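
To make the effect of these parameters concrete outside the graphical interface, the following is a minimal Python sketch of a delimited-file reader with analogous settings. It is not the File Import operator's implementation; the function name import_csv, its parameter names (separator, max_rows, max_cols, guess_rows) and the sample records are all hypothetical and chosen only for illustration.

```python
import csv
import io
from itertools import islice


def _is_number(value):
    """Return True if the text can be parsed as a floating-point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def import_csv(source, separator=",", max_rows=None, max_cols=None, guess_rows=20):
    """Read delimited text with limits similar to those described above.

    All parameter names are illustrative assumptions, not SAS property names.
    """
    reader = csv.reader(source, delimiter=separator)
    # The first line is treated as the header; keep at most max_cols attributes.
    header = next(reader)[:max_cols]
    # Keep at most max_rows records (None means no limit).
    rows = [row[:max_cols] for row in islice(reader, max_rows)]
    # Guess each column's type from the first guess_rows records only.
    types = {}
    for i, name in enumerate(header):
        sample = [r[i] for r in rows[:guess_rows] if i < len(r) and r[i] != ""]
        is_numeric = bool(sample) and all(_is_number(v) for v in sample)
        types[name] = "numeric" if is_numeric else "character"
    return header, rows, types


# Made-up sample data (not taken from the Bodyfat dataset).
sample = "id;name;score\n1;alpha;3.5\n2;beta;4.0\n3;gamma;2.25\n"
header, rows, types = import_csv(io.StringIO(sample), separator=";", max_rows=2)
print(header)
print(rows)
print(types)
```

With max_rows=2 only the first two records are kept, and the column types are inferred from those records alone. This also shows why the number of rows examined matters: a column that looks numeric in the first few rows may contain text further down the file.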

Figure 14.5. The parameters of the File Import operator

Output

A dataset that consists of the imported data.

Figure 14.6. A small portion of the dataset

Figure 14.7. The metadata of the resulting dataset

Interpretation of the results

Whenever the process is rerun, the current state of the data file is imported into the system. The Input Data operator can therefore be used to reload data files that are constantly updated by other SAS-based systems and to rerun the data mining process on them.

Workflow

sas_import_exp2.xml

Keywords

importing data
CSV file

Operators

File Import
Graph Explore
Statistic Explore