The experiment presents visualization and dimension reduction methods using a
four-dimensional dataset. Multidimensional datasets can be visualized with the Graph Explore
operator. Dimension reduction can be performed with the Principal Components
operator. After dimension reduction, multidimensional datasets become much easier to display
in the space of principal components.
The Graph Explore operator provides several graphical tools for displaying multidimensional
datasets, which play a key role in the preprocessing step of data mining. Some of these are extensions of
well-known tools, such as two- and three-dimensional scatterplots and bar charts, supplemented by options
such as the use of colors and symbols. Other techniques, such as the parallel axis plot or the radar plot,
are found mainly in data mining software tools.
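To illustrate how a parallel axis plot handles more than three dimensions, the following minimal Python sketch rescales each attribute to its own vertical axis and draws every record as a polyline across those axes. It is not the Graph Explore operator's implementation; the function names `parallel_axis_coordinates` and `plot_parallel_axes` are hypothetical, and the plotting step assumes that matplotlib is installed.

```python
import numpy as np


def parallel_axis_coordinates(X):
    """Rescale every column to [0, 1] so that each attribute gets its own axis."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    return (X - lo) / span


def plot_parallel_axes(X, names, labels=None):
    """Draw one polyline per record; integer labels (optional) select the color."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for drawing

    Y = parallel_axis_coordinates(X)
    positions = np.arange(Y.shape[1])
    fig, ax = plt.subplots()
    for i, row in enumerate(Y):
        color = f"C{int(labels[i]) % 10}" if labels is not None else "C0"
        ax.plot(positions, row, color=color, alpha=0.5)
    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    return fig
```

Because every attribute is scaled independently, attributes measured in different units can share one figure; the price is that absolute magnitudes are no longer comparable across axes.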
Principal Component Analysis (PCA) can be performed with the
Principal Components operator.
The operator offers the following settings: the dependency structure, given as covariance or correlation,
and the cut-off condition, given as the number of eigenvalues or the cumulative eigenvalue ratio.
The main result of principal component analysis is the set of principal component coordinates of the individual records, which can be used in further data analysis and visualization.
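The settings described above can be sketched in a few lines of Python with NumPy. This is only an illustration of the technique, not the operator's actual implementation: the function `principal_components` and its parameters `use_correlation`, `n_components` and `cum_ratio` are hypothetical names chosen here to mirror the dependency structure and the two cut-off conditions.

```python
import numpy as np


def principal_components(X, use_correlation=False, n_components=None, cum_ratio=None):
    """PCA by eigendecomposition of the covariance or correlation matrix.

    Returns the principal component coordinates (scores) of the records,
    all eigenvalues in decreasing order, and the cumulative eigenvalue ratios.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)  # center every attribute
    if use_correlation:
        # Standardizing first makes the covariance of Z the correlation matrix
        Z = Z / X.std(axis=0, ddof=1)
    C = np.cov(Z, rowvar=False)

    eigvals, eigvecs = np.linalg.eigh(C)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratios = np.cumsum(eigvals) / eigvals.sum()

    # Cut-off: a fixed number of eigenvalues, or the smallest number of
    # components whose cumulative eigenvalue ratio reaches cum_ratio
    if n_components is None:
        if cum_ratio is None:
            n_components = X.shape[1]
        else:
            n_components = int(np.searchsorted(ratios, cum_ratio)) + 1
    n_components = min(n_components, X.shape[1])

    scores = Z @ eigvecs[:, :n_components]
    return scores, eigvals, ratios
```

Choosing the correlation structure is usually preferable when the attributes are measured on very different scales, since otherwise the attribute with the largest variance dominates the first component.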
The experiment shows how high-dimensional datasets can be displayed and how dimension reduction can be performed.
In our experiment, the original
4-dimensional dataset, which cannot be displayed
using a standard scatterplot, is reduced to
2 dimensions such that
95 percent of the information contained in the data, as measured by the cumulative eigenvalue ratio, is preserved.
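The reduction from 4 to 2 dimensions under a 95 percent cut-off can be reproduced in spirit with the short Python sketch below. It does not use the dataset of the experiment: the data are synthetic, generated here from two latent factors plus small noise (the `mixing` matrix and the noise level are arbitrary assumptions), so that two components suffice by construction. The PCA itself is computed through a singular value decomposition of the centered data, which is equivalent to the covariance-based variant.

```python
import numpy as np

# Synthetic 4-dimensional data driven by 2 latent factors (illustrative only)
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0],
                   [0.8, 0.3],
                   [0.2, 1.0],
                   [0.0, 0.9]])
X = latent @ mixing.T + 0.1 * rng.normal(size=(200, 4))

# PCA through the SVD of the centered data matrix
Z = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Squared singular values are proportional to the covariance eigenvalues
explained = s**2 / np.sum(s**2)
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative ratio reaches 95 percent
k = int(np.argmax(cumulative >= 0.95)) + 1

# Principal component coordinates, ready for an ordinary 2D scatterplot
scores = Z @ Vt[:k].T
print("components kept:", k)
print("cumulative eigenvalue ratio:", cumulative[k - 1])
```

The two columns of `scores` can then be passed to any standard scatterplot routine, which is exactly what makes the reduced representation easier to display than the original four attributes.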