The process demonstrates Kohonen's vector quantization (VQ) and self-organizing map (SOM) algorithms
on the Maximum Variance (R15)
dataset. These algorithms can be fitted by the SOM/Kohonen operator.
The dataset consists of
600 two-dimensional records, which fall into
15 groups. The points are located around the point with coordinates (10, 10), and
the groups lie farther apart the farther they are from the center. The difficulty of the task is
that the groups nearest the center almost fuse. In the figure below, each group is drawn
in a different color.
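To make the layout described above concrete, the following minimal sketch generates a synthetic stand-in with the same headline properties: 600 two-dimensional points in 15 groups around (10, 10), with the inner groups close together and the outer groups far apart. It is not the R15 data itself. The function name `make_ring_clusters`, the two-ring arrangement, the radii, and the spread are all assumptions chosen for illustration.

```python
import math
import random

def make_ring_clusters(seed=0, points_per_group=40, center=(10.0, 10.0)):
    """Return (points, labels) for 15 Gaussian groups around `center`."""
    rng = random.Random(seed)
    # Assumed layout: one central group, seven on a tight inner ring,
    # seven on a wide outer ring. Radii and spread are illustrative guesses.
    group_centers = [center]
    for radius in (1.5, 6.0):
        for k in range(7):
            angle = 2.0 * math.pi * k / 7
            group_centers.append((center[0] + radius * math.cos(angle),
                                  center[1] + radius * math.sin(angle)))
    points, labels = [], []
    for label, (cx, cy) in enumerate(group_centers):
        for _ in range(points_per_group):
            points.append((rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)))
            labels.append(label)
    return points, labels

points, labels = make_ring_clusters()
```

With these settings the inner groups nearly touch while the outer groups are clearly separated, which mimics the difficulty described for the real dataset.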
First, the method of Kohonen's vector quantization is used. By this method we got
The results can be seen in the figure below.
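For readers who want to see the idea behind vector quantization in code, here is a minimal sketch of online winner-take-all learning: each record pulls its nearest prototype toward itself with a decaying learning rate. This is one common textbook formulation, not necessarily what the operator does internally. The function names, the linear learning-rate schedule, and the two-blob demo data are assumptions.

```python
import random

def nearest(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, prototypes[j])))

def kohonen_vq(points, n_prototypes, epochs=20, lr0=0.5, seed=0):
    """Online VQ: move only the winning prototype toward each record."""
    rng = random.Random(seed)
    # Initialize prototypes from randomly chosen records.
    prototypes = [list(p) for p in rng.sample(points, n_prototypes)]
    total_steps = epochs * len(points)
    step = 0
    for _ in range(epochs):
        order = list(range(len(points)))
        rng.shuffle(order)
        for i in order:
            # Assumed schedule: learning rate decays linearly toward zero.
            lr = lr0 * (1.0 - step / total_steps)
            w = nearest(points[i], prototypes)
            prototypes[w] = [p + lr * (a - p)
                             for p, a in zip(prototypes[w], points[i])]
            step += 1
    return prototypes

# Hypothetical demo data: two well-separated blobs of 50 points each.
demo_rng = random.Random(1)
demo = ([(demo_rng.gauss(0, 0.5), demo_rng.gauss(0, 0.5)) for _ in range(50)] +
        [(demo_rng.gauss(10, 0.5), demo_rng.gauss(10, 0.5)) for _ in range(50)])
prototypes = kohonen_vq(demo, n_prototypes=2)
labels = [nearest(x, prototypes) for x in demo]
```

Each record is then assigned to the cluster of its nearest prototype, which is what `labels` holds.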
The size of clusters can be depicted by a simple pie chart.
A table displays the statistics that characterize the clusters, including the frequency of each cluster, its standard deviation, the maximum distance from the cluster center, and the number of the nearest cluster together with the distance to it.
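The statistics listed above can be computed directly from the records and their cluster labels. The sketch below is one plausible reading, not the operator's exact definitions: it treats the standard deviation as the root-mean-square distance of members from the cluster centroid and measures cluster-to-cluster distance between centroids. The function name `cluster_stats` and the dictionary keys are hypothetical.

```python
import math

def cluster_stats(points, labels):
    """Per-cluster frequency, spread, radius, and nearest neighboring cluster."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    # Centroid of each cluster, coordinate by coordinate.
    centers = {lab: tuple(sum(c) / len(ps) for c in zip(*ps))
               for lab, ps in groups.items()}
    stats = {}
    for lab, ps in groups.items():
        dists = [math.dist(p, centers[lab]) for p in ps]
        others = [(math.dist(centers[lab], centers[o]), o)
                  for o in centers if o != lab]
        gap, neighbor = min(others) if others else (None, None)
        stats[lab] = {
            "frequency": len(ps),
            # Assumed definition: RMS distance of members from the centroid.
            "rms_std": math.sqrt(sum(d * d for d in dists) / len(ps)),
            "max_distance": max(dists),
            "nearest_cluster": neighbor,
            "distance_to_nearest": gap,
        }
    return stats

# Tiny hand-checkable example with two clusters.
stats = cluster_stats([(0, 0), (2, 0), (10, 0), (10, 2), (10, 4)],
                      [0, 0, 1, 1, 1])
```

In the small example, cluster 0 has two members one unit from its centroid (1, 0), and cluster 1 has three members around (10, 2).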
Then the batch SOM algorithm is applied to the same dataset. In this case, the numbers of row and column
segments must be defined, where
6 was chosen. The results are shown in the following two figures.
The first is the schematic graph of the SOM/Kohonen operator for the resulting net, where the coloring shows
the frequency of each cell.
The second figure is a scatterplot that displays the resulting clusters in the coordinate system of the original input attributes.
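The batch SOM step can be sketched as follows. In each pass, every record is matched to its best cell, and then every cell's weight vector is replaced by a neighborhood-weighted average of all records. This is a generic batch SOM under stated assumptions, not the operator's implementation: the Gaussian neighborhood, the shrinking radius schedule, the 6 by 6 grid, and the three-blob demo data are all illustrative choices.

```python
import math
import random

def best_unit(x, weights):
    """Index of the cell whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, weights[k])))

def batch_som(points, rows, cols, epochs=30, sigma_end=0.5, seed=0):
    """Batch SOM on a rows x cols rectangular grid."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(points)) for _ in nodes]
    dim = len(points[0])
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Assumed schedule: radius shrinks geometrically to sigma_end.
        sigma = sigma0 * (sigma_end / sigma0) ** (epoch / max(epochs - 1, 1))
        bmus = [best_unit(x, weights) for x in points]
        new_weights = []
        for rk, ck in nodes:
            num, den = [0.0] * dim, 0.0
            for x, b in zip(points, bmus):
                rb, cb = nodes[b]
                # Gaussian neighborhood measured on the grid, not in data space.
                h = math.exp(-((rk - rb) ** 2 + (ck - cb) ** 2)
                             / (2.0 * sigma * sigma))
                den += h
                for d in range(dim):
                    num[d] += h * x[d]
            new_weights.append([v / den for v in num])
        weights = new_weights
    return nodes, weights

# Hypothetical demo data: three blobs of 40 points each.
demo_rng = random.Random(2)
demo = [(demo_rng.gauss(cx, 0.4), demo_rng.gauss(cy, 0.4))
        for cx, cy in [(8, 8), (12, 8), (10, 12)] for _ in range(40)]
nodes, weights = batch_som(demo, rows=6, cols=6)

# Frequency of each cell: how many records map to it.
hits = [0] * len(nodes)
for x in demo:
    hits[best_unit(x, weights)] += 1
```

The `hits` list corresponds to the cell frequencies that the schematic graph encodes by color, and the `weights` can be plotted in the original attribute space alongside the records.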
The experiment shows how to use two unsupervised data mining techniques, vector quantization and
self-organizing maps. The two methods are particularly effective for examining
data. However, being important prototype methods, they can greatly simplify further analysis in higher dimensions.